Need SRE for Onsite position in Manhattan, New York, USA |
Email: [email protected] |
Hi, this is Lucy from American Unit. I hope you are doing great. Please see the job description below, and if you are interested in and comfortable with this position, please forward your updated resume to [email protected]

JOB DESCRIPTION:
Role: Sr. Site Reliability Engineer
Location: Manhattan, NY - Onsite
Duration: 12+ months
Mandatory Skills: SRE, Standard RPE + excellent communication skills, both written and verbal. Pl. create RR etc.

The Application Infrastructure (AI) SRE Ops & Support department is seeking a Site Reliability Engineer to drive the reliability engineering, operations and customer support services for the client's suite of IT Service Management (ITSM) products. AI SRE & Ops Support is a cornerstone of the Application Infrastructure organization in the client's Technology Division.

Responsibilities Include:
- Building and maintaining front-to-back knowledge of Application Infrastructure's IT Service Management products, and then specializing in one or two of them
- Maximizing the availability and performance of supported systems through optimized and automated plant management, ongoing problem management, and architecture reviews with dev-side peers
- Reducing the cost of support (hours of effort) through the elimination of operational issues, optimization and automation of tasks, development of operational tools, and driving client self-service to minimize constraints
- Identifying and prioritizing technical debt that is impacting client developer productivity, reliability or the efficiency of the ops team
- Complex troubleshooting in a Linux environment
- Consulting with clients (the Firm's internal development community, IT service practitioners) to maximize their productivity, including troubleshooting the issues they have using the department's products
- Minimizing the escalation rate to the dev-side product delivery team members to ensure the department has the greatest possible flow of feature delivery
- Being operationally responsive, including sharing the on-call rotation with the rest of the global team (with a time-off-in-lieu system)

Required Qualifications / Skills:
- Standard RPE and excellent communication skills, both written and verbal
- Strong Linux skills
- Experience with Python for task automation
- Good communication skills
- Experience with incident management processes
- On-call support is required
- Strong Linux troubleshooting skills
- Task automation experience in any programming language
- Practical experience of at least one pillar of observability (metrics, logs or traces)
- Working knowledge in at least ONE of the following areas:
  - SQL
  - REST services (API)
  - Load balancing and networking
  - Performance troubleshooting and resolution
- Confident collaboration skills

Desired Skills:
- Python development for task automation
- Experience with site reliability engineering practices, such as service level objectives (SLOs), error budgets, blameless postmortems and toil reduction
- Prior experience creating operational dashboards (Splunk, Grafana, etc.)
- Experience administering and/or supporting ServiceNow

Thanks & Regards,
Lucy (Priyanka Moyila)
American Unit Inc.
2901 N Dallas Pkwy, Suite 333
Plano, TX 75093
Email: [email protected]
--
Keywords: artificial intelligence, information technology, procedural language, New York, Texas |
Wed Aug 02 02:14:00 UTC 2023 |