Lead SRE, Hybrid, Hartford, CT (local candidates with DL or state ID required; USC/GC) at Hartford, Connecticut, USA |
Email: [email protected] |
From: Ankit Upadhyay, Pivotal Technologies [email protected]
Reply to: [email protected]

Title: Lead SRE
Location: Hybrid, Hartford, CT
Duration: 6+ Months (C2H)
Rate: $60/hr
Visa: USC/GC
Client Domain: Insurance experience

Note: The Lead SRE will define observability processes and a Center of Excellence for other SREs supporting insurance systems. This includes automation to fix issues and meet SLAs; setting up SLOs, SLIs, and SLAs; and setting up holistic, open-source processes from beginning to end across IaC/IaaS, automation, DevOps, observability, and CI/CD pipelines, using metrics and creating dashboards. The client wants someone to champion migration to open-source platforms and establish standards, and to work in Agile: managing backlogs, backlog refinement, metrics, and golden signals.

Key Responsibilities: The client is looking for a process-oriented person to work as an SRE and set up observability and monitoring metrics for each line of business (LOB) to support cloud systems. They want this person to lead transformations end to end, from development to deployment (blue/green and canary), and to recommend SLOs, SLIs, and SLAs.

The ideal candidate should have a strong background in SRE and IT operations, as well as proficiency in various programming languages. The position requires a strong technical understanding of complex IT environments, cloud, and evolving technologies.

Responsibilities:
- Influence and design architecture, infrastructure, standards, and methods for large-scale cloud systems
- Engage in and improve the software development life cycle through CI/CD
- Improve the build-to-deployment process to establish greater reliability and a sustainable release process
- Oversee release gating and establish deployment metrics (DORA)
- Monitor and develop SLOs and SLIs through the customer user journey; advise on SLAs; establish error budgets
- Provide observability and custom monitoring tool integrations; introduce telemetry to support SLOs
- Automate system scalability and continually work to improve system resiliency, performance, and efficiency
- Recommend design changes that improve the reliability of HA systems
- Deploy software through highly available deployments: rolling, blue-green, or canary
- Provide mentorship to reliability engineering squads under a consistent framework for the development, testing, and alerting processes
- Practice sustainable incident response through blameless RCA and postmortems
- Advise on performance testing and capacity planning
- Communicate proactively with colleagues and formally present work product outcomes and risk analysis to the product team and management
- Follow Agile/Scrum working methodologies
- Establish dashboarding for monitoring capabilities and metrics

Qualifications:
- 8+ years of relevant technical experience
- BS degree in Engineering, Computer Science, or equivalent practical experience
- Expertise designing, analyzing, and troubleshooting large-scale distributed systems
- Experience implementing infrastructure as code
- Experience building software and maintaining systems in a highly secure, regulated, or compliant industry
- Experience monitoring infrastructure and application service level objectives to ensure functional and performance objectives are met
- Experience implementing service dashboards for monitoring objectives and metrics
- Experience developing and/or administering software in AWS cloud infrastructure
- System administration skills, including automation and orchestration of environments using Terraform or CloudFormation, and configuration management
- 3-5 years of experience in languages such as Python, Ruby, Bash, PowerShell
- Experience with container orchestration tools and container management (Docker, Kubernetes, etc.)
- Proficiency with continuous integration and continuous delivery tooling and practices
- Exceptional communication skills (written, oral, presentation, and facilitation)

Skills:
- Solid understanding of AWS, DevSecOps practices, and SAFe Agile methodologies
- Knowledge of Amazon Web Services, including but not limited to EC2, S3, ECS, RDS, CloudWatch, SNS, CloudTrail, SQS, and Service Catalog
- Expertise with cloud platforms like AWS and microservices architecture
- Familiarity with enterprise software solutions such as GitHub, Jenkins, Nexus, Ansible, Jira, Rally, etc.
- Observability and monitoring tools and metrics: Dynatrace, Splunk, Nagios, CloudWatch, ELK, Grafana, Prometheus, etc.
- Familiarity with programming languages (Python, Lambda, Go)
- Experience in Infrastructure as Code (IaC) using CloudFormation and Terraform templates, YAML files, and build specifications
- Solid understanding of technologies that support the services offered for cloud applications

Thanks & Regards,
Ankit Upadhyay
IT Recruiter
Office: +1 (703) 570-8775 (Ext-217)
Email: [email protected]

Keywords: continuous integration, continuous deployment, S3, information technology, Golang, green card, Connecticut |
Fri Jan 26 01:11:00 UTC 2024 |