Expert / Director - Site Reliability Engineering (SRE) at Remote, USA
Email: [email protected]
From: Nishant Chaudhary, Sonitalentcorp, [email protected]

Reply to: [email protected]

Hi,

Please have a look at the description below.

Job Role - Expert / Director - Site Reliability Engineering (SRE)

Location - Remote

Duration - 12+ Months

Seeking - Expert / Director - Site Reliability Engineering (SRE)

Top 5 Technical Skills:

Proficiency in cloud computing platforms such as Azure and AWS, with a focus on implementing SRE principles and best practices within cloud-native environments.

Expertise in containerization technologies such as Docker and Kubernetes, including experience with Kubernetes operators and implementing SRE patterns for containerized applications.

Strong programming skills in languages like Python and/or Java, with a focus on developing automation scripts, tooling, and monitoring solutions to support SRE objectives.

Deep understanding of infrastructure as code (IaC) tools such as Terraform or Ansible, with experience architecting and managing infrastructure as code deployments to support SRE initiatives.

Experience with SRE-specific tools and technologies, such as Prometheus, Grafana, ELK stack (Elasticsearch, Logstash, Kibana), and Service Mesh (e.g., Istio), for monitoring, logging, tracing, and observability in distributed systems architectures.

Top 3 Soft Skills:

Excellent leadership and communication skills, with the ability to effectively collaborate with cross-functional teams.

Strong problem-solving and critical thinking abilities, with a focus on driving continuous improvement.

Exceptional organizational and time-management skills, with the ability to prioritize tasks and manage competing priorities in a fast-paced environment.

Job Description:

Overview:

We are seeking a highly skilled and experienced director-level Site Reliability Engineer to lead our Site Reliability Engineering (SRE) efforts. The ideal candidate will have a strong technical background, exceptional leadership abilities, and a proven track record of designing and implementing SRE programs for large-scale enterprises. You will guide and lead the SRE program by developing standards and repeatable processes and procedures that give the SRE teams direction and discipline.

Responsibilities:

Lead the design, development, and implementation of the Site Reliability Engineering (SRE) program for the enterprise.

Establish and maintain service level objectives (SLOs), service level indicators (SLIs), and error budgets to ensure system reliability and performance.

Collaborate with cross-functional teams to identify and address reliability and scalability issues across the organization, and provide solutions that remediate those issues and improve all environments.

Develop and publish best practices for incident management, including incident response, post-mortem analysis, and resolution.

Develop and maintain monitoring, alerting, and observability solutions to proactively identify and mitigate potential issues.

Lead and mentor a team of SRE engineers, providing guidance, support, and professional development opportunities.

Thanks and Regards

Nishant Chaudhary

Sr. Technical Recruiter

SONITALENT CORP

Desk no. 859-659-1315

Email: [email protected]

LinkedIn: linkedin.com/in/nishant-chaudhary-642a9a193

Wed Mar 13 04:55:00 UTC 2024
