Senior Site Reliability Engineer (10+ Years of Experience Required), Remote, USA
From: Patricia, W3Global
Reply to: [email protected]

Responsibilities:

Deploy and manage innovative modern cloud technologies using infrastructure-as-code, self-healing, and security automation patterns.

Develop useful telemetry, alerts, and response procedures to reduce Mean Time To Repair (MTTR).

Collaborate and provide technical excellence within and across teams.

Consult on best practices and develop tools to enable smooth adoption of good service reliability practices and methods.

Identify areas of improvement in reliability, efficiency, and operations.

Build tools to help your SRE team quickly pinpoint, isolate, and resolve issues related to infrastructure, platform services, and applications.

Continuously refine monitoring processes, configurations, and thresholds.

Practice and promote sustainable incident response and blameless postmortems.

Develop runbooks and tools to streamline processes and shorten problem resolution time.

Write code that improves scalability, performance, maintainability, and security.

Add, tune, and maintain alert configurations and documentation as needed.

Operate in a high-pressure environment and quickly troubleshoot complex issues across distributed applications while successfully handling multiple priorities.

Cultivate full-team participation in building high-quality, thoughtful software.

Develop and refine CI/CD processes to improve release cadence and success.

Use Chaos Engineering principles and methodologies to test what you build under real-world conditions.

Mentor SREs in technical and non-technical SRE responsibilities.

Take primary responsibility for large (multi-person) efforts, including planning, execution, and training.

Basic Qualifications:

Creative, innovative, outside-the-box thinking

5-7 years of experience in SRE, DevOps, technical operations, systems engineering, software engineering, or a related discipline

Proficient, collaborative, and experienced in building reliable, scalable enterprise systems

Excellent communication skills, both verbal and written

Passionate and curious about ways to leverage technology while continually learning

Ability to identify root-cause sources of instability in high-traffic, large-scale distributed systems

Experience in designing, building, and operating large-scale production systems

Skilled in using containers in enterprise production environments (e.g. Docker, Kubernetes, LXC, AWS ECS and EKS)

Experience with configuration management and orchestration (e.g. Terraform, CloudFormation, Ansible)

Comfortable in one or more of the following languages (Python, Java, Scala, Go, Rust, Ruby, or similar)

Experience with scripting languages such as Ruby, Bash, PowerShell, or Python

Skilled in Cloud/PaaS/SaaS environments (e.g. AWS, Azure, Google Cloud)

Hands-on experience using source control (Git, GitHub) and feature branching strategies

Experience with continuous integration tools (e.g. Jenkins, GitLab CI/CD, AWS CodeBuild, CodeDeploy, CodePipeline, AWS DevOps, Spinnaker)

Knowledge of best practices and IT operations in an always-up, always-available service.

Expertise in scalable testing, automation, and continuous integration frameworks and best practices.

Experience in SDLC, distributed systems, networking, hardware, logistics and operations or capacity planning.

UNIX/Linux administration, troubleshooting, performance tuning, and security

Preferred:

Experience with DevOps methodologies and/or SRE

Experience with container orchestration systems, such as AWS ECS or Kubernetes

Experience with monitoring and observability tooling such as Datadog, Prometheus, Grafana

Experience automating infrastructure, deployment, and testing with tools like CloudFormation, Ansible, or Terraform, and the ability to explain the Infrastructure as Code paradigm

Experience with Service Level Objectives and Error Budgets

Experience with configuration management, such as Puppet and Ansible

Understanding of the principles and methodologies behind Chaos Engineering

Experience with software development in Java, Scala, etc.

Preferred Education:

BS Degree in Computer Science, Electrical & Computer Engineering or Mathematics; or equivalent experience.

Keywords: continuous integration, continuous deployment, information technology, golang
Wed Jan 11 20:11:00 UTC 2023
