Site Reliability Engineer (SRE) at Southfield, Michigan, USA
Email: [email protected]
From: albert, involgix, [email protected]
Reply to: [email protected]

Site Reliability Engineer (SRE)

Location: Southfield, MI

Job Description:

We count on our site reliability engineer (SRE) to empower users with a rich feature set, high availability, and stellar performance so they can pursue their missions. As we expand customer deployments, we're seeking an experienced SRE to deliver insights from massive-scale data in real time. Specifically, we're searching for someone who has fresh ideas and a unique viewpoint, and who enjoys collaborating with a cross-functional team to develop real-world solutions and positive user experiences for every interaction.

Objectives of this role:
Run the production environment by monitoring availability and taking a holistic view of system health.
Build software and systems to manage platform infrastructure and applications.
Improve reliability, quality, and time-to-market of our suite of software solutions.
Measure and optimize system performance, with an eye toward pushing our capabilities forward, getting ahead of customer needs, and innovating for continual improvement.
Provide primary operational support and engineering for multiple large-scale distributed software applications.

Responsibilities:
At a day-to-day level, SREs will be focused on automation, monitoring, incident resolution, and culture.

A love of SRE, open-source, self-service tools, and micro-services.
Gather and analyze metrics from operating systems as well as applications to assist in performance tuning and fault finding.
Partner with development teams to improve services through rigorous testing and release procedures.
Participate in system design consulting, platform management, and capacity planning.
Create sustainable systems and services through automation and uplifts.
Balance feature development speed and reliability with well-defined service-level objectives.

After incidents, document the actions taken so that they can be automated for future incident response.
Monitor infrastructure using SRE tools and suggest tools as necessary.
Build monitoring alerts and incident response processes.
Improve operational processes and team practices.
Code infrastructure automation across the CI/CD pipeline.
As the solution scales, ensure reliability by designing, building, and maintaining the core infrastructure.
Demonstrate strong programming skills and thorough knowledge of systems.
Bring about cultural shifts to provide a foundation for process changes.

Experience with AWS multi-region/multi-AZ deployed systems, auto scaling of EC2 instances, CloudFormation, ELBs, VPCs, CloudWatch, SNS, SQS, S3, Route53, RDS, IAM roles, security groups, blue/green deployments, and A/B testing.

Required skills and qualifications:
Bachelor's degree (or equivalent) in computer science or a related discipline.

Comfortable with large-scale production systems and technologies, for example load balancing, monitoring, distributed systems, and configuration management.

Strong coding skills in at least one programming language, and a desire to pick up more.

Familiarity with and enthusiasm for software engineering best practices such as testing, continuous integration, and continuous delivery.

Exposure to cloud platforms, Amazon Web Services (AWS), and APIs.

The ability to thrive in a rapidly evolving, globally distributed environment.

Strong security mindset.
Proactive approach to identifying problems, performance bottlenecks, and areas for improvement.
Solid understanding of fundamental technologies like TCP/IP, HTTP.
Strong working knowledge of Linux systems and applications.
Experience with automation tooling such as Chef, Docker, AWS.
Experience with JavaScript Frameworks, Angular JS/ReactJS/NodeJS and with cloud automation/orchestration technologies.

Ability and willingness to collaborate.

Strong problem-solving skills and ability to think under pressure.

Strong analytical and management skills.

Communication and documentation skills.

Preferred skills and qualifications:
Previous success in technical engineering
Coding experience beyond simple scripts

Keywords: continuous integration, continuous deployment, JavaScript, S3, Arizona, Michigan
Fri Sep 22 18:43:00 UTC 2023