Hiring - Site Reliability Engineer, Houston, TX (Onsite)

Email: [email protected]

From:

Sarfaraz,

Convextech Inc.

[email protected]

Reply to: [email protected]

Hi,

I hope you are doing well.

Please let me know if you are interested in the position below.

Title : Site Reliability Engineer

Location : On-Site-Houston

Duration : 6 months

Visa : No H-1B/CPT

Job description

This is an urgent need. NOTE: The candidate must have OpenShift experience and proven experience as an SRE.

Please pay close attention to the Required Qualifications and Technology Stack sections as well.

The preference is for the candidate to work on-site in Houston, TX. Please send me any candidates you have who might be a fit.

SRE

Our client needs a Site Reliability Engineer (SRE) to become a part of our growing Digital IT team focused on the Integrated Production Surveillance & Optimization (IPS&O) function. The SRE will support the reliability of Digital IT/OT critical applications. This transformative role involves automating IT infrastructure tasks and driving SRE best practices, tools, and processes. The ideal candidate should exhibit a growth mindset and proactively monitor and respond to incidents for optimal user experience.

The candidate must have senior-level experience deploying and supporting applications on OpenShift/Kubernetes container platforms.

The successful candidate will possess a strong developer background as well as the interpersonal skills needed to communicate design requirements and objectives while providing thought leadership to peers and leadership.

Candidates should be self-motivated and collaborative IT professionals with a strong background in software development, systems administration and IT automation.

Responsibilities:

* Maintain the survivability and reliability of IT/OT critical resources.

* Write and build CI/CD pipelines and build/release processes for IT/OT workflow applications.

* Mentor the IT/OT DevOps team in best practices for CI/CD deployments using ADO and Git.

* Perform periodic load and scalability testing to establish baselines, detect drift, and inform capacity planning.

* Conduct weekly operational state reviews covering performance trends, anomalies, errors, and other availability events with SREs, product owners, and development teams.

* Participate in quarterly business and operational reviews aligning on roadmaps, development velocity, efficiency, growth trends, etc.

* Plan and execute periodic Disaster Recovery exercises including both tabletop and simulated failures (fault injection).

Required Qualifications

* Candidates must have a bachelor's degree and 8 years of IT experience.

* Senior-level experience with OCP (OpenShift Container Platform) and Kubernetes.

* Familiarity with continuous integration/deployment processes and tools such as IDEs (Eclipse), source code management (Git/Stash), ADO Pipelines, Maven, Nexus artifacts, etc.

* Strong understanding of SRE practices: incident response, change/release management, capacity planning, infrastructure automation, elastic environments, chaos engineering and blameless postmortems.

* Expertise in application performance monitoring, observability, and proactive alert correlation, including monitoring containers and failure-based alerting.

* Scripting experience in languages such as Python and Bash.

* Experience deploying applications on OCP in both public and private clouds.

* Excellent written and oral communication skills.

* Demonstrated ability to communicate technical issues to a nontechnical audience.

* Demonstrated ability to communicate on a technical level to a technical audience.

* Strong interpersonal skills, adaptable and able to learn quickly.

* Requires limited supervision and has excellent time-management skills.

* Self-motivated self-starter.

* Ability to work and interact with others in a structured/team environment.

Technology Stack

Experience with at least one technology in each of the tech stack categories below:

* Monitoring and Logging Tool(s): AppDynamics, Splunk, ELK Stack, DataDog, Prometheus, AWS CloudWatch/X-Ray, Grafana

* Programming: C# .NET, PowerShell, Python, YAML

* Containers: Docker, Helm charts

* OS: Linux RHEL, Ubuntu, CentOS

* Code Repos: Azure Repos, GitHub

* Infrastructure as code: Terraform, Ansible

* Automation Tools: Jenkins, Chef, Puppet

* Agile: JIRA, SAFe

Desired Qualifications

* Experience with cloud/virtualization technologies and their management (VMware, AWS, Azure, etc.).

* Knowledge, skills and abilities to support web server technologies (Apache, Nginx, IIS).

* Knowledge, skills and abilities to automate the creation of Platform as a Service (PaaS) infrastructure using industry-standard tools such as Ansible and Chef.

* Familiarity with Industrial Control System (ICS) security architecture (Purdue model).

Work Location:

On-Site-Houston (preferred)

Thanks and Regards

Sarfaraz Khan

US IT Recruiter | Convex Tech Inc.

Email: [email protected]

Contact: 469-300-7818

LinkedIn: https://www.linkedin.com/in/sarfaraz-khan-stellar/

Wed Oct 04 03:09:00 UTC 2023