Opening for SRE (Remote, USA)
Email: [email protected]
https://tinyurl.com/4fj4s2y9
https://jobs.nvoids.com/job_details.jsp?id=2395545&uid=547bf14f3dea42c094b6873a92e94a46
Hi Professionals,
Hope you are doing well.

Job Title: SRE (Site Reliability Engineer)
P1 req, Must Serve, we worked on SRE in the past!
USA (Remote); must work in the EST/CST time zones
Pay Rate: $45/Hr on C2C

Required Skills: ***** Please screen as per below required skills & do not share generic SRE/Site Reliability Engineer resumes *****

Bachelor's degree in computer science, information technology, or a related field.
8+ years in software or operations engineering
3+ years of DevOps and Site Reliability engineering: Proven experience working on large-scale, cloud-based, enterprise-level software platforms and deep understanding of multi-cloud architectures
3+ years of practical experience in Infrastructure-as-Code and CI/CD tools such as Terraform, GitHub Actions, and the like.
2+ years of documented hands-on experience with Azure (Azure ML is a plus)
2+ years of practical experience in containerization technologies (Kubernetes, Docker) and orchestration
2+ years of practical experience in Scripting & Automation
Advanced proficiency in scripting languages such as Bash to support automation and system integration efforts.

Job Description:

Act as production gatekeeper for all changes (product and infrastructure)
Perform detailed deep dives (root cause analysis) on recurring system issues and work with the engineering team on permanent solutions
Provide Tier 2 application/platform support for client AI applications
Participate in periodic on-call rotations and be available outside normal business hours, including evenings and weekends, during critical production releases or issue escalation periods
The Site Reliability Engineer (SRE) is responsible for availability, latency, performance, efficiency, change management, monitoring, emergency response, and capacity planning
This role will be a member of a team that focuses on DevOps, DevSecOps, and SRE for the client AI Organization
The role drives continuous improvement in delivery of resilient, scalable, performant, secure, and high-quality cloud-native services
Collaborating with SecOps and development teams, the SRE identifies cross-team issues that create risk for operations across the organization and resolves those issues with a mixture of engineering, troubleshooting expertise, and general operational guidance
Proactively drive improvement of enterprise cloud capabilities while creating best practices and tools to empower developers to create, deploy, and operationally support services
As a key contributor in the organization, this role is responsible for working with the Principal SRE and guiding junior team members in DevOps culture, highly scalable architectures, and lean development using agile practices
Educate yourself and others on anything that helps service teams more quickly and easily build, test, deploy & run their services to be more reliable
Plan, design, deploy, and operate Site Reliability Engineering capabilities for cloud products & services
Recognize and address sub-standard performance based on key performance indicators (KPIs)
Build monitoring that alerts on symptoms rather than outages
Continuously build, automate, and improve upon capabilities that are secure, scalable, performant, and resilient
Work closely with Infrastructure, Network, Security, Architecture, and Development teams to build highly performing, scalable, and secure Azure/AWS/GCP (cloud) environments
Define needs by documenting processes; includes research, planning, and writing supporting documentation
Participate in regulatory and compliance activities as necessary
Remediate security vulnerabilities discovered in non-production and production scans.
Participate in new vendor/product/service onboarding and assess partner technical readiness (such as Azure AI Studio, the Azure model catalog, and AWS SageMaker).
Develop or maintain dashboards for operational analysis and status reports.
Perform operational readiness testing for every release package to proactively predict performance degradations across all components of a critical asset (for example, Portal, Workspace creation, Project creation, Model Inference, and API response times).

Thanks & Regards
Shuruthi Saravanan
Senior IT Recruiter
SHAKTECH CORP
41880 RESTFUL TER, ALDIE, VA 20105
Direct: +1 (703) 793-7710, Ext. 106
LinkedIn URL: https://www.linkedin.com/company/shaktech-corp/

--

Keywords: continuous integration, continuous deployment, artificial intelligence, machine learning, information technology, Virginia
09:47 PM 02-May-25

