Site Reliability Engineering at Remote, Remote, USA |
Email: [email protected] |
From: AJAY, KK Software Associates [email protected]
Reply to: [email protected]

Job Role: Site Reliability Engineering (SRE)
Work Location: TCS - Boynton Beach, FL - Miami Beach, FL

Role Description:
- Implement and maintain observability using Dynatrace to monitor system health and performance, enabling proactive identification and resolution of issues.
- Automate deployment, scaling, and management of containerized applications using Kubernetes and similar technologies.
- Collaborate with development teams to enhance the scalability and reliability of applications through reviews and implementing changes to the system architecture.
- Participate in on-call rotations, providing timely responses to incidents and ensuring rapid resolution.
- Develop automation tools for efficient system management and to reduce human intervention in operational tasks.
- Utilize strong problem-solving skills to troubleshoot and resolve infrastructure issues.
- Continuously improve system performance by analyzing issues and existing solutions, and implementing changes to hardware, software, or network setups.
- Document system architecture and operational procedures to ensure clarity and consistency across the team.
- Develop, configure, and optimize cloud-based services and infrastructure to ensure high availability and performance.

Competencies: Digital: Site Reliability Engineering (SRE)

Essential Skills: Site Reliability Engineering (SRE)

We are seeking a dedicated and experienced Site Reliability Engineer to join our technology team, focusing on maintaining and improving the reliability and observability of our cloud-based services. The ideal candidate will have a strong background in software engineering and system administration, with expertise in Azure cloud environments, the Dynatrace monitoring tool, and observability principles.

Desirable Skills:
- Bachelor's degree in Computer Science, Engineering, or a related technical field.
- Proven experience as a Site Reliability Engineer, with a strong background in software engineering and systems administration.
- Expertise in using Azure cloud services and managing cloud infrastructure.
- Proficiency in implementing and managing observability and monitoring tools, specifically Dynatrace.
- Strong knowledge of Kubernetes, CI/CD pipelines, and automation tools.
- Experience in scripting languages such as Python, Bash, or PowerShell.
- Familiarity with networking principles, including routing, VPNs, firewalls, and load balancers.
- Excellent problem-solving, troubleshooting, and communication skills.
- Ability to work collaboratively in a cross-functional team and with stakeholders across various departments.
- Certifications in Azure and Dynatrace are highly desirable.

Keywords: Site Reliability Engineering (SRE), continuous integration, continuous deployment, Florida |
Fri May 24 01:31:00 UTC 2024 |