Site Reliability Engineer (SRE) || Orlando FL at Orlando, Florida, USA |
Email: [email protected] |
Rate: $55/hr on C2C

Job Title: Site Reliability Engineer (SRE) or Production Reliability Engineer (PRE)

Job Description:
We are seeking an accomplished and driven SRE/PRE Layer 3 to join our forward-thinking team. As an SRE/PRE Layer 3, you will be a key contributor in architecting, designing, and maintaining highly reliable and scalable systems. You will collaborate with cross-functional teams to develop advanced automation, implement best practices, and drive the evolution of our infrastructure and reliability initiatives.

Responsibilities:
- Lead the design, implementation, and management of complex systems architecture that emphasizes reliability, scalability, and performance.
- Collaborate closely with engineering teams to set and uphold service-level objectives (SLOs) and work on continuous improvements to achieve these goals.
- Mentor and guide junior members of the SRE/PRE team, fostering their technical growth and professional development.
- Solve intricate technical challenges across the entire technology stack, from hardware and infrastructure to applications and databases.
- Develop and implement robust automation solutions for deployment, configuration management, and infrastructure provisioning.
- Play a pivotal role in capacity planning, performance tuning, and optimizing systems for seamless scalability.
- Drive the establishment of comprehensive monitoring, alerting, and logging strategies to ensure prompt identification and resolution of issues.
- Participate in on-call rotations and respond promptly to incidents, taking ownership of resolution and post-incident analysis.
- Continuously advance best practices and processes, promoting a culture of reliability and operational excellence.
- Collaborate with stakeholders to ensure alignment between development and operations, contributing to product evolution and enhancements.

Qualifications:
- Bachelor's degree in Computer Science, Engineering, or a related field (or equivalent experience).
- 7+ years of experience in an SRE, PRE, or similar role, with a proven track record of driving system reliability and performance.
- Proficiency in programming languages such as Python, Go, or similar for automation and tool development.
- Expertise in cloud platforms (e.g., AWS, GCP, Azure) and container technologies (e.g., Kubernetes, Docker).
- Deep understanding of networking, operating systems, and distributed systems architecture.
- Experience with infrastructure as code tools (e.g., Terraform, Ansible) for provisioning and configuration management.
- Strong grasp of observability tools and practices (e.g., Prometheus, Grafana, ELK stack).
- Exceptional troubleshooting skills and the ability to diagnose complex technical issues.
- Outstanding communication skills to collaborate effectively with diverse teams.
- Proactive mindset and a focus on delivering exceptional customer experiences.
- Optional: Relevant certifications such as Certified Kubernetes Administrator, AWS DevOps Professional, or similar.

(1.) To ensure customer engagement or satisfaction and referenceability
(2.) To plan for Program and Delivery Management and ensure that the agreed deliverables in terms of margin are met.
(3.) To anchor process improvement or compliance (human error reporting) and other organizational initiatives (automation, Lean IT implementation)
(4.) To guide, manage, develop, and engage the team, thereby ensuring employee retention
(5.) To ensure upskilling or creation of resources through internal academies or trainings and growth rotation

Thanks & Regards,
Trayambkeshwer Dwivedi (Trayam), Sr. Technical Recruiter
Raas infotek corporation
262 Chapman road, Suite 105A, Newark, DE-19702
Email: [email protected]
Website: raasinfotek.com
LinkedIn: linkedin.com/in/trayambkeshwar-dwivedi-792283218

Keywords: information technology, golang, Delaware |
Thu Jun 27 23:08:00 UTC 2024 |