Senior Site Reliability Engineer in Sunnyvale, California, USA
Email: [email protected]
From: madhavan, AI Spins INC, [email protected]
Reply to: [email protected]

Job Title: Senior Site Reliability Engineer (SRE)
Experience: 8+ years
Location: Sunnyvale, CA

About the Role:
We are seeking a highly skilled and experienced Senior Site Reliability Engineer (SRE) to join our team. The ideal candidate will have extensive experience with Apache Kafka, AWS, and Core Java, and will be responsible for ensuring the reliability, availability, and performance of our services.

Key Responsibilities:
Design, implement, and manage scalable, reliable, and secure systems using AWS and Kafka.
Develop and maintain automation scripts and tools to streamline deployments and operations.
Monitor system performance and reliability, proactively addressing issues and optimizing systems.
Collaborate with development teams to ensure smooth deployments and efficient incident resolution.
Implement and maintain robust monitoring, alerting, and logging systems.
Participate in on-call rotations and respond to incidents, conducting root cause analysis and implementing corrective actions.
Mentor and provide technical guidance to junior SREs and other team members.

Qualifications:
Bachelor's degree in Computer Science, Engineering, or a related field.
8+ years of experience in site reliability engineering, system administration, or a related field.
Strong expertise in Apache Kafka, including setup, configuration, monitoring, and optimization.
Extensive experience with AWS services such as EC2, S3, RDS, Lambda, and more.
Proficiency in Core Java programming, with experience in building and maintaining high-performance applications.
Experience with containerization technologies such as Docker and orchestration tools like Kubernetes.
Strong knowledge of infrastructure-as-code tools such as Terraform or CloudFormation.
Familiarity with CI/CD pipelines and tools like Jenkins, GitLab CI, or similar.
Excellent problem-solving skills, with the ability to troubleshoot complex systems and network issues.
Strong communication and collaboration skills, with the ability to work effectively in a team environment.
Wed Jul 31 02:08:00 UTC 2024