Urgent need: Site Reliability Engineer (SRE) - Hybrid in Atlanta, GA - In-person interview at Remote, Remote, USA |
Email: [email protected] |
Job Title: Site Reliability Engineer (SRE)
Location: Hybrid in Atlanta (ATL), no exceptions, no relocation candidates; in-person interview
Duration: 12+ months
Visa: USC and GC
MOI: Skype + F2F
Client: Delta Air Lines
Candidates must be local to GA.
Requirement: 4+ years of experience.
Experience: 4-6 years

Job Overview:
As a Senior Site Reliability Engineer (SRE) with our Retail Technology team, you will be at the forefront of Cloud and Big Data technology. You'll play a key role in ensuring the reliability and performance of our critical applications and services. This position offers the opportunity to work with industry-leading technologies and establish yourself as a technical leader.

Key Responsibilities:
Implement, improve, and maintain monitoring, alerting, and logging solutions to detect and respond to incidents.
Collaborate closely with the development team to deploy applications and services, ensuring they meet reliability and performance standards.
Automate deployment, configuration management, and troubleshooting processes to streamline operations.
Participate in the on-call rotation, triage production incidents, lead Root Cause Analysis (RCA) efforts, and implement preventive actions.
Serve as the escalation point for complex issues in both on-premises and AWS environments.

Qualifications:
Deep understanding of AWS services (Lambda, S3, SQS, IAM, Route 53, etc.) and proficiency in infrastructure as code (e.g., Terraform, CloudFormation).
Hands-on experience with monitoring tools (e.g., CloudWatch, Sumo Logic, Dynatrace, Grafana) for application performance monitoring and alerting.
Proficiency in scripting and automation (e.g., Python, Bash) to build and maintain deployment pipelines and infrastructure.
Strong analytical and troubleshooting skills to diagnose and resolve complex infrastructure, application, and data issues.
Experience with containerization (Docker, Kubernetes) and serverless architecture (AWS Lambda).
Core Responsibilities:
Manage and optimize data streaming and API components in on-premises OpenShift and AWS.
Proactively review application APIs and processes to identify opportunities for optimizing response times.
Automate various types of testing, including data quality checks, delivery to production, and deployment processes.
Develop integrations between on-premises applications, AWS, and third-party tools (ServiceNow, VersionOne, Sumo).
Collaborate with teams to create SLIs/SLOs.
Monitor and lead troubleshooting of performance issues for platform applications, develop solutions, and document artifacts from root cause analysis.
Evolve the cloud infrastructure ecosystem by experimenting with emerging technologies.
Design and develop CI/CD pipelines for deploying application artifacts, APIs, and Data Process Jobs.
Maintain data integrity and access control using AWS security tools (e.g., HSM, IAM).
Develop tools to monitor AWS billing, generate cost-related reports, and implement cost optimization strategies.
Design and implement data security tools in collaboration with enterprise security architects.
Monitor and analyze platform capacity and performance, implementing elastic infrastructure as needed.
Contribute to backup strategies and disaster recovery solutions.
Provide continuous improvement input on design, performance, and security enhancements.

Desired Skillset:
Deep understanding of AWS cloud platforms.
Proficiency in automation, scripting, and monitoring using tools like OpenShift, CloudFormation, Terraform, Ansible, Shell, Python.
Strong technical knowledge of infrastructure layers (Linux OS, virtualization platforms, networking, storage, backup strategies).
Experience in end-to-end operations for enterprise systems and applications, including issue resolution for mission-critical systems.
Familiarity with CI/CD tools (GitLab, GitHub, Jenkins, Maven, Gradle, Nexus).
Experience with Software Release Management.
Required Qualifications:
Education: BS in Computer Science or related technical field (or equivalent practical experience).
Experience: 3+ years of DevOps/SysOps engineering experience focusing on major cloud platforms (AWS preferred).
2+ years of application development experience, including data streaming and deploying/monitoring high-availability critical application components.
1+ years of experience in Site Reliability Engineering is preferred.

---------------
Best Regards,
Himanshu Singh
Vyze Inc (Subsidiary of Vivid Technologies Inc)
24916 Earls Ford Drive
Chantilly, VA 20152
Email: [email protected]
Hangout: [email protected]
LinkedIn: https://www.linkedin.com/in/himanshu-singh-036b02222

Disclaimer: This communication, along with any documents, files or attachments, is intended only for the use of the addressee and may contain confidential information. If you are not the intended recipient, you are hereby notified that any dissemination, distribution or copying of any information contained in or attached to this communication is strictly prohibited. To remove your email address permanently from future mailings, please send REMOVE to [email protected]
Fri Aug 09 02:20:00 UTC 2024 |