Site Reliability Engineer (SRE) - Remote, USA |
Email: [email protected] |
From: Bharat, Phasorsoft Group LLC [email protected]
Reply to: [email protected]

Job Title: Site Reliability Engineer (SRE)
Location: Remote
Note: Need profiles with 10+ years of experience only; do not share DevOps profiles. PP mandatory.

Responsibilities:
- Provide consulting services for improved system stability, availability, performance, and reliability.
- Assist in determining the impact of operational issues and provide input into their resolution via data extraction and quantification.
- Work through day-to-day support issues, ensure effective and timely resolution of issues in the production environment, and troubleshoot customer-impacting issues.
- Support multiple applications, specifically Kubernetes/Gloo/AWS/Apigee/PCF/GCP/Java-based systems, in an enterprise environment.
- Support Gloo running on Kubernetes, Apigee OPDK and SaaS, Grafana, Prometheus, Cassandra, Postgres, and Spring Boot or Java-based applications running on Kubernetes, PCF, and Java application servers.
- Apply GitOps principles to manage infrastructure and application configurations.
- Apply monitoring and create complex alerts and dashboards for production systems.
- Provide capacity analysis and tuning analysis for Apigee and Java applications hosted on Linux and container platforms.
- Be available to provide 24x7 on-call support on a rotating basis with other team members.
- Lead efforts in troubleshooting, recovery, and root cause investigation.
- Perform analysis of user requirements and problems to automate or improve systems, and review system capabilities, workflow, and scheduling limitations.
- Follow and develop detailed work plans, schedules, project estimates, resource plans, and status reports.
- Facilitate HA (High Availability) / DR (Disaster Recovery) exercises to ensure that the team is fully prepared for any event.
- Lead root cause analysis sessions to understand what causes issues in production, and produce an RCA report along with solutions that will prevent them from happening in the future.
- Ensure documentation is created and remains updated for any related work.
- Maintain a strong understanding of UNIX operating systems and at least one scripting language.
- Forecast and plan for a rapidly growing environment.
- Evaluate new software products and service solutions.

Skill requirements:
- Expertise in analyzing and troubleshooting large-scale distributed systems.
- Strong experience with Kubernetes, Envoy, Gloo, AWS, GCP, Apigee, and API Gateway.
- Experience with REST, SOAP, and GraphQL API support.
- Experience with tools such as Git, GitLab, Docker, Postman, Splunk, AppDynamics, Imperva WAF, and CI/CD tools.
- Good experience in GitOps processes, performance measurement and tuning, capacity planning and management, contingency, and disaster recovery.
- Good understanding of and strong experience with Unix/Linux operating systems.
- Ability to debug and optimize code and automate routine tasks.
- Systematic problem-solving approach coupled with effective communication skills.
- Strong scripting knowledge and experience.
- Good understanding of networking, routing, and TLS/SSL.

Bharath T || Technical Recruiter
[email protected]
644 Lakeland East Dr, Suite A, Flowood - MS 39232
https://www.linkedin.com/in/bharath-t-27b2a7236

Keywords: continuous integration, continuous deployment, Microsoft |
Sat Sep 23 01:04:00 UTC 2023 |