Job Opportunity: Lead Site Reliability Engineer, Allentown, PA (Hybrid)
Hi,

Please find the job description below. If interested, please share your updated resume along with the details below.

Job Title: Lead Site Reliability Engineer
Location: Allentown, PA (Hybrid)
Duration: Long Term

Job Description:
Must have experience with Grafana LGTM. ("SRE Lead" OR "Site Reliability Engineer Lead") AND ("Grafana LGTM" OR "Grafana Loki" OR "Grafana Prometheus")

Responsibilities

1. Technical Leadership and Subject Matter Expertise: Provide technical and architectural guidance and support to other engineers on the team. Have demonstrated experience interacting with client-side technical leadership to understand their technical needs and devise a plan to deliver the product to their satisfaction. Ensure adherence to timelines and client expectations; communicate progress promptly, gather feedback, and adapt and course-correct quickly as needed.

2. Infrastructure Deployment and Maintenance: Deploy and manage Grafana LGTM in Kubernetes (K8s) using Helm charts. Implement HPA / KEDA autoscaling for the LGTM stack. Tune component performance and set appropriate resource limits. Set up and maintain Azure DevOps pipelines. Work with the Azure portal for resource management and monitoring.

3. Programming and Version Control: Possess a strong programming background in Java and/or .NET. Use Git and Azure Repos for version control and collaboration.

4. Monitoring and Dashboards: Good experience working with Helm, GitHub, Grafana, and Prometheus. Work with engineering teams to onboard their applications to Grafana and help them set up the required dashboards. Build and maintain Grafana dashboards to monitor system health, performance, and metrics. Familiarity with PromQL, LogQL, and TraceQL for querying and visualizing data.

5. Kubernetes Expertise: Demonstrate a deep understanding of Kubernetes (K8s) architecture, components, and best practices. Experience with Rancher Desktop or similar tools for local development and testing.

6. Dynatrace Knowledge: Familiarity with Dynatrace for application performance monitoring. Ability to create custom metrics and dashboards.

7. Azure AD Integration: Integrate Grafana with Azure Active Directory (AD) for authentication and access control.

8. Onboarding and Alerting: Manage production incidents and work with application teams on root cause analysis and remediation. Create runbooks so that critical and recurring application issues can be resolved quickly. Onboard new applications and infrastructure components to the Grafana stack. Set up alerts and notifications for application performance, incidents, and issues.

Qualifications:
Bachelor's degree in Computer Science, Information Technology, or a related field (or equivalent experience).
Proven experience as a DevOps Engineer, SRE, or in a similar role.
Strong problem-solving skills and attention to detail.
Excellent communication and collaboration abilities.
Certifications in Kubernetes, Azure, or related technologies are a plus.

Thanks & Regards,
Sekhar
Lead US IT Recruiter
http://digityxsolutions.com/
Address: 6136 Frisco Square Blvd | Suite #426 | Frisco, TX 75034
Email: sekhar@digityxsolutions.com
Phone: +1 908-274-3095
Tue Jul 23 23:27:00 UTC 2024 |