Urgent Hiring: SRE Engineer (Remote, USA)
Email: [email protected]
From:

Shekhar,

Stellar Consulting !!

[email protected]

Reply to:   [email protected]

Job Title: SRE Engineer

Location: Remote

Duration: Long Term

Please find the JD below for the SRE engineers.

Qualifications:

    Proven proficiency in observability, monitoring, and logging tools such as Dynatrace, Prometheus, Grafana, the ELK stack, and Splunk.

    3+ years of experience with Dynatrace, including synthetic URL monitoring, installing agents and forwarders, APIs, performance-monitoring alerts, dashboards, and data trend analysis.

    Minimum of 4 to 5 years of working experience with OpenShift and Docker/Kubernetes.

    Proficiency in implementing monitoring and observability solutions using GCP services such as Cloud Monitoring, Cloud Logging, and Cloud Trace.

    Deep understanding of IT infrastructure monitoring and observability best practices.

    Experience gathering and organizing large amounts of data for instrumentation into an enterprise monitoring solution.

    Experience recommending baseline monitoring thresholds, performance monitoring KPIs, and SLAs.

    At least 4 years of experience developing Grafana dashboards and standardizing metrics, metric collection, and dashboards. Grafana experience is a must.

    3-5 years of experience with SQL and familiarity with at least one managed Kubernetes platform (EKS, AKS, GKE).

    Strong background in software engineering, with expertise in relevant programming languages (such as Python, Java, Go) and cloud platforms (such as AWS, GCP, Azure).

Responsibilities:

    The Monitoring and Observability engineer will be responsible for designing, configuring, implementing, monitoring, and maintaining our observability solutions, and for troubleshooting IT systems and applications to ensure optimal performance and reliability.

    They will use observability and monitoring tools to detect and resolve issues affecting the user experience.

    The engineer will also be responsible for automating alerting and remediation processes to reduce mean time to resolution (MTTR) and improve system uptime.

    Work with the Splunk query language: monitor database connection health using Splunk DB Connect health dashboards, parse logs, and write complex Splunk searches, including external table lookups. Requires knowledge of Splunk data flow, components, features, and product capabilities.

    Observability: Implement comprehensive monitoring and alerting solutions using GCP monitoring services and external services.

    Configure dashboards, alerts, and notifications to ensure timely identification and resolution of issues.

    Troubleshoot issues and outages, working closely with development and operations teams to identify root causes and develop solutions.

    Monitor server, network infrastructure, and application performance metrics, and identify patterns and trends to improve system performance and reliability.

    Develop and integrate tools for logging, monitoring, and alerting to enhance visibility into system performance.

    Participate in strategic planning for the technology roadmap, including scalability, cost-effectiveness, and risk management considerations related to observability infrastructure.

    Very good verbal and written communication skills, with the ability to articulate ideas and persuade.

Thanks,

Shekhar Das

[email protected]

Keywords: database information technology golang
Mon Mar 04 21:11:00 UTC 2024
