SRE or DevOps, local to San Jose, California only, USC or GC, at California, Maryland, USA
Email: [email protected]
From: Badal Kanojia, Stellentit

[email protected]

Reply to: [email protected]

SRE Product Reliability

San Jose, California

Phone + Skype

Job Description:

Key Responsibilities:

Design and Implementation: Develop and implement observability solutions for Kubernetes-based applications using Fluent Bit, CloudWatch, Stackdriver, Grafana Loki, Grafana Tempo, Prometheus, Envoy Health Probes, OpenTelemetry, and ArgoCD.

Monitoring and Logging: Configure and maintain logging pipelines using Fluent Bit to collect, process, and route logs for storage and analysis.

Metrics and Tracing: Set up Prometheus for metrics collection and Grafana Tempo for distributed tracing. Integrate these with Grafana for real-time monitoring and alerting via OpenTelemetry.

Telemetry: Use OpenTelemetry to instrument applications for better traceability and observability.

CI/CD: Use ArgoCD for continuous deployment, and integrate the observability tools into the CI/CD pipeline so that the observability suite is deployed through it.

Observability Optimization: Analyze and optimize the performance of the observability stack to ensure minimal overhead and maximum efficiency.

Troubleshooting: Proactively identify and resolve issues related to the observability infrastructure. Collaborate with development and operations teams to troubleshoot and resolve incidents.

Documentation and Training: Document observability processes and best practices. Provide training and support to other team members on the observability tools and techniques.

Required Skills and Qualifications:

Experience: Proven experience as an SRE or in a similar role, with a strong focus on observability in Kubernetes environments supporting applications on EKS in AWS.

Technologies: Hands-on experience with Fluent Bit, CloudWatch, Stackdriver, Grafana Loki, Grafana Tempo, Prometheus, Envoy Health Probes, OpenTelemetry, and ArgoCD.

Kubernetes: In-depth knowledge of Kubernetes and container orchestration.

Scripting and Automation: Proficiency in scripting languages such as Python, Bash, or similar for automation tasks.

Monitoring and Logging: Strong understanding of monitoring, logging, and tracing concepts and best practices.

Problem Solving: Excellent analytical and problem-solving skills.

Collaboration: Strong communication skills and the ability to work effectively in a team environment.

Continuous Improvement: A proactive attitude toward identifying opportunities for improvement and implementing solutions.

Preferred Qualifications:

Certifications: Relevant certifications such as Certified Kubernetes Administrator (CKA) or Certified Kubernetes Application Developer (CKAD).

Cloud Platforms: Experience with cloud platforms such as AWS and EKS.

DevOps Practices: Familiarity with DevOps practices and tools.

Keywords: continuous integration continuous deployment
Fri Jul 26 22:38:00 UTC 2024
