SRE Monitoring Engineer W 2 at Sunnyvale, California, USA
Email: [email protected]
http://bit.ly/4ey8w48
https://jobs.nvoids.com/job_details.jsp?id=3020401&uid=c891f863c8674c778d526b842394fecf

From:

Preeti,

tekinspirations

[email protected]

Reply to: [email protected]

Job Description -
Title - SRE Monitoring Engineer (W2)
Location - Sunnyvale, CA hybrid schedule
Senior SRE Monitoring Engineer
Duration: 6-9 month project
Interview: face-to-face (F2F)
We are only considering candidates with medical device/pharma industry experience who are local to the Sunnyvale, CA area and can work onsite.
As a Senior Production Engineer, you will serve as a technical leader responsible for supporting the architecture of, securing, and sustaining the production infrastructure behind our regulated digital health and medical software platforms. You will ensure the reliability, scalability, and compliance of critical systems in alignment with FDA GxP guidelines and HITRUST standards for healthcare data protection.
We are looking for candidates with SRE monitoring experience.

You will lead initiatives in incident response, deployment automation, observability, and capacity planning, leveraging modern DevOps/SRE methodologies, cloud-native technologies, and advanced tooling. Collaborating across engineering, quality, and compliance teams, you will ensure our solutions remain both safe and effective for patient care while meeting stringent regulatory requirements.

Key Responsibilities:
Support the design, implementation, and sustainment of CI/CD pipelines with embedded, auditable deployment processes.
Promote infrastructure-as-code using Terraform, Helm, and Ansible, incorporating HITRUST and GxP controls into reusable modules.
Architect and maintain highly available, scalable, and compliant systems leveraging Kubernetes and cloud platforms (AWS, GCP, Azure).
Apply SRE principles: defining, measuring, and improving reliability metrics (SLIs/SLOs/SLAs) in regulated healthcare environments.
Lead capacity planning, performance tuning, and infrastructure optimization initiatives focused on regulatory and privacy requirements.
Manage the full incident lifecycle (detection, triage, resolution, postmortem), documenting as required for FDA compliance and audit readiness.
Develop and maintain incident response playbooks, including IT and regulatory escalation protocols.
Implement and manage monitoring solutions (Datadog, Prometheus, Grafana, Elasticsearch) to support rapid issue identification in compliance with healthcare mandates.
Integrate and manage SIEM tools (Splunk, Datadog Security, Elastic Security) for log aggregation, threat detection, and support of regulatory audits (HITRUST, GxP).
Collaborate with security, quality assurance, and regulatory teams to monitor and respond to production security incidents.
Ensure logging, auditing, and reporting meet FDA, HITRUST, ISO 27001, and healthcare industry standards, including data retention, traceability, and privacy safeguards.
Document and communicate infrastructure processes clearly to facilitate internal knowledge transfer and external audit readiness.
Plan and manage resource utilization to meet both performance goals and regulatory efficiency standards.
Troubleshoot and support cloud/network issues, ensuring secure handling of protected health information (PHI) and device data.
Qualifications:
Bachelor's or Master's degree in Computer Science, Engineering, or related field.
7+ years in Production Engineering, DevOps, or SRE roles within healthcare, medical device, or life sciences industries.
Expertise in containerization (Kubernetes, Docker), cloud platforms, and infrastructure-as-code.
Direct experience supporting systems subject to FDA GxP and HITRUST compliance; familiarity with HIPAA, SOC2, ISO 27001 frameworks.
Strong skills in scripting/automation (Python, Bash, Go).
Proven track record managing SIEM and monitoring platforms in regulated environments.
In-depth knowledge of incident response and reliability engineering in healthcare/medical device settings.
Certifications in cloud security, DevOps, and/or healthcare compliance (e.g., HITRUST, AWS Security, etc.) strongly preferred.
Preferred Skills:
Experience deploying and supporting medical device software under FDA regulations.
Familiarity with quality management systems, validation procedures, and documentation for regulatory audits and FDA submissions.
Strong communication and leadership skills for cross-functional collaboration in a regulated setting.
Ability to innovate while maintaining strict compliance constraints.

Role Summary: Senior SRE Monitoring Engineer (ISIJP00011446)
Intuitive Surgical is seeking a Senior SRE / Production Engineer with strong monitoring and observability expertise to support highly regulated digital health and medical software platforms. This role is responsible for ensuring reliability, scalability, security, and compliance of production systems in accordance with FDA GxP, HITRUST, and healthcare regulatory standards. The engineer will lead incident response, monitoring/alerting strategy, capacity planning, and infrastructure automation while partnering closely with security, quality, compliance, and engineering teams. This is a hands-on role focused on SRE best practices, cloud-native infrastructure, Kubernetes, CI/CD, and enterprise monitoring/SIEM tools in a regulated healthcare environment.
________________________________________
MUST-HAVES (Non-Negotiable Screening Criteria)
Core SRE / Production Engineering
7+ years in SRE, Production Engineering, or DevOps
Hands-on experience supporting production systems (not just development)
Strong ownership of incident response (on-call, triage, RCA, postmortems)
________________________________________
Monitoring & Observability (KEY FOCUS)
Hands-on experience with monitoring tools, such as:
o Datadog
o Prometheus / Grafana
o Elastic (ELK)
Experience defining and managing SLIs, SLOs, SLAs
Ability to design alerting strategies that balance reliability and compliance
________________________________________
Security, Logging & SIEM
Experience with SIEM / log aggregation tools:
o Splunk, Datadog Security, Elastic Security
Strong understanding of:
o Logging, auditing, traceability
o Incident detection and escalation
Experience supporting security incidents in production
________________________________________
Regulated Environment Experience (HARD REQUIREMENT)
Direct experience working in regulated healthcare, medical device, or life sciences environments
Hands-on exposure to:
o FDA GxP
o HITRUST
o Familiarity with HIPAA, SOC2, ISO 27001
Experience producing audit-ready documentation
________________________________________
Cloud, Containers & Infrastructure
Strong hands-on experience with:
o Kubernetes & Docker
o Cloud platforms: AWS, GCP, or Azure
Infrastructure-as-Code using:
o Terraform
o Helm
o Ansible
Experience embedding compliance controls into IaC and CI/CD pipelines
________________________________________
Automation & Scripting
Strong scripting skills in:
o Python
o Bash
o Go (preferred but not mandatory)
Automation of deployments, monitoring, and remediation workflows
________________________________________

Keywords: continuous integration continuous deployment information technology golang California
01:43 AM 30-Dec-25


