| SRE Monitoring Engineer W 2 at Sunnyvale, California, USA |
| Email: [email protected] |
|
http://bit.ly/4ey8w48
https://jobs.nvoids.com/job_details.jsp?id=3020401&uid=c891f863c8674c778d526b842394fecf
From: Preeti, tekinspirations [email protected]
Reply to: [email protected]

Job Description
Title: Senior SRE Monitoring Engineer (W2)
Location: Sunnyvale, CA (hybrid schedule)
Duration: 6-9 month project
Interview: F2F

We are only considering candidates with Medical Devices/Pharma industry experience who are local to the Sunnyvale, CA area and can work onsite.

As a Senior Production Engineer, you will serve as a technical leader responsible for supporting the architecture of, securing, and sustaining the production infrastructure behind our regulated digital health and medical software platforms. You will ensure the reliability, scalability, and compliance of critical systems in alignment with FDA GxP guidelines and HITRUST standards for healthcare data protection. We are looking for candidates with SRE monitoring experience.

You will lead initiatives in incident response, deployment automation, observability, and capacity planning, leveraging modern DevOps/SRE methodologies, cloud-native technologies, and advanced tooling. Collaborating across engineering, quality, and compliance teams, you will ensure our solutions remain both safe and effective for patient care while meeting stringent regulatory requirements.

Key Responsibilities:
Support the design, implementation, and sustainment of CI/CD pipelines with embedded, auditable deployment processes.
Promote infrastructure-as-code using Terraform, Helm, and Ansible, incorporating HITRUST and GxP controls into reusable modules.
Architect and maintain highly available, scalable, and compliant systems leveraging Kubernetes and cloud platforms (AWS, GCP, Azure).
Apply SRE principles by defining, measuring, and improving reliability metrics (SLIs/SLOs/SLAs) in regulated healthcare environments.
Lead capacity planning, performance tuning, and infrastructure optimization initiatives focused on regulatory and privacy requirements.
Manage the full incident lifecycle (detection, triage, resolution, postmortem), documenting as required for FDA compliance and audit readiness.
Develop and maintain incident response playbooks, including IT and regulatory escalation protocols.
Implement and manage monitoring solutions (Datadog, Prometheus, Grafana, Elasticsearch) to support rapid issue identification in compliance with healthcare mandates.
Integrate and manage SIEM tools (Splunk, Datadog Security, Elastic Security) for log aggregation, threat detection, and support of regulatory audits (HITRUST, GxP).
Collaborate with security, quality assurance, and regulatory teams to monitor and respond to production security incidents.
Ensure logging, auditing, and reporting meet FDA, HITRUST, ISO 27001, and healthcare industry standards, including data retention, traceability, and privacy safeguards.
Document and communicate infrastructure processes clearly to facilitate internal knowledge transfer and external audit readiness.
Plan and manage resource utilization to meet both performance goals and regulatory efficiency standards.
Troubleshoot and support cloud/network issues, ensuring secure handling of protected health information (PHI) and device data.

Qualifications:
Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
7+ years in Production Engineering, DevOps, or SRE roles within healthcare, medical device, or life sciences industries.
Expertise in containerization (Kubernetes, Docker), cloud platforms, and infrastructure-as-code.
Direct experience supporting systems subject to FDA GxP and HITRUST compliance; familiarity with HIPAA, SOC2, and ISO 27001 frameworks.
Strong skills in scripting/automation (Python, Bash, Go).
Proven track record managing SIEM and monitoring platforms in regulated environments.
In-depth knowledge of incident response and reliability engineering in healthcare/medical device settings.
Certifications in cloud security, DevOps, and/or healthcare compliance (e.g., HITRUST, AWS Security) strongly preferred.

Preferred Skills:
Experience deploying and supporting medical device software under FDA regulations.
Familiarity with quality management systems, validation procedures, and documentation for regulatory audits and FDA submissions.
Strong communication and leadership skills for cross-functional collaboration in a regulated setting.
Ability to innovate while working within strict compliance constraints.

Role Summary
Senior SRE Monitoring Engineer (ISIJP00011446)
Intuitive Surgical is seeking a Senior SRE / Production Engineer with strong monitoring and observability expertise to support highly regulated digital health and medical software platforms. This role is responsible for ensuring reliability, scalability, security, and compliance of production systems in accordance with FDA GxP, HITRUST, and healthcare regulatory standards. The engineer will lead incident response, monitoring/alerting strategy, capacity planning, and infrastructure automation while partnering closely with security, quality, compliance, and engineering teams. This is a hands-on role focused on SRE best practices, cloud-native infrastructure, Kubernetes, CI/CD, and enterprise monitoring/SIEM tools in a regulated healthcare environment.
________________________________________
MUST-HAVES (Non-Negotiable Screening Criteria)

Core SRE / Production Engineering
7+ years in SRE, Production Engineering, or DevOps
Hands-on experience supporting production systems (not just development)
Strong ownership of incident response (on-call, triage, RCA, postmortems)
________________________________________
Monitoring & Observability (KEY FOCUS)
Hands-on experience with monitoring tools, such as:
  o Datadog
  o Prometheus / Grafana
  o Elastic (ELK)
Experience defining and managing SLIs, SLOs, SLAs
Ability to design alerting strategies that balance reliability and compliance
________________________________________
Security, Logging & SIEM
Experience with SIEM / log aggregation tools:
  o Splunk, Datadog Security, Elastic Security
Strong understanding of:
  o Logging, auditing, traceability
  o Incident detection and escalation
Experience supporting security incidents in production
________________________________________
Regulated Environment Experience (HARD REQUIREMENT)
Direct experience working in regulated healthcare, medical device, or life sciences environments
Hands-on exposure to:
  o FDA GxP
  o HITRUST
  o Familiarity with HIPAA, SOC2, ISO 27001
Experience producing audit-ready documentation
________________________________________
Cloud, Containers & Infrastructure
Strong hands-on experience with:
  o Kubernetes & Docker
  o Cloud platforms: AWS, GCP, or Azure
Infrastructure-as-Code using:
  o Terraform
  o Helm
  o Ansible
Experience embedding compliance controls into IaC and CI/CD pipelines
________________________________________
Automation & Scripting
Strong scripting skills in:
  o Python
  o Bash
  o Go (preferred but not mandatory)
Automation of deployments, monitoring, and remediation workflows
________________________________________
Keywords: continuous integration, continuous deployment, information technology, golang, California
|
| 01:43 AM 30-Dec-25 |