Site Reliability Engineer - API Enablement W2 at Remote, Remote, USA
Email: shivani.sharma@adventatech.com
UID: 0be54aa95edb43119fc90d6a9740ebe3
From: Shivani Sharma, adventatech
Reply to: shivani.sharma@adventatech.com

Job Description -
 Site Reliability Engineer - API Enablement W2
Location: Onsite hybrid, 3 days a week, in Hartford, CT; Charlotte, NC; Chicago, IL; or Columbus, OH (no remote)
Must Have Skills:
- Site Reliability Engineering (SRE), including demonstrated ability to create dashboards in Splunk, Dynatrace, or similar tools, set up monitoring, and apply SRE best practices
- API Solutions (API Connect, Apigee, Azure API Management, or similar), API delivery, API Management
- Full stack software engineering, including DevOps, Performance Testing, Automation tools and scripts
Job Description: 
We are seeking a highly skilled and experienced API Enablement SRE Senior Staff Engineer to join our team. The ideal candidate will have a strong background in managing and optimizing complex systems and ensuring their reliability, scalability, and performance. This role focuses on enhancing our API Management Platforms and integrating SRE best practices.

Key Responsibilities:

API Platform and Enablement Team:
Design, implement, and maintain reliable and scalable SRE practices for API Management Platforms.
Apply strong knowledge of and experience with API solutions, platforms, API delivery, and API management.
Strengthen the maturity of SRE practices by building on and executing improvements to observability, resiliency, and stability.
Assess ecosystem changes to determine risk, impact, and checkout needs for API and Integration Platforms.
Proactively consider SRE improvements and GenAI opportunities, create solutions, and successfully execute them.
Create self-service capabilities to enable API provider teams to easily integrate with SRE API best practices.
Incident Management and On-Call Rotation:
Lead incident management, structured triage, and analysis, including the creation and management of incident runbooks.
Participate in on-call rotations for incidents and changes, including evenings and weekends.
Conduct problem analysis, remediation, and continuous improvement to enhance system reliability.
Views, Dashboards, and Unified Views:
Implement and maintain observability and monitoring solutions, including Splunk and Dynatrace.
Create unified views, dashboards, and visualizations to provide a single pane of glass and information radiators for system health and performance.
Create unified views that can be shared across stakeholders so they can quickly align on the root cause of an issue.
Resiliency and Strengthening SRE Maturity:
Design, implement, and maintain reliable and scalable systems and infrastructure.
Lead the team in SRE and proactive risk mitigation, including resiliency and disaster recovery exercises, change management, and upgrades and patches.
Level up SRE maturity and demonstrate it through the achievement of KPIs and operational metrics.
Performance and Automation:
Monitor and optimize the performance, availability, and reliability of systems and applications.
Develop and maintain automation tools and scripts to streamline operations and improve efficiency.
Risk Management and Metrics:
Define, operationalize, and integrate SRE-related KPIs, metrics, and ideas into day-to-day activities.
Proactively manage risks, including assessing findings, planning remediation, and executing that remediation to bring risks to prompt closure.
Qualifications:
Strong knowledge and experience in API solutions, platforms, API delivery, and API management.
Knowledge and skills in API Platforms (e.g., API Connect, Apigee, AWS API Gateway) and API Management.
5+ years of experience in site reliability engineering or a related field.
Expertise in SRE best practices, including incident management, resiliency, monitoring, detection, diagnosis, remediation, and prevention.
Demonstrated experience in being on call and resolving incidents, including incident management and root cause analysis.
Experience with large-scale distributed systems.
Knowledge of CI/CD pipelines and DevOps practices.
Experience with cloud platforms (e.g., AWS, Azure, GCP).
Strong knowledge of system design, development, and management.
Full stack software engineering skill set, including front-end, back-end, and database development.
Proficiency in scripting languages (e.g., Python, Bash) and automation tools (e.g., Ansible, Terraform).
Familiarity with monitoring and observability tools (e.g., Splunk, Dynatrace).
Demonstrated ability to mature SRE practices and strengthen stability through proven KPIs and metrics.
Excellent documentation, communication, problem-solving, and collaboration skills.
Experience with GenAI and innovation, and a commitment to continuous improvement.

https://jobs.nvoids.com/job_details.jsp?id=2266393
01:53 AM 19-Mar-25

