OpenTelemetry SME at Remote, USA |
Email: [email protected] |
From: Farha Khan, Tek Inspirations LLC [email protected]
Reply to: [email protected]

Job Description
Role: OpenTelemetry SME
Location: Remote

A LinkedIn profile is required. The candidate needs experience designing with OpenTelemetry and must be an SME for Splunk; in short, the client wants an OpenTelemetry SME. A former Monitoring Architect who has moved to OpenTelemetry could be a fit. The role is architect level: someone who knows the ins and outs of monitoring, has implemented various tools end to end, and has done so on AWS. The role includes building dashboards.

Skills: A good understanding of IT cloud infrastructure, including AWS Cloud, middleware, database, storage, and/or network infrastructure. Strong understanding of IT infrastructure, networking, security concepts, and application architecture. Hands-on experience with OpenTelemetry instrumentation and telemetry data collection. Proven experience as a Splunk SME with in-depth knowledge of Splunk architecture and components. Excellent troubleshooting and problem-solving skills. Strong documentation skills and attention to detail.

Proactive monitoring of hardware, software, and environmental alerts or malfunctions. Analyze dashboards and monitoring tools to look for trends and patterns in application and infrastructure health and performance. Monitor applications and infrastructure using tools such as Splunk, Dynatrace, Catchpoint, Moogsoft, xMatters, SignalFx, SolarWinds, and ExtraHop.

Expert understanding of microservice-based applications deployed in the cloud using Lambda, ECS Fargate, etc. Proficiency in AWS services such as IAM, Roles, Security Groups, EC2, S3, Lambda, ALB, and ECS. Experience working with AWS tools such as ELB, RDS, Redshift, DynamoDB, Aurora, Route 53, Lambda, S3, Batch, CloudWatch, CloudTrail, and WAF. Hands-on experience with transaction-level monitoring using Dynatrace and Splunk. Create Splunk search queries and dashboards.
Be the SME in helping to recognize and onboard new data sources into Splunk and other tools, analyze the data for anomalies and trends, and build dashboards highlighting the key trends in the data. Implement best-in-class engineering strategies to support a distributed, clustered Splunk environment consisting of Search Heads, Indexers, Forwarders, and the Splunk Enterprise Security (ES) app, spanning security, performance, engineering, and operational roles.

Use the open-source observability framework OpenTelemetry to instrument, generate, collect, and export telemetry data such as traces, metrics, and logs to help analyze application performance and behavior. Use distributed tracing for end-to-end visibility in an environment that consists of microservices, containers, serverless, and Lambda.

Work closely with application teams and business stakeholders to troubleshoot and aid in incident triage. Influence other technical teams on incident calls and articulate troubleshooting steps effectively. Follow up on items that could negatively impact production operations, assist with postmortem-related activities, and support efforts related to operational improvements. Strong relationship-management skills and the ability to multitask and work well in a high-stress environment, both within teams and independently.

Keywords: sthree information technology |
Fri Feb 23 00:06:00 UTC 2024 |