AWS Architect with Open Telemetry || 100% Remote || No H1B || 14+ Years Profiles at Remote, Remote, USA
Email: [email protected]
From:

Abhishek Yadav,

DMS Vision Inc.

[email protected]

Reply to:   [email protected]

Hi,

Hope you are doing well!

I have an urgent position. Kindly go through the job description and let me know if this would be of interest to you.

Job Title: Open Telemetry SME / AWS Architect

Location: 100% Remote

Duration: 6+ Months Contract

Job Description:

A LinkedIn profile is required for this position.

The candidate must have end-to-end visibility experience and be an architect with hands-on implementation experience. We do not need a monitoring engineer.

Client is seeking an experienced monitoring tools and Open Telemetry Subject Matter Expert (SME) who will be responsible for designing, implementing and optimizing monitoring solutions and leveraging Open Telemetry to enhance observability within the Enterprise Command Center (ECC).

The SME should collaborate with the Incident Management team to troubleshoot and resolve incidents.

Key Job Functions:

Lead the design and implementation of monitoring solutions using industry-standard tools such as Splunk and others.

Customize monitoring configurations to align with the organizational requirements.

Implement and integrate Open Telemetry across various applications and services for enhanced observability.

Optimize monitoring solutions for efficiency and accuracy, ensuring minimal impact on system performance.

Design and implement application and infrastructure performance monitoring in an AWS Cloud environment.

Create monitors and dashboards to monitor applications and infrastructure performance.

Perform deep statistical analysis using performance data to help identify capacity and performance bottlenecks.

Configure alerting mechanisms within monitoring tools to proactively identify and address potential issues.

Develop comprehensive documentation for monitoring tool configurations, Open Telemetry implementations and best practices.

Provide training to incident management teams on utilizing monitoring tools and interpreting Open Telemetry data effectively.

Set up monitoring dashboards for incident detection and alerting.

Perform end-to-end analysis of transactions in an observability environment.

Troubleshoot incidents and identify root cause quickly using wire data analytics, application performance management and event correlation monitoring tools.

Diagnose and resolve incidents by providing factual data from the various monitoring and instrumentation systems.

Job Requirements:

A good understanding of the IT Cloud infrastructure that includes AWS Cloud, middleware, database, storage and/or network infrastructure.

Strong understanding of IT infrastructure, networking, security concepts and application architecture.

Hands-on experience with Open Telemetry instrumentation and telemetry data collection.

Proven experience as a Splunk SME with in-depth knowledge of Splunk architecture and components.

Excellent troubleshooting and problem-solving skills.

Strong documentation skills and attention to detail.

Proactive monitoring of hardware, software, and environmental alerts or malfunctions.

Analyze dashboards and monitoring tools to look for trends and patterns in application/infrastructure health and performance.

Monitor applications and infrastructure using tools like Splunk, Dynatrace, Catchpoint, Moogsoft, xMatters, SignalFx, SolarWinds, ExtraHop, etc.

Expert understanding of microservice-based applications deployed in the cloud using Lambda, ECS Fargate, etc.

Proficiency in AWS services like IAM, Roles, Security groups, EC2, S3, Lambda, ALB, ECS etc.

Experience working with AWS tools like ELB, RDS, Redshift, DynamoDB, Aurora, Route53, Lambda, S3, Batch, CloudWatch, CloudTrail, WAF etc.

Hands-on experience with transaction-level monitoring using Dynatrace and Splunk.

Create Splunk search queries and dashboards.

Be the SME in helping recognize and onboard new data sources into Splunk and other tools, analyze the data for anomalies and trends, and build dashboards highlighting the key trends in the data.

Implement best-in-class engineering strategies to support a distributed, clustered Splunk environment consisting of Search Heads, Indexers, Forwarders, and the Splunk Enterprise Security (ES) app, spanning security, performance, engineering, and operational roles.

Use the open-source observability framework OpenTelemetry for instrumenting, generating, collecting, and exporting telemetry data such as traces, metrics, and logs to help analyze application performance and behavior.

Use distributed tracing in an end-to-end visibility environment that consists of microservices, containers, serverless, and Lambda.

Work closely with application teams and business stakeholders to perform troubleshooting and aid in incident triage.

Influence other technical teams on incident calls and articulate troubleshooting steps effectively.

Follow up on items that could negatively impact production operations, assist with postmortem-related activities, and support various efforts related to operational improvements.

Strong relationship management skills and the ability to multitask and work well in a high-stress environment, both within teams and independently.

Preferred Qualifications:

Familiarity with distributed tracing and logging solutions.

Knowledge of Cloud Platforms (AWS, Azure) and their integration with monitoring tools.

AWS Solution Architect Associate or higher certification.

Exposure to working in an incident management environment.

Triage incidents to resolution in a 24/7/365 environment, effectively guide incident triage calls from a technical perspective, share technical details obtained from monitoring tools and dashboards to aid troubleshooting, outline details of resolution activities, provide timely status updates to stakeholders, assist with postmortem-related activities, and support various efforts related to operational improvements.

Ability to report incident details and metrics to senior leadership.

Perform analysis of data, evaluating multiple application protocols including web, database, storage, and supporting infrastructure such as UNIX, DNS, LDAP, SSL, SMTP, and FTP.

Proficient in scripting: UNIX/Linux shell scripting and Python. Working knowledge of JavaScript, Perl, etc. for customizing monitoring configurations.

Certification in relevant monitoring tools or Open Telemetry is a plus.

Education:

Bachelor's Degree or equivalent required.

Minimum Experience:

8 plus years of related experience

If you are interested, please share your updated resume and suggest the best number & time to connect with you.

Thanks & Regards,

Abhishek Yadav

DMS Vision Inc. | http://dmsvisions.com/

4645 Avon Lane, Suite 210, Frisco, TX 75033

Email: [email protected] | Phone: 972-645-5314 | Ext. 103

Keywords: sthree information technology golang Texas
[email protected]
Thu Mar 07 22:15:00 UTC 2024
