Job Details

Site Reliability Engineer (only local to Dallas, TX) at Dallas, Texas, USA

Email: tim@pullskill.com

From:

Tim Patten,

Pullskill Technologies Inc

tim@pullskill.com

Reply to: tim@pullskill.com

Position Role/Title: Site Reliability Engineer

Duration: 1 year

Location: Dallas, TX or Charlotte, NC or Phoenix, AZ (onsite 3 days per week)

Responsibilities:

As a Software Engineer - SRE, you will be responsible for monitoring applications and responding to events, incidents, and changes originating from internal or vendor applications. You will investigate incidents and problems and determine root causes, using ServiceNow, Jira, Confluence, Splunk, Azure Monitor, and Google Cloud Monitoring.

Principal Duties and Responsibilities:

Troubleshoot and resolve issues in live production environments and implement strategies to eliminate them with minimal support.

Manage applications through automation.

Support and monitor new and existing services, platforms, and application stacks.

Engage in improving the lifecycle of service deployment, operations, and refinement.

Provide technical expertise during service-impacting events.

Collaborate with other engineers on code reviews, internal infrastructure improvements, and process enhancements.

Use scalability testing to measure, tune, and optimize system performance.

Participate in periodic 24x7 on-call duties.

Be accountable for resolving outages via a workaround or permanent fix.

Ensure all administration and reports are maintained and up to date, including contact information, technical diagrams, and post-major-incident reviews.

Communicate with stakeholders and send out IT communications.

Ensure the effective implementation of the Incident, Change, and Problem Management processes and conduct the respective reporting procedures.

Monitor the incidents to ensure that the Service Level Agreement is respected.

Identify, initiate, schedule, and conduct incident reviews.

Ensure the closure of all resolved and end-user-confirmed Incident records.

Establish continuous process improvement cycles in which process performance, activities, roles and responsibilities, policies, procedures, and supporting technology are reviewed and enhanced where applicable.

Lead proofs of concept on Splunk implementation, Splunk indexing, and plugins; mentor and guide other team members on understanding Splunk use cases.

Knowledge of Splunk Enterprise deployments and of enabling continuous integration as part of configuration management (props.conf, transforms.conf, inputs.conf, outputs.conf, deployment.conf).

Knowledge of log parsing and complex Splunk searches, including external table lookups, Splunk data flow, components, features, and product capability.

Knowledge of setting up alerts and monitoring recipes from machine-generated data.

Position specifications:

Education: Bachelor's Degree or Equivalent

Experience, Skills & Qualifications:

5+ years of experience in Site Reliability Engineering

Experience with one or more Cloud Platforms (Azure, AWS, GCP)

Experience with container technologies: Kubernetes, Docker, PKS

Experience setting up monitoring in applications and databases.

Experience in third-party services and third-party vendor management

Experience in ServiceNow

Excellent verbal, written, and interpersonal communication skills.

Keywords: information technology Arizona North Carolina Texas
Mon Mar 25 19:44:00 UTC 2024

Location: Dallas, Texas