New Position|| Direct Client|| Immediate Requirement at Plano, Texas, USA |
Email: [email protected] |
Dear Business Partner,

Position: 1
Position Title: Reliability and Monitoring Engineer
Location: Plano, TX
Overall experience should be 12+ years.

Responsible for ensuring the availability, performance, and reliability of our cloud-based infrastructure and services. The primary focus of this role is designing, implementing, and managing robust monitoring and alerting systems to proactively identify issues and enable timely incident response. This resource will work closely with the engineering platform and Development teams to optimize services and maintain service uptime.

Duties include:
Develop and maintain comprehensive monitoring solutions for cloud-based services and applications.
Configure monitoring tools and systems to collect relevant metrics, logs, and traces.
Create custom monitoring dashboards and reports, using DataDog or other tools, to provide real-time insights into system performance and health.
Continuously monitor the cloud infrastructure's performance and capacity, anticipating and addressing potential scalability issues.
Proactively suggest and implement improvements to enhance the system's reliability, resilience, and fault tolerance.
Automate tasks to streamline operational processes and reduce manual intervention.
Collaborate with cross-functional teams to investigate and resolve critical incidents, ensuring minimal impact on end users.
Work with the Problem Management team to complete post-mortem analysis of incidents, identify root causes, and implement preventive measures.

Ideal Qualifications:
3+ years of experience working with cloud platforms and services (AWS, Azure, GCP, etc.) in a production environment.
Solid understanding of monitoring and logging tools, such as Prometheus, Grafana, ELK stack, Splunk, etc.
Experience with infrastructure as code (IaC) tools, like Terraform, CloudFormation, or Ansible.
Strong scripting and automation skills (e.g., Python, Bash) to facilitate operational tasks.
Knowledge of containerization technologies (Docker, Kubernetes) and microservices architecture.
Familiarity with DevOps practices and Agile methodologies.

Position: 2
Position Title: Release Manager
Location: Plano, TX
Overall experience should be 15 years.

Responsible for ensuring the smooth and efficient deployment of software releases across cloud and mobile app environments. Coordinate and collaborate closely with cross-functional teams, including Development, DevOps, and QA, to ensure high-quality software releases while minimizing risk and downtime.

Duties include:
Collaborate with development teams and product owners to plan and schedule software releases, considering project timelines and business requirements.
Oversee the release calendar (Forward Schedule of Change) and coordinate with stakeholders to ensure smooth deployments with minimal disruption.
Establish release processes and best practices to maintain a consistent and efficient deployment pipeline.
Oversee configuration management processes, ensuring that all components, environments, and dependencies are well documented and version-controlled.
Facilitate change management procedures, validating proposed changes and evaluating potential impacts before deployment.
Continuously improve automation scripts and workflows to optimize release efficiency.
Conduct post-release reviews and root cause analysis (5 whys) to identify opportunities for improvement and implement preventive measures.
Design and implement a strategy that leverages Canary Regions to improve rollouts of software releases, minimizing potential impact on production environments.

Ideal Qualifications:
5+ years of experience in release management, deployment automation, and change management in a cloud-based environment.
Familiarity with Canary Region deployment strategies and related tools to manage gradual rollouts effectively.
Strong experience with CI/CD tools (e.g., Jenkins, GitLab CI/CD, CircleCI) and version control systems (e.g., Git).
Solid understanding of cloud platforms (e.g., AWS, Azure, GCP) and mobile application deployment processes (iOS/Android).
Proficiency in scripting and automation (e.g., Bash, Python) to streamline deployment workflows.
Knowledge of configuration management tools (e.g., Ansible, Chef, Puppet) and infrastructure-as-code (IaC) principles.

Thanks,
Rishi
Alpharetta, GA 30022
Fax Line: (866) 431-2320
www.cohetech.com

Keywords: continuous integration continuous deployment quality analyst Georgia Texas |
Thu Sep 14 19:18:00 UTC 2023 |