HADOOP ADMIN - OPERATIONS exp. - NC (HYBRID) at Charlotte, North Carolina, USA |
Email: [email protected] |
Hello Partner, Hope you doing well Job Title: Hadoop Admin (with Operations exp.) Location: Charlotte, NC Experience: 10 Years Job Description: In this role, you will: Lead complex technology initiatives towards Hadoop platform stability, automation initiatives. Platform Incident and Problem management o Be part of 24/7 platform support team to perform platform incident management and triaging team. o Assess the impact of the platform issue and prioritize the triage by partnering with platform tenants, platform development and Hadoop vendor teams. o Facilitate and perform root cause analysis of platform incidents towards determining permanent resolution involving Tenants and Hadoop vendor. o Communicate with Stakeholders, leadership team on the triaging progress status. o To troubleshoot tenant reported failures by deep diving onto client logs to identify the root cause. o Use Autosys as scheduler to schedule, execute platform routine activities. Hadoop Cluster Monitoring o Monitor Hadoop cluster health and performance. o Automate new monitoring opportunities leveraging available monitoring and alerting tools such as Prometheus, Thousand Eyes, Splunk etc. o Perform troubleshooting on the actionable alerts and tuning to ensure optimal cluster performance. Hadoop Cluster Configuration and tuning o Periodically tune and configure Hadoop ecosystem components such as HDFS, YARN, Hive, HBase, Spark, HPOS etc. o Schedule the Change request and track the approval process for implementation. Platform Backup and Recovery: o Ensure Hadoop platform is resilient with data backup and recovery strategies in place. o Perform emergency or scheduled failover of platform to Disaster recovery site and fall back. Upgrades and Patch Management: o Plan and execute upgrades of Hadoop ecosystem components and patches. o Ensure compatibility and smooth integration of new features and enhancements. o Work with Operating System administrators and patching automation build team to schedule and execute patching. o Perform Cluster health validation post patching and communicate with stakeholders. o Identify automation and process improvement opportunities related to patching to implement it. Documentation and Reporting: o Maintain comprehensive documentation of Hadoop cluster configurations & troubleshooting processes, and procedures. o Develop and Schedule to Generate reports on cluster usage, performance metrics, and capacity utilization. Collaboration and Support: o Ability to work both independently and in collaboration with Platform development and Platform tenant teams to troubleshoot and fix Hadoop platform issues, maintain production stability to enterprise-wide applications. o Collaborate and consult with key technical experts, senior technology team, and Hadoop vendor to resolve complex technical issues and achieve goals. Mandatory Skills and Qualifications: Proven Experience in technology incident and problem management role. Proven Experience in using monitoring tools like Prometheus, Splunk, SiteScope & thousand eyes etc., Proven experience as a Unix Scripting Proven experience in Hadoop administration or in a similar role. Strong knowledge of Hadoop ecosystem and its components (Hive, HBase, Spark, Hue ). Proven Experience is using Autosys or similar job scheduler. Excellent problem-solving and troubleshooting skills. Excellent communication and people skills for collaborating with cross-functional teams. Best, TEJA Professional Recruiter E: [email protected] Birmingham, Alabama 35212 www.pacificconsultancyservices.com -- Keywords: information technology North Carolina HADOOP ADMIN - OPERATIONS exp. - NC (HYBRID) [email protected] |
[email protected] View all |
Mon Oct 21 20:18:00 UTC 2024 |