Flink Administrator at Dallas, Texas, USA |
Email: [email protected] |
From: sunil, ICS [email protected]
Reply to: [email protected]

Role: Flink Administrator
Location: Dallas, TX (Onsite) / Jersey City, NJ (Day 1 Onsite)
Duration: Long Term

Who are we looking for:
As a Big Data Administrator, you will help maintain and administer on-premises and cloud-based big data platforms. You will help set up the platform, build automation, maintain the knowledge base and runbooks, troubleshoot, restore service on the platform, and provide support.

Your responsibilities:
Build and support on-premises Hadoop and Flink (Cloudera Streaming Analytics) platform infrastructure and applications.
Deploy Flink-based applications on the platform and use configuration management tools (such as Ansible, SaltStack, etc.) to manage them.
Deploy software to improve the availability, scalability, and efficiency of the platform.
Facilitate capacity planning and demand forecasting, software performance analysis, and system tuning.
Understand the end-to-end configuration, technical dependencies, and overall behavioral characteristics of production services.
Partner with development teams in defining and implementing improvements.
Propose solutions related to server hardware and software configuration, networking, standard internet services, scripting languages, cloud computing patterns, technology security, and compliance.
Troubleshoot priority incidents and facilitate blameless post-mortems.
Perform analytics on previous incidents and usage patterns to better predict issues and take proactive actions.
Work with development teams throughout the software life cycle, ensuring sustainable software releases.
Lead and participate in tests; identify bottlenecks, opportunities for optimization, and capacity demands.
Participate in 24x7 support coverage as needed.
Measure and optimize service performance.
Build tooling to enable observability services and automate CI/CD pipelines.
Provide technical escalation and contribute to the on-call rotation.
Automate monitoring systems to ensure uptime of production systems.
Troubleshoot end-to-end on private or public cloud infrastructure.
Provide infrastructure monitoring and reports for all performance metrics.

Technical Skills:
10+ years of experience in Hadoop and YARN infrastructure management and application deployment.
Hands-on experience maintaining and supporting Flink/Spark streaming applications in Hadoop/cloud environments.
4+ years of experience in DevOps and shell scripting.
SRE engineer with strong experience in monitoring, troubleshooting, and support.
Support rapid development and engineering productivity via release engineering, CI/CD and IaC automation, and build tools.
Perform health checks on applications and infrastructure to identify and proactively pre-empt issues (verification, alerts, etc.).
Experience with Python, including object-oriented programming.
Working experience with Splunk for log inventory, creating dashboards, etc., for various streams such as Linux.
Experience with Ansible, Puppet, SaltStack.
Container administration and development utilizing Kubernetes, Docker, Mesos, or similar.
Infrastructure automation through Terraform, Chef, Ansible, Puppet, Packer, or similar.
Experience with cloud orchestration frameworks, and with development and SRE support of these systems.
Experience with CI/CD pipelines, including VCS (git, svn, etc.), GitLab Runners, Jenkins, Rundeck.
Oracle Database knowledge in ATP and ADW, and programming in SQL and PL/SQL.
Cloud network experience.
Experience with Linux.
Experience working with fault-tolerant, highly available, high-throughput, distributed, scalable systems.
Integration with Code Deploy / GitHub Actions.
Experience in IaaS tools like CFT, Terraform.

Nice to have:
Experience with Kubernetes or another container orchestration framework.
Experience in public cloud-based solutions like Azure, GCP, AWS.

Keywords: continuous integration, continuous deployment, procedural language, New Jersey, Texas |
Fri Dec 08 00:59:00 UTC 2023 |