PySpark Developer with AWS EMR Experience: Remote in EST Time Zone, Long Term, at Remote, USA |
Email: [email protected] |
From: Anilkumar, Siri Info Solutions [email protected]
Reply to: [email protected]

Hello,

Hope you are doing well! Greetings from Siri Info Solutions.

I am Anil Kumar with Siri Info Solutions. We, I5 Tech, are a global staff augmentation firm providing a wide range of talent on demand and total workforce solutions. We have an immediate opening for the position below with one of our premium clients.

Job Title: PySpark Developer with AWS EMR Experience
Duration: Long term
Work Location: Remote in EST time zone

Must Have:
Please provide candidates with hands-on experience in PySpark, AWS, SQL, Redshift, and Terraform.
- Valid LinkedIn profile
- Passport number
- Travel history
- Two professional references
Do not submit if you feel the candidate is fake; avoid submitting JNTU profiles.

Technical/Functional Skills (Mandatory skills):
- 5+ years of experience in programming with Python
- Strong proficiency in Python
- Familiarity with functional programming concepts
- 3+ years of hands-on experience in developing ETL data pipelines using PySpark on AWS EMR
- Hands-on experience with XML processing using Python
- Good understanding of Spark's RDD API
- Good understanding of Spark's DataFrame API
- Experience in configuring EMR clusters on AWS
- Experience with and good understanding of the Apache Spark Data Sources API
- Experience working with AWS S3 object storage from Spark
- Experience in troubleshooting Spark jobs
- Knowledge of monitoring Spark jobs using the Spark UI
- Performance tuning of Spark jobs

Nice to have skills:
- Knowledge of the AWS SDK and CLI
- Experience setting up continuous integration/deployment of Spark jobs to EMR clusters
- Knowledge of scheduling Spark applications on an AWS EMR cluster
- Understanding of the differences between Hadoop MapReduce and Apache Spark
- Proficient understanding of code versioning tools such as Git and SVN

Roles & Responsibilities:
- Design, develop, and implement performant ETL pipelines using the Python API (PySpark) of Apache Spark on AWS EMR
- Write reusable, testable, and efficient code
- Integrate data storage solutions in Spark, especially AWS S3 object storage
- Tune the performance of PySpark scripts
- Ensure that overall build and delivery quality is good and that delivery is on time
- Handle meetings with customers with ease
- Communicate excellently when interacting with customers
- Be a team player willing to work in an onsite-offshore model and mentor other team members (onsite as well as offshore)
Sat Jul 29 00:57:00 UTC 2023 |