Opportunity: Sr. Data Engineer with Python and PySpark | McLean, VA | Hybrid (McLean, Virginia, USA)
Email: [email protected]
From: Deepak, Xforia Inc [email protected]
Reply to: [email protected]
Title: Sr. Data Engineer with Python and PySpark
Location: McLean, VA | Hybrid
Duration: Long Term Contract

Job Description:

Must Have:
- Object-oriented programming experience using Python.
- Experience with SQL.
- API experience, preferably with familiarity with Boto3.
- Experience with PySpark and a solid understanding of big data.

Strategy Development and Implementation:
- Develop data filtering, transformation, and loading requirements.
- Define and execute ETLs using Apache Spark on Hadoop, among other data technologies.
- Determine appropriate translations and validations between source data and target databases.
- Implement business logic to cleanse and transform data.
- Design and implement appropriate error-handling procedures.
- Develop project, documentation, and storage standards in conjunction with data architects.
- Monitor performance, troubleshoot, and tune ETL processes as appropriate using tools in the AWS ecosystem.
- Create and automate ETL mappings to consume loan-level data from source applications to target applications.
- Execute the end-to-end implementation of the underlying data ingestion workflow.

Qualifications:
- At least 5 years of experience developing in Python and SQL (Postgres/Snowflake preferred).
- Bachelor's degree or equivalent work experience in computer science, data science, or a related field.
- Experience working with different databases and an understanding of data concepts (including data warehousing, data lake patterns, and structured and unstructured data).
- 3+ years of experience with data storage/Hadoop platform implementation, including 3+ years of hands-on experience implementing and performance-tuning Hadoop/Spark implementations.
- Implementation and tuning experience, specifically with Amazon Elastic MapReduce (EMR).
- Experience implementing AWS services in a variety of distributed-computing enterprise environments.
- Experience writing automated unit, integration, regression, performance, and acceptance tests.
- Solid understanding of software design principles.

Preferred Skills:
- Understanding of Apache Hadoop and the Hadoop ecosystem.
- Experience with one or more relevant tools (Sqoop, Flume, Kafka, Oozie, Hue, Zookeeper, HCatalog, Solr, Avro).
- Deep knowledge of Extract, Transform, Load (ETL) and distributed processing techniques such as MapReduce.
- Experience with columnar databases like Snowflake and Redshift.
- Experience building and deploying applications in AWS (EC2, S3, Hive, Glue, EMR, RDS, ELB, Lambda, etc.).
- Experience building production web services.
- Experience with cloud computing and storage services.
- Knowledge of the mortgage industry.

www.xforia.com
Deepak M.V
Recruitment Manager
9300 Wade Boulevard, Suite 220, Frisco TX 75035, USA.
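For illustration, below is a minimal sketch of the kind of PySpark and Boto3 ETL work the posting describes: Boto3 lists loan-level Parquet objects under an S3 prefix, PySpark cleanses and transforms the records, and the result is loaded back to S3. All bucket names, prefixes, paths, and column names (loan_id, status, balance) are assumptions made for the example, not details from the role.

```python
import boto3
from pyspark.sql import SparkSession
from pyspark.sql import functions as F


def list_input_keys(bucket, prefix):
    """List Parquet object keys under an S3 prefix using Boto3."""
    s3 = boto3.client("s3")
    paginator = s3.get_paginator("list_objects_v2")
    keys = []
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        for obj in page.get("Contents", []):
            if obj["Key"].endswith(".parquet"):
                keys.append(obj["Key"])
    return keys


def run_etl(bucket, prefix, target_path):
    spark = SparkSession.builder.appName("loan-etl-example").getOrCreate()

    # Extract: read the source files discovered via Boto3.
    paths = [f"s3a://{bucket}/{key}" for key in list_input_keys(bucket, prefix)]
    df = spark.read.parquet(*paths)

    # Transform/cleanse: drop records missing a loan id, normalize the
    # status column, and cast the balance to a fixed-precision decimal.
    cleaned = (
        df.filter(F.col("loan_id").isNotNull())
          .withColumn("status", F.upper(F.trim(F.col("status"))))
          .withColumn("balance", F.col("balance").cast("decimal(18,2)"))
    )

    # Basic validation / error handling: fail fast if nothing survives cleansing.
    if cleaned.limit(1).count() == 0:
        raise ValueError("No valid loan records found after cleansing")

    # Load: write partitioned Parquet to the target location.
    cleaned.write.mode("overwrite").partitionBy("status").parquet(target_path)
    spark.stop()


if __name__ == "__main__":
    # Hypothetical bucket, prefix, and target path for the sketch.
    run_etl("example-source-bucket", "loans/2023/",
            "s3a://example-target-bucket/curated/loans/")
```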
Wed Nov 22 02:28:00 UTC 2023