Opportunity: Sr. Data Engineer with Python and PySpark | McLean, VA | Hybrid (McLean, Virginia, USA)
Email: [email protected]
From: Deepak, Xforia Inc [email protected]
Reply to: [email protected]
Title: Sr. Data Engineer with Python and PySpark
Location: McLean, VA | Hybrid
Duration: Long Term Contract

Job Description:

Must Have:
- Object-oriented programming experience using Python.
- Experience with SQL.
- API experience, preferably with familiarity with Boto3.
- Experience with PySpark and a solid understanding of big data.

Strategy Development and Implementation:
- Develop data filtering, transformation, and loading requirements.
- Define and execute ETLs using Apache Spark on Hadoop, among other data technologies.
- Determine appropriate translations and validations between source data and target databases.
- Implement business logic to cleanse and transform data.
- Design and implement appropriate error-handling procedures.
- Develop project, documentation, and storage standards in conjunction with data architects.
- Monitor performance, troubleshoot, and tune ETL processes as appropriate using tools in the AWS ecosystem.
- Create and automate ETL mappings to consume loan-level data from source applications to target applications.
- Execute the end-to-end implementation of the underlying data ingestion workflow.

Qualifications:
- At least 5 years of experience developing in Python and SQL (Postgres/Snowflake preferred).
- Bachelor's degree or equivalent work experience in computer science, data science, or a related field.
- Experience working with different databases and an understanding of data concepts (including data warehousing, data lake patterns, and structured and unstructured data).
- 3+ years of experience with data storage/Hadoop platform implementation, including 3+ years of hands-on experience implementing and performance-tuning Hadoop/Spark implementations.
- Implementation and tuning experience, specifically with Amazon Elastic MapReduce (EMR).
- Experience implementing AWS services in a variety of distributed-computing enterprise environments.
- Experience writing automated unit, integration, regression, performance, and acceptance tests.
- Solid understanding of software design principles.

Preferred Skills:
- Understanding of Apache Hadoop and the Hadoop ecosystem.
- Experience with one or more relevant tools (Sqoop, Flume, Kafka, Oozie, Hue, Zookeeper, HCatalog, Solr, Avro).
- Deep knowledge of Extract, Transform, Load (ETL) and distributed processing techniques such as MapReduce.
- Experience with columnar databases like Snowflake and Redshift.
- Experience building and deploying applications in AWS (EC2, S3, Hive, Glue, EMR, RDS, ELB, Lambda, etc.).
- Experience building production web services.
- Experience with cloud computing and storage services.
- Knowledge of the mortgage industry.

www.xforia.com
Deepak M.V
Recruitment Manager
9300 Wade Boulevard, Suite 220, Frisco TX 75035, USA.
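For illustration, below is a minimal sketch of the kind of PySpark and Boto3 ETL work the posting describes: Boto3 lists loan-level Parquet objects under an S3 prefix, PySpark cleanses and transforms the records, and the result is loaded back to S3. All bucket names, prefixes, paths, and column names (loan_id, status, balance) are assumptions made for the example, not details from the role.

```python
import boto3
from pyspark.sql import SparkSession
from pyspark.sql import functions as F


def list_input_keys(bucket, prefix):
    """List Parquet object keys under an S3 prefix using Boto3."""
    s3 = boto3.client("s3")
    paginator = s3.get_paginator("list_objects_v2")
    keys = []
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        for obj in page.get("Contents", []):
            if obj["Key"].endswith(".parquet"):
                keys.append(obj["Key"])
    return keys


def run_etl(bucket, prefix, target_path):
    spark = SparkSession.builder.appName("loan-etl-example").getOrCreate()

    # Extract: read the source files discovered via Boto3.
    paths = [f"s3a://{bucket}/{key}" for key in list_input_keys(bucket, prefix)]
    df = spark.read.parquet(*paths)

    # Transform/cleanse: drop records missing a loan id, normalize the
    # status column, and cast the balance to a fixed-precision decimal.
    cleaned = (
        df.filter(F.col("loan_id").isNotNull())
          .withColumn("status", F.upper(F.trim(F.col("status"))))
          .withColumn("balance", F.col("balance").cast("decimal(18,2)"))
    )

    # Basic validation / error handling: fail fast if nothing survives cleansing.
    if cleaned.limit(1).count() == 0:
        raise ValueError("No valid loan records found after cleansing")

    # Load: write partitioned Parquet to the target location.
    cleaned.write.mode("overwrite").partitionBy("status").parquet(target_path)
    spark.stop()


if __name__ == "__main__":
    # Hypothetical bucket, prefix, and target path for the sketch.
    run_etl("example-source-bucket", "loans/2023/",
            "s3a://example-target-bucket/curated/loans/")
```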
Wed Nov 22 02:28:00 UTC 2023