Sr. Data Engineer - Hybrid - Onsite in McLean, VA at McLean, Virginia, USA
Email: [email protected]
From: Riyas, Xforia Inc [email protected]
Reply to: [email protected]

Important notes:
- Object-oriented programming experience using Python.
- Experience with SQL.
- API experience, preferably with familiarity with Boto3.
- Experience with PySpark and a solid understanding of big data.

Two-round interview process: 1st round - 60 minutes, 2nd round - 30 minutes.
Interview dates: 1st round on 12/1 & 12/4, 2nd round targeting 12/6 & 12/7.

Additional insights from the manager:
- Looking for someone to build pipelines using YAML config files for data product teams.
- YAML experience is necessary; candidates should be proficient in Python and able to read JSON and YAML files (a minimal illustrative sketch of this kind of config-driven pipeline follows the responsibilities list below).
- Desired experience: 3-5 years as a data engineer, with strong Python and PySpark skills.
- Almost all tasks are in Python; a strong understanding of Python and PySpark is crucial.
- AWS experience is not mandatory but good to know; the basics of AWS are sufficient.
- SQL proficiency is essential; familiarity with databases like Postgres, Snowflake, and Hadoop is preferred.
- The data lake is defined as S3 and Snowflake.
- API calls through the Python SDK (Boto3) are required for internal APIs at Freddie Mac.

Final notes:
- API calls through the Python SDK for internal APIs at Freddie Mac are crucial.
- Experience with AWS is beneficial but not mandatory.
- SQL proficiency is a key requirement.
- PySpark and Hadoop are essential for this role.
- The candidate will be working on building pipelines using YAML config files.

Job Description (Data Engineer)

Must haves:
- Object-oriented programming experience using Python.
- SQL.
- API experience is mandatory, with a preference for familiarity with Boto3.
- Experience with PySpark and a solid understanding of big data.

Schedule: Standard

Position Overview
Freddie Mac's Investments & Capital Markets Division is currently seeking a Senior Data Engineer who enjoys data and building data storage platforms from the ground up. The ideal candidate has a passion for data analysis, technology, and helping people leverage technology to transform their business processes and analytics. As a Data Engineer, you will be part of a team responsible for supporting a wide range of internal customers. You will draw on all the skills in your toolkit to analyze, design, and develop data storage and data analytics solutions using data lake patterns that help our customers run more effective operations and make better business decisions.

Your work falls into two primary categories:

Strategy Development and Implementation
- Develop data filtering, transformation, and loading requirements.
- Define and execute ETLs using Apache Spark on Hadoop, among other data technologies.
- Determine appropriate translations and validations between source data and target databases.
- Implement business logic to cleanse and transform data.
- Design and implement appropriate error-handling procedures.
- Develop project, documentation, and storage standards in conjunction with data architects.
- Monitor performance, troubleshoot, and tune ETL processes as appropriate using tools in the AWS ecosystem.
- Create and automate ETL mappings to consume loan-level data from source applications into target applications.
- Execute end-to-end implementation of the underlying data ingestion workflow.
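Purely as an illustration of the config-driven pipeline work the manager describes above: the sketch below assumes a hypothetical YAML layout, bucket, file paths, and column names (none of these are Freddie Mac specifics), and shows a Python/PySpark job that pulls its config from the S3 data lake with Boto3 and applies the configured transformations.

    # configs/loan_pipeline.yaml -- hypothetical layout, for illustration only
    pipeline_name: loan_level_ingest
    source:
      format: parquet
      path: s3://example-data-lake-bucket/raw/loans/
    transform:
      columns: [loan_id, upb, note_rate, origination_date]
      filters:
        - "upb > 0"
    target:
      format: parquet
      mode: overwrite
      path: s3://example-data-lake-bucket/curated/loans/

    # Minimal sketch only; the bucket, key, and config fields above are assumptions.
    import yaml          # PyYAML
    import boto3
    from pyspark.sql import SparkSession

    def load_config(bucket: str, key: str) -> dict:
        """Fetch a YAML pipeline config from the S3 data lake via Boto3."""
        s3 = boto3.client("s3")
        body = s3.get_object(Bucket=bucket, Key=key)["Body"].read()
        return yaml.safe_load(body)

    def run_pipeline(cfg: dict) -> None:
        """Read the configured source, apply column selection and filters,
        and write the result to the configured target."""
        spark = SparkSession.builder.appName(cfg["pipeline_name"]).getOrCreate()
        df = spark.read.format(cfg["source"]["format"]).load(cfg["source"]["path"])
        transform = cfg.get("transform", {})
        if "columns" in transform:
            df = df.select(*transform["columns"])
        for predicate in transform.get("filters", []):
            df = df.filter(predicate)
        (df.write
           .mode(cfg["target"].get("mode", "overwrite"))
           .format(cfg["target"]["format"])
           .save(cfg["target"]["path"]))
        spark.stop()

    if __name__ == "__main__":
        # Hypothetical locations; a real job would take these as arguments.
        run_pipeline(load_config("example-data-lake-bucket", "configs/loan_pipeline.yaml"))

The same load_config pattern covers the JSON configs mentioned in the manager's notes by swapping yaml.safe_load for json.loads.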
Operations and Technology
- Leverage and align work to appropriate resources across the team to ensure work is completed in the most efficient and impactful way.
- Understand capabilities of, and current trends in, the data engineering domain.

Qualifications
- At least 5 years of experience developing in Python and SQL (Postgres/Snowflake preferred).
- Bachelor's degree (or equivalent work experience) in computer science, data science, or a related field.
- Experience working with different databases and an understanding of data concepts (including data warehousing, data lake patterns, structured and unstructured data).
- 3+ years of experience with data storage/Hadoop platform implementation, including 3+ years of hands-on experience implementing and performance tuning Hadoop/Spark.
- Implementation and tuning experience, specifically using Amazon Elastic MapReduce (EMR).
- Experience implementing AWS services in a variety of distributed computing, enterprise environments.
- Experience writing automated unit, integration, regression, performance, and acceptance tests.
- Solid understanding of software design principles.

Key to success in this role
- Strong consultation and communication skills.
- Ability to work with and collaborate across the team, even where silos exist.
- Deep curiosity to learn about new trends and how to do things better.
- Ability to use data to help inform strategy and direction.

Top Personal Competencies to possess
- Seek and Embrace Change: continuously improve work processes rather than accepting the status quo.
- Growth and Development: know or learn what is needed to deliver results and successfully compete.

Preferred Skills
- Understanding of Apache Hadoop and the Hadoop ecosystem. Experience with one or more relevant tools (Sqoop, Flume, Kafka, Oozie, Hue, Zookeeper, HCatalog, Solr, Avro).
- Deep knowledge of Extract, Transform, Load (ETL) and distributed processing techniques such as MapReduce.
- Experience with columnar databases like Snowflake and Redshift.
- Experience building and deploying applications in AWS (EC2, S3, Hive, Glue, EMR, RDS, ELB, Lambda, etc.).
- Experience building production web services.
- Experience with cloud computing and storage services.
- Knowledge of the mortgage industry.

Thanks & Regards,
Riyas Deen
Xforia Inc, 99300 Wade Boulevard, Suite 220, Frisco TX 75035
Email: [email protected] | Website: www.xforia.com
Wed Nov 22 03:07:00 UTC 2023