Job Details

Data Engineer - 100% remote (Remote, USA)

Email: pankaj@stellentit.com

From: pankaj, Stellent IT (pankaj@stellentit.com)

Reply to: pankaj@stellentit.com

Data Engineer

100% remote

Phone and Skype

Long Term

Job Description:

Technical Skills:

Skill - Years/Level of Experience

MongoDB - P4 - Expert

PostgreSQL - P4 - Expert

Cloud Application Architecture - P4 - Expert

Experience levels:

P1 - Beginner (0-2 yrs experience)

P2 - Intermediate (3-5 yrs experience)

P3 - Advanced (7-10 yrs experience)

P4 - Expert (10+ yrs experience)

Role Description: The Data Engineer provides ETL support to the data science and software engineering teams. Build, modify, and support infrastructure for optimal extraction, transformation, and loading of data from a variety of structured and unstructured data sources and a multi-terabyte distributed file system. The candidate will formulate and rapidly prototype various approaches, communicate the pros and cons of each effectively, and provide data-driven approaches to tackle various business problems. The candidate will contribute to a high-performing, motivated workgroup by applying interpersonal and collaboration skills to achieve project goals.

Architect an ML data pipeline with data acquisition and preprocessing functionality that gathers data from a heterogeneous data pool: the distributed file system, unstructured text extracted from millions of images of medical records with varied OCR quality, their metadata from relational databases, and custom annotations.

Responsibilities:

Provide current system architecture documentation, engineering and web development programming support for tasks defined by program/project requirements, and technical assessments related to data science and data engineering

Manage/maintain structured, semi-structured, and unstructured data, structuring and wrangling data as appropriate for statistical analysis

Implement data warehouse concepts, relational databases, and big data management techniques and tools (e.g. Hadoop, MapReduce)

Communicate with technical and non-technical users and managers, and perform server administration, including hardware and software support for existing servers

Provide software engineering support to operate, maintain and enhance systems that are integrated with and/or relied upon by the data engineering lifecycle

Integrate, analyze, and visualize data and information in near real-time (within 24 hours) from multiple disparate data sources.

Optimize data storage and access

Proficiency with Python, Java, Oracle Enterprise Manager, SQL, and AWS

Qualifications:

Master's degree in a related field + 5 years of experience; or PhD + 1 year of experience; or Bachelor's degree in a related field + 7 years of experience

Minimum of 5 years of experience conducting ETL tasks, performance engineering, run-time optimization, and large data volume transfers

Minimum of 3 years of experience with regular expressions, SQL (PostgreSQL), and NoSQL (MongoDB)

Minimum of 1 year of experience with version control systems (Git)

Preference given to developers with experience working with healthcare data and health IT

Skills/Tools Utilized (at least 1-2 years of experience in some of the following):

Apache Hadoop (Cloudera)

AWS Data Platforms (Redshift, S3, EMR/Hive)

SQL

Java

Kafka

Scala

Kotlin

Neo4j

NiFi

Flink

Sqoop

PostgreSQL

EMR

Apache Spark

Python

PHP

Oracle

Splunk

BDD testing framework: Cucumber

Knowledge of and experience using various NLP approaches, particularly:

Pattern recognition/feature extraction

Supervised, Unsupervised, and Semi-Supervised learning techniques

Understanding of various language models (n-gram, skip-gram, NLM, etc.)

Chunking/Tokenization

Semantic parsing

Skills highly desired:

Healthcare IT experience

Statistical model building (particularly classification)

Education Level

Master's

Option to Hire

Yes No

Terms as indicated in the supplier's Federal Contractor Exchange MSA.

Work Location

On-site (Government / AFS Site): Remote

On-site %: 0%

Keywords: machine learning, S3, information technology


Thu Sep 07 20:20:00 UTC 2023
