Data Engineer - 100% remote at Remote, Remote, USA
Email: [email protected]
From: pankaj, Stellent IT, [email protected]

Reply to: [email protected]

Data Engineer

100% remote

Phone and Skype

Long Term

Job Description:

Technical Skills:

Skill                             Years/Level of Experience
MongoDB                           P4 - Expert
PostgreSQL                        P4 - Expert
Cloud Application Architecture    P4 - Expert

P1 Beginner (0-2yrs experience)

P2 Intermediate (3-5yrs experience)

P3 Advanced (7-10yrs experience)

P4 Expert (10+yrs experience)

Role Description: The Data Engineer provides ETL support to the data science and software engineering team members. The candidate will build, modify, and support infrastructure for optimal extraction, transformation, and loading of data from a variety of structured and unstructured data sources and a multi-terabyte distributed file system. The candidate will formulate and rapidly prototype various approaches, effectively communicate the pros and cons of each, and provide data-driven approaches to tackle various business problems. The candidate will have the ability to contribute to a high-performing, motivated workgroup by applying interpersonal and collaboration skills to achieve project goals. The candidate will architect an ML data pipeline with data acquisition and preprocessing functionality that gathers data from a heterogeneous data pool: the distributed file system, unstructured text extracted from millions of medical record images with varied OCR quality, their metadata from relational databases, and custom annotations.

Responsibilities:

        Provide current system architecture documentation, engineering and web development programming support for tasks defined by program/project requirements, and technical assessments related to data science and data engineering

        Manage/maintain structured, semi-structured, and unstructured data, structuring and wrangling data as appropriate for statistical analysis

        Implement data warehouse concepts and relational databases, big data management techniques and tools (e.g. Hadoop, MapReduce)

        Communicate with technical and non-technical users and managers, and perform server administration, including hardware and software support for existing servers

        Provide software engineering support to operate, maintain and enhance systems that are integrated with and/or relied upon by the data engineering lifecycle

        Integrate, analyze, and visualize data and information in near real-time (within 24 hours) from multiple disparate data sources.

        Optimize data storage and access

        Proficiency with Python and Java, Oracle Enterprise Manager, SQL, AWS

Qualifications:

        Master's degree in related field + 5 years' experience; or PhD + 1 year's experience; or Bachelor's degree in related field + 7 years' experience

        Minimum of 5 years' experience conducting ETL tasks, performance engineering, run-time optimization, large data volume transfers

        Minimum of 3 years' experience with regular expressions, SQL (PostgreSQL), NoSQL (MongoDB)

        Minimum of 1 year's experience with version control systems (Git)

        Preference given to developers with experience working with healthcare data and health IT

Skills/Tools Utilized (at least 1-2 years' experience in some of the following):

        Apache Hadoop (Cloudera)

        AWS Data Platforms (Redshift, S3, EMR/Hive)

        SQL

        Java

        Kafka

        Scala

        Kotlin

        Neo4j

        NiFi

        Flink

        Sqoop

        PostgreSQL

        EMR

        Apache Spark

        Python

        PHP

        Oracle

        Splunk

        BDD testing framework: Cucumber

        Knowledge of and experience using various NLP approaches, particularly:

                Pattern recognition/feature extraction

                Supervised, unsupervised, and semi-supervised learning techniques

                Understanding of various language models (N-gram, Skip-gram, NLM, etc.)

                Chunking/tokenization

                Semantic parsing

Skills highly desired:

        Healthcare IT experience

        Statistical model building (particularly classification)

Education Level

Master's

Option to Hire

Yes No

Terms as indicated in the supplier's Federal Contractor Exchange MSA.

Work Location

On-site (Government / AFS Site): Remote

On-site %:  0%

Keywords: machine learning sthree information technology
Thu Sep 07 20:20:00 UTC 2023
