URGENT HIRING for ML Data Engineer with Vector DB and GenAI Skills (Remote, USA)

From:

Abhishek Mukherjee,

Appian Infotech

abhishek.m@appianinfotech.com

Reply to: abhishek.m@appianinfotech.com

Hi,

Abhishek Mukherjee here from Appian InfoTech. I am looking for an ML Data Engineer with Vector DB and GenAI skills for one of my clients. The role is remote, with occasional travel to PA. Please let me know if you're looking for a change. The job description is included below for your reference.

I can be reached at abhishek.m@appianinfotech.com or 276-910-0140.

ML Data Engineer with Vector DB and GenAI Skills.

Pennsylvania, US (occasional visits)

The candidate must be located in the EST or CST time zone and be willing to travel to the office in PA at the client's request.

The client will cover the associated expenses.

Reports To: AI Competency Lead

Summary:

We are seeking a passionate and skilled ML Data Engineer (Band 4B) to join our team in the USA. You will play a pivotal role in building and maintaining the data infrastructure and pipelines for our Generative AI applications. You will collaborate closely with the Generative AI Full Stack Architect and the MLOps Engineer to ensure the quality, security, and accessibility of data for our Generative AI models.

Responsibilities:

Design, develop, and implement data pipelines for ingesting, pre-processing, and transforming unstructured data (images, PDFs, audio, video) for Generative AI model training and inference.

Have an understanding of, or working experience with, vector databases (such as Pinecone, Redis, Chroma).

Understand large language models (Llama, GPT-4, Claude 2.0) and apply them to text summarization, entity extraction, and classification.

Build and maintain efficient data storage solutions, including data lakes, warehouses, and databases, appropriate for large-scale generative AI datasets.

Implement data security and governance policies to ensure the privacy and integrity of sensitive data used in Generative AI projects.

Collaborate with data scientists and engineers to understand data requirements for Generative AI models and translate them into efficient data pipelines.

Monitor and optimize data pipelines for performance, scalability, and cost-effectiveness.

Stay up to date on the latest advancements in data engineering tools and technologies (e.g., Apache Spark, Airflow, Snowflake, Databricks) and apply them to our Generative AI platform.

Document data pipelines and processes for clarity and transparency.

Communicate effectively with technical and non-technical stakeholders about data quality and availability for Generative AI projects.

Qualifications:

Bachelor's degree in Computer Science, Data Science, Statistics, or a related field, or equivalent experience.

6+ years of experience in data engineering or related roles, such as data pipeline development, data storage, or ETL/ELT processes.

Proven experience building and maintaining data pipelines for machine learning projects.

Strong understanding of data modeling principles, data quality measures, and data security best practices.

Proficiency in Python, SQL, and shell scripting (e.g., Bash).

Familiarity with cloud platforms (e.g., AWS, GCP, Azure) for data storage and processing, along with GenAI services such as AWS Bedrock.

Excellent communication, collaboration, and problem-solving skills.

Ability to work independently and as part of a team.

Passion for Generative AI and its potential to solve real-world challenges.

Must have:

Senior individual contributor with substantial data engineering expertise and leadership experience.

Manages complex data projects and initiatives with independent decision-making authority.

Provides technical guidance and mentorship to junior team members.

Has a demonstrated track record of success in delivering impactful data solutions.

Keywords: artificial intelligence machine learning access management database Pennsylvania


Thu Feb 22 01:54:00 UTC 2024
