Machine Learning (MLOps) Engineer - Hollywood, FL (Remote, local candidates only)

Email: [email protected]

Role:
Machine Learning Operations (MLOps) Engineer - AWS (with LLM Focus)

Location: Hollywood, FL (Remote, local candidates only)

Responsibilities:

LLM-Optimized MLOps Infrastructure:
Design and implement MLOps infrastructure on AWS tailored for LLMs, leveraging services like SageMaker, EC2 (with GPU instances), S3, ECS/EKS, Lambda, and more.

LLM Deployment Pipelines:
Build and manage CI/CD pipelines specifically for LLM deployment, addressing unique challenges like model size, inference optimization, and versioning.

LLMOps Practices:
Implement LLMOps best practices for monitoring model performance, drift detection, prompt management, and feedback loops for continuous improvement.

RESTful API Development:
Design and develop RESTful APIs to expose LLM capabilities to other applications and services, ensuring scalability, security, and optimal performance.

Model Optimization:
Apply techniques such as quantization, distillation, and pruning to optimize LLMs for efficient inference on AWS infrastructure.

Monitoring and Observability:
Establish comprehensive monitoring and alerting mechanisms to track LLM performance, latency, resource utilization, and potential biases.

Prompt Engineering and Management:
Develop strategies for prompt engineering and management to enhance LLM outputs and ensure consistency and safety.

Collaboration:
Work closely with data scientists, researchers, and software engineers to integrate LLMs into production systems effectively.

Cost Optimization:
Continuously optimize LLMOps processes and infrastructure for cost-efficiency while maintaining high performance and reliability.

Qualifications:

Experience:
3+ years of experience in MLOps or a related field, with hands-on experience in deploying and managing LLMs.

AWS Expertise:
Strong proficiency in AWS services relevant to MLOps and LLMs, including SageMaker, EC2 (with GPU instances), S3, ECS/EKS, Lambda, and API Gateway.

LLM Knowledge:
Deep understanding of LLM architectures (e.g., Transformers), training techniques, and inference optimization strategies.

Programming Skills:
Proficiency in Python and experience with infrastructure-as-code tools (e.g., Terraform, CloudFormation), REST API frameworks (e.g., Flask, FastAPI), and LLM libraries (e.g., Hugging Face Transformers).

Monitoring:
Familiarity with monitoring and logging tools for LLMs, such as Prometheus, Grafana, and CloudWatch.

Containerization:
Experience with Docker and container orchestration (e.g., Kubernetes, ECS) for LLM deployment.

Problem Solving:
Excellent problem-solving and troubleshooting skills in the context of LLMs and MLOps.

Communication:
Strong communication and collaboration skills for working effectively with cross-functional teams.

Sat Jun 29 01:34:00 UTC 2024
