Machine Learning (MLOps) Engineer - Hollywood, FL (Remote, locals only) at Hollywood, Florida, USA |
Email: [email protected] |
Role: Machine Learning Operations (MLOps) Engineer - AWS (with LLM Focus)
Location: Hollywood, FL (Remote, locals only)

Responsibilities:
LLM-Optimized MLOps Infrastructure: Design and implement MLOps infrastructure on AWS tailored for LLMs, using services such as SageMaker, EC2 (with GPU instances), S3, ECS/EKS, and Lambda.
LLM Deployment Pipelines: Build and manage CI/CD pipelines specifically for LLM deployment, addressing challenges such as model size, inference optimization, and versioning.
LLMOps Practices: Implement LLMOps best practices for monitoring model performance, drift detection, prompt management, and feedback loops for continuous improvement.
RESTful API Development: Design and develop RESTful APIs that expose LLM capabilities to other applications and services, ensuring scalability, security, and performance.
Model Optimization: Apply techniques such as quantization, distillation, and pruning to optimize LLMs for efficient inference on AWS infrastructure.
Monitoring and Observability: Establish comprehensive monitoring and alerting to track LLM performance, latency, resource utilization, and potential biases.
Prompt Engineering and Management: Develop strategies for prompt engineering and management to improve LLM outputs and ensure consistency and safety.
Collaboration: Work closely with data scientists, researchers, and software engineers to integrate LLMs into production systems.
Cost Optimization: Continuously optimize LLMOps processes and infrastructure for cost efficiency while maintaining high performance and reliability.

Qualifications:
Experience: 3+ years of experience in MLOps or a related field, with hands-on experience deploying and managing LLMs.
AWS Expertise: Strong proficiency in AWS services relevant to MLOps and LLMs, including SageMaker, EC2 (with GPU instances), S3, ECS/EKS, Lambda, and API Gateway.
LLM Knowledge: Deep understanding of LLM architectures (e.g., Transformers), training techniques, and inference optimization strategies.
Programming Skills: Proficiency in Python and experience with infrastructure-as-code tools (e.g., Terraform, CloudFormation), REST API frameworks (e.g., Flask, FastAPI), and LLM libraries (e.g., Hugging Face Transformers).
Monitoring: Familiarity with monitoring and logging tools for LLMs, such as Prometheus, Grafana, and CloudWatch.
Containerization: Experience with Docker and container orchestration (e.g., Kubernetes, ECS) for LLM deployment.
Problem Solving: Excellent problem-solving and troubleshooting skills in the context of LLMs and MLOps.
Communication: Strong communication and collaboration skills for working effectively with cross-functional teams.

Keywords: continuous integration, continuous deployment, S3, information technology, Florida |
Sat Jun 29 01:34:00 UTC 2024 |