MLOps-LLMOps Architect (Remote) at Remote, USA

From: Sanjeev, HMG America LLC (sanjeev@hmgamerica.com)
Reply to: sanjeev@hmgamerica.com

Title: MLOps/LLMOps Architect
Location: Remote. Dallas preferred, but any east, central, or south location is OK given the situation.
Job Description:

About the Role
We are seeking an experienced MLOps/LLMOps Architect with deep expertise in building scalable, production-grade machine learning and generative AI pipelines. The ideal candidate will have strong experience in observability, model performance monitoring, and using tools such as Arize to ensure the reliability and scalability of AI/ML models in production. This role requires a strategic mindset and hands-on technical skills to design and implement robust MLOps/LLMOps frameworks, ensuring seamless model deployment, monitoring, and optimization.
Key Responsibilities
Architect and Implement MLOps/LLMOps Frameworks:
Design and build scalable MLOps/LLMOps pipelines for model training, deployment, monitoring, and retraining.
Establish automated CI/CD pipelines to streamline model development and deployment.
Model Observability and Monitoring:
Develop and implement model observability strategies using tools like Arize to track model performance, drift, and bias.
Create real-time dashboards and alerts for proactive issue identification and resolution.
Performance and Scalability:
Ensure high availability, low latency, and scalability of deployed models.
Optimize model inference and serving using best practices in distributed computing and cloud infrastructure.
Manage and optimize compute costs for large-scale Gen AI models by implementing intelligent load balancing, autoscaling, and infrastructure tuning.
Model Governance and Compliance:
Establish frameworks for model versioning, auditing, and explainability to meet regulatory and business requirements.
Ensure alignment with Responsible AI and ethical AI guidelines.
Cross-Functional Collaboration:
Partner with data scientists, ML engineers, platform teams, and business stakeholders to align MLOps strategies with business objectives.
Provide technical leadership and mentorship to junior team members.
Required Skills and Qualifications
Experience: 10 to 15 years in machine learning, MLOps, and AI model deployment in enterprise environments.
MLOps/LLMOps Expertise: Strong background in MLOps and LLMOps, including model lifecycle management, monitoring, and automation.
Observability Tools: Proficient in using observability platforms such as Arize, Weights & Biases, TensorBoard, MLflow, or similar tools.
Cloud Platforms: Experience with cloud-based ML solutions (e.g., AWS, Azure, GCP).
Programming: Strong programming skills in Python and experience with ML frameworks such as TensorFlow, PyTorch, and Hugging Face.
Containerization and Orchestration: Hands-on experience with Docker, Kubernetes, and distributed computing frameworks.
Model Monitoring: Experience in detecting and mitigating model drift, bias, and data quality issues.
Performance Tuning: Expertise in model optimization, inference acceleration, and efficient resource utilization.

Keywords: continuous integration, continuous deployment, artificial intelligence, machine learning
https://jobs.nvoids.com/job_details.jsp?id=2261152


11:02 PM 17-Mar-25

