MLOps Engineer (Machine Learning Operations) at Remote, USA
Email: [email protected]
From: vivek, vyzeinc [email protected]
Reply to: [email protected]

MLOps Engineer (Machine Learning Operations)
Remote, 6 Months, Any Visa

These are truly MLOps Engineering roles. Candidates will be interviewed on 1) Machine Learning/Data Science, 2) Data Engineering/ETL, and 3) DevOps with Kubernetes.

** Do not submit DevOps Engineers who do not know ML/Data Science, or Data Scientists without Kubernetes/DevOps experience. They will not pass the interview. **

Required skills:
- Must have MLOps experience, not just Data Science or Data Engineering
- Data Engineering: Azure based; Azure required (Azure Databricks, ADO, etc.)
- Machine Learning / Data Science: Python is a must have
- DevOps/Kubernetes: at least a self-rating of 6 out of 10 on experience, 10 being expert
- Spark: required
- Has deployed and worked with any of these tools: JupyterHub, MLflow, Databricks, Kubeflow

Must Haves:
- 8+ years of experience in Data and DevOps Engineering
- Prior experience in MLOps Engineering
- Strong hands-on experience in Kubernetes (5+ years)
- Strong hands-on experience in Python (5+ years)

The team is developing an AI/ML model inferencing pipeline that automates the extraction of data elements from documents and from streaming source data, leveraging the elastic nature of the cloud to optimize cost for different use cases.

Tasks:
- Assist in the design, development, testing, deployment, maintenance, and enhancement of Machine Learning Pipelines using K8s/AKS-based Argo Workflow orchestration solutions.
- Participate in and contribute to design reviews with the platform engineering team to decide on designs, technologies, project priorities, deadlines, and deliverables.
- Work closely with the Data Lake and Data Science teams to understand their data structures and machine learning algorithms.
- Understand ETL pipelines, and ingress/egress methodologies and design patterns.
- Implement real-time Argo workflow pipelines, integrate the pipelines with machine learning models, and deliver data and model results to business stakeholders' Data Lake.
- Develop distributed Machine Learning Pipelines for training and inferencing using Argo, Spark, and AKS.
- Build highly scalable backend REST APIs to collect data from the Data Lake and for other use cases/scenarios (a minimal sketch follows this list).
- Deploy applications in Azure Kubernetes Service using GitLab CI/CD, Jenkins, Docker, kubectl, Helm, and manifests.
- Branch, tag, and maintain versions across the different environments in GitLab.
- Review code developed by other developers and provide feedback to ensure best practices (e.g., check-ins, accuracy, testability, and efficiency).
- Debug, track, and resolve issues by analyzing their sources and their impact on application, network, or service operations and quality.
- Perform functional, benchmark, and performance testing and tuning for the built workflows.
- Assess, design, and optimize resource capacities (e.g., memory, GPU) for ML-based, resource-intensive workloads.
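As a rough illustration of the backend REST API task above, the following is a minimal, hypothetical Python sketch of a Flask inference endpoint that loads an MLflow-logged model and returns predictions over HTTP. The model URI, route name, and payload shape are illustrative assumptions, not details taken from this posting.

# app.py -- minimal, hypothetical sketch of an inference-serving REST API
import mlflow.pyfunc
import pandas as pd
from flask import Flask, jsonify, request

app = Flask(__name__)

# Hypothetical model reference; in practice this could point at a model
# registered in Azure Databricks / an MLflow Model Registry.
MODEL_URI = "models:/document-extraction/Production"
model = mlflow.pyfunc.load_model(MODEL_URI)

@app.route("/predict", methods=["POST"])
def predict():
    # Expect a JSON list of feature records, e.g. [{"feature_a": 1.0, ...}, ...]
    records = request.get_json(force=True)
    frame = pd.DataFrame(records)
    # Assuming the model returns a 1-D array-like of scalar predictions.
    predictions = model.predict(frame)
    return jsonify({"predictions": pd.Series(predictions).tolist()})

if __name__ == "__main__":
    # Local testing only; on AKS this would typically run behind a WSGI server.
    app.run(host="0.0.0.0", port=8080)

A POST of a JSON array of feature records to /predict would return the model's predictions as JSON; deployment to AKS via Docker, Helm, and GitLab CI/CD, as listed above, would wrap this same service.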
ESSENTIAL FUNCTIONS
- Designs and writes complex code in several languages relevant to the existing product stack, with a focus on automation
- Configures, tunes, maintains, and installs application systems and validates system functionality
- Monitors and fine-tunes application systems to achieve optimum performance levels and works with hardware teams to resolve issues with hardware and software
- Develops and maintains the department's knowledge database containing enterprise issues and possible resolutions
- Develops models of the task problem domain for which a system will be designed or built
- Uses models, hypotheses, and cognitive analysis techniques to elicit real problem-solving knowledge from the experts
- Mediates between the expert and the knowledge base; encodes for the knowledge base
- Acts as subject matter expert for difficult or complex application problems requiring interpretation of AI tools and principles
- Researches and prepares reports and studies on various aspects of knowledge acquisition, modeling, management, and presentation
- Develops and maintains processes, procedures, models, and templates for collecting and organizing knowledge into specialized knowledge representation programs
- Acts as vendor liaison for products and services to support development tools
- Maintains the definition, documentation, training, testing, and activation of Disaster Recovery/Business Continuity Planning to meet compliance standards
- Maintains a comprehensive operating system hardware and software configuration database/library of all supporting documentation to ensure data integrity
- Acts to improve the overall reliability of systems and to increase efficiency
- Works collaboratively with cross-functional teams, using Agile/DevOps principles to bring products to life, achieve business objectives, and serve customer needs

What skills/technologies are required:
- Bachelor's/Master's degree in Computer Science or Data Science
- 5 to 8 years of experience in software development and with data structures/algorithms
- 5 to 7 years of experience with the Python programming language
- Strong understanding of and experience with Kubernetes for availability and scalability of applications in Azure Kubernetes Service
- 5 years of experience developing large-scale infrastructure, distributed systems, or networks, plus experience with compute technologies and storage architecture
- 5 years of experience writing unit and functional test cases using PyTest and UnitTest, and mocking external services, for functional and non-functional requirements (see the test sketch below)
- Experience with cloud tools such as Azure and Google Cloud Platform
- Strong understanding of microservices architecture and experience building and deploying REST APIs using Python, Flask, and Django
- Experience building and deploying applications on Azure using third-party tools (e.g., Docker, Kubernetes, and Terraform)
- Experience with development tools and CI/CD pipelines such as GitLab CI/CD, Artifactory, CloudBees, and Jenkins

Keywords: continuous integration, continuous deployment, artificial intelligence, machine learning
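To illustrate the PyTest-with-mocking requirement above, here is a minimal, hypothetical Python sketch that unit-tests a Data Lake client call by patching the external HTTP service with unittest.mock. The function name, URL, and dataset name are invented for illustration and are not taken from this posting.

# test_datalake_client.py -- minimal, hypothetical unit test with a mocked external service
from unittest.mock import Mock, patch

import requests

def fetch_records(base_url, dataset):
    # Collect records for a dataset from a (hypothetical) Data Lake REST API.
    response = requests.get(f"{base_url}/datasets/{dataset}/records", timeout=30)
    response.raise_for_status()
    return response.json()

@patch("requests.get")
def test_fetch_records_returns_parsed_json(mock_get):
    # Arrange: stub the external service so the test never touches the network.
    mock_get.return_value = Mock(status_code=200)
    mock_get.return_value.json.return_value = [{"id": 1, "value": "a"}]

    # Act
    records = fetch_records("https://datalake.example.com", "claims")

    # Assert: the client returns the parsed payload and called the expected URL.
    assert records == [{"id": 1, "value": "a"}]
    mock_get.assert_called_once_with(
        "https://datalake.example.com/datasets/claims/records", timeout=30
    )

Running pytest against this file exercises the client logic without any network access, which is the point of mocking external services in both unit and functional test suites.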
Fri Nov 03 21:29:00 UTC 2023