
MLOps Engineer Level 2: Remote, USA
Email: [email protected]
From: Upama, CBS, [email protected]
Reply to: [email protected]

MLOps Engineer Level 2: Remote

Job Description

Kroger is currently looking for an MLOps Engineer to take a pivotal role in managing the installation, modification, and support of Linux applications within its supernode platform. This platform is a cluster of high-performance computers designed to streamline AI model training and testing. The ideal candidate will bring expertise in Kubernetes, Slurm, JupyterHub, and Linux administration to keep this environment running smoothly.

Candidates with experience in AI development, video analytics, and other tech disciplines will find this role particularly engaging, as the platform's capabilities extend beyond AI. This position offers a unique opportunity to contribute to the broader technological advancements of the organization.

Requirements:
Proficiency in Linux administration is essential, with a strong preference for deep expertise in Linux environments. Windows experience is acceptable but does not substitute for a solid grasp of Linux.
Demonstrated ability to install, modify, and support Linux applications. Experience with JupyterHub is a plus.
Familiarity with cluster management, particularly negotiating resources across multiple computers simultaneously. Knowledge of the Bright software is highly desirable.
Familiarity with Slurm for job scheduling; any prior hands-on experience is an advantage.
Competence in container management, including using Docker to build, push, and pull containers.
Knowledge of maintaining High-Performance Computing (HPC) systems and the components that make up this infrastructure.

Key Responsibilities
Collaborate with the AI team to customize the environment, ensuring it is optimized for AI development.
Work closely with the infrastructure team to configure and manage physical hardware and the underlying operating system.
Implement and manage partitioning on the supernode, allocating resources for different environments (Jupyter, Slurm, Linux shell, Docker containers, etc.).
Provide support and administration for Kubernetes, aiding in the integration of various providers.
Continuously evolve processes and ways of working to maximize the platform's efficiency, ultimately reducing the need for external support.

Note to Vendors
6-12 month contract, with possibility to extend
Video pre-screen consists of 5 video questions and a games section
This role supports the R&D AI Tech Domain area.

Keywords: artificial intelligence, rlang, information technology
Thu Sep 14 23:08:00 UTC 2023
