AI/ML & HPC Specialization) - Remote at Remote, Remote, USA
Email: [email protected]
(AI/ML & HPC Specialization)

Type: Contract

Location: USA Remote

Key Responsibilities:

Design, implement, and manage cloud-based infrastructure that supports AI/ML workflows.

Collaborate with data scientists and ML engineers to deploy scalable machine learning models into production.

Ensure the security, scalability, and reliability of AI/ML systems in the cloud.

Optimize cloud resources for cost-effective and efficient use.

Stay current with the latest in cloud services, AI/ML tools, and industry best practices.

Provide technical leadership and guidance in cloud and AI/ML architecture.

Develop and maintain CI/CD pipelines for AI/ML model training and deployment.

Monitor and troubleshoot AI/ML applications and cloud environments.

Document system design and operational procedures.

Collaborate with AI/ML and HPC teams to understand their computing and storage needs.

Qualifications:

Bachelor's or Master's degree in Computer Science, Engineering, or a related field.

Proven experience in cloud computing (AWS, Azure, GCP) and cloud architecture.

Strong background in AI/ML technologies, with experience in deploying ML models.

Proficiency in scripting languages (Python, Bash) and containerization technologies (Docker, Kubernetes).

Proficiency with virtual compute environments (EC2).

Hands-on experience with High Performance Computing (HPC) and server node cluster management.

Strong knowledge of Linux/Unix operating systems (RHEL/Ubuntu).

Experience with job schedulers (such as SLURM, PBS), resource management, and system monitoring tools (Dynatrace).

Understanding of storage solutions and file systems used in HPC (such as Lustre, GPFS).

Experience with infrastructure as code (IaC) tools like Terraform or CloudFormation.

Knowledge of networking, security, and database technologies in a cloud environment.

Excellent problem-solving, communication, and team collaboration skills.

Preferred Skills:

Familiarity with machine learning frameworks (TensorFlow, PyTorch) and data pipelines.

Certifications in cloud architecture (AWS Certified Solutions Architect, Google Cloud Professional Cloud Architect, etc.).

Experience in an Agile development environment.

Prior work with distributed computing and big data technologies (Hadoop, Spark).

Operational experience running large-scale platforms, including AI/ML platforms.

--

Thanks and Regards,

Praveen J

Email Address - [email protected]

Talent Acquisition Specialist

http://adepttechservices.com

11340 Lakefield Dr., Suite 200, Johns Creek, GA 30097

--

Keywords: continuous integration continuous deployment artificial intelligence machine learning information technology Georgia
Tue Jan 30 23:36:00 UTC 2024
