Urgent Requirement || Software Engineer IV || Menlo Park, CA (Onsite) || Immediate Interview at Park, Kansas, USA |
Email: [email protected] |
From: Shruti, 1 Point System [email protected]
Reply to: [email protected]

Hi,

This is Shruti Singh from 1 Point System. I hope you are doing well. I have a job opening for you.

Job Description

Job Role: Software Engineer IV
Location: Menlo Park, CA (Onsite), Contract

The AI SW/HW Co-Design team's mission is to explore, develop, and help productize high-performance software and hardware technologies for AI at datacenter scale. The team also supports:
- Model co-design for scalability and efficiency of production recommender and content understanding models
- Software libraries, models, and frameworks for training systems
- GPU optimizations

About This Featured Opportunity

In this role you will:
- Develop next-gen model compression and model efficiency/scalability optimizations
- Support programming in C/C++ and Python (proficiency in at least one language required, both preferred)
- Support performance at the core: identify and address ML model accuracy and performance bottlenecks
- Support computing at scale, enabling large-scale software deployments in production
- Develop software solutions for cutting-edge (including off-market) hardware systems

Key Success Factors
- GPU performance optimizations
- CUDA programming
- C/C++ and Python proficiency (5-7 years of industry experience required)
- Experience with some subset of the following HPC systems software:
  - Accelerator (GPU/ASIC) kernel development and optimization (e.g., NVIDIA, AMD, Intel, or other accelerators)
  - Numerical libraries (e.g., mixed-precision linear algebra, tensor-based frameworks, TensorFlow)
  - Performance enablement, tracing, profiling, and debugging

Preferred Experience
- Scientific computing or other forms of HPC with an AI/ML/DL emphasis
- Experience scaling research across different modalities
- Experience with AI workload optimizations (compression, quantization, and pruning techniques; graph-based systems)
- Datacenter-scale distributed AI training and inference experience
- Performance, programmability, and efficiency at datacenter scale (data access optimizations such as prefetching and caching, designing scalable frameworks for efficient use of high-performance hardware, high-performance and fault-tolerant middleware, network and communication fabric optimization)

Keywords: C programming, C++, artificial intelligence, machine learning, California, Colorado |
Thu Aug 24 22:00:00 UTC 2023 |