| Likhil P - AI ML Engineer - 5 years EXP |
| [email protected] |
| Location: Remote, USA |
| Relocation: Yes |
| Visa: OPT EAD |
|
LIKHIL PENUJULI
Email: [email protected] | GitHub: github.com/Likhil2k1

Professional Summary:
Experienced AI/ML Engineer with 5 years building scalable machine learning and GenAI systems across Azure and GCP, specializing in RAG pipelines, real-time streaming, and large-scale data processing for enterprise and research-driven environments
Designed and deployed end-to-end ML pipelines using Azure ML, Vertex AI, and BigQuery, processing 25M+ transactions/day and 18M+ healthcare records, enabling low-latency inference, anomaly detection, and intelligent decision-making systems
Strong expertise in GenAI architectures including RAG, semantic search, and vector search, implementing embedding pipelines and LLM-powered systems for knowledge retrieval, fraud detection, and clinical intelligence use cases
Proficient in big data technologies including PySpark, Spark, and distributed data pipelines, handling 3TB+ datasets and enabling scalable batch and streaming processing for high-throughput enterprise workloads
Specialized in deep learning models including CNN, LSTM, and hybrid architectures, with experience in anomaly detection, time-series modeling, and improving model robustness through evaluation, backtesting, and drift analysis
Experienced in healthcare and financial domains, leveraging FHIR APIs and structured data pipelines to build compliant, secure, and interoperable AI systems aligned with enterprise data standards
Strong background in model explainability using SHAP and interpretability techniques, generating 800+ explanations to improve transparency, support regulatory compliance, and enhance trust in AI-driven decisions
Skilled in building data engineering pipelines using Azure Data Factory, Synapse, and Cloud Storage, enabling reliable ETL workflows, data validation, and integration across 12+ enterprise systems
Proven ability to collaborate with cross-functional teams, including 7+ stakeholders, translating business and research requirements into scalable AI solutions while aligning models with real-world operational workflows
Passionate about advancing GenAI and ML systems, focusing on scalability, performance optimization, and delivering measurable impact across enterprise AI platforms and scientific research initiatives

Technical Skills:
Programming & Data Processing: Python, SQL, R, C++, PySpark, Spark, NumPy, Pandas, React.js
Cloud Platform & Services: Microsoft Azure, Azure ML, Azure Synapse Analytics, Azure Data Factory, Azure Storage, Azure Monitor
GenAI & Machine Learning: LLMs, RAG, NLP, Deep Learning, CNN, LSTM, Hybrid Models, Anomaly Detection, Fraud Detection, Predictive Modeling
Machine Learning & AI Platforms: Vertex AI, Machine Learning Pipelines, Feature Engineering, Model Training Pipelines, Model Deployment, Model Monitoring, Fraud Detection Models
Vector Search & Retrieval: Vector Search, Embedding Pipelines, Semantic Search, Retrieval Pipelines
Big Data & Data Engineering: ETL, Data Pipelines, Batch Processing, Real-Time Streaming, Data Transformation, Feature Engineering, Data Quality Validation
Healthcare & Data Standards: FHIR APIs, Clinical Data Integration, Healthcare Data Modeling, Data Interoperability
Model Explainability & Evaluation: SHAP, Model Evaluation, Backtesting, Experiment Tracking, Model Governance
Data Visualization & Reporting: Tableau, Dashboards, Data Visualization, Analytics Reporting
Workflow Orchestration: Azure Data Factory, ML Pipeline Orchestration, ETL Workflow Automation
Version Control & Collaboration: Git, Agile Methodologies, Stakeholder Collaboration, Research Documentation

Education:
M.S., Artificial Intelligence | University of North Texas
B.Tech., Computer Science & Engineering | Amrita Vishwa Vidyapeetham

Professional Experience:

AI/ML Engineer | Sentara Health | Remote | March 2025 - Present
Project: Built a GenAI-powered clinical knowledge assistant using Azure ML, FHIR APIs, and Synapse to process 18M+ healthcare records, enabling RAG pipelines, semantic search, and secure natural language access to clinical data for care teams.
Designed scalable RAG pipelines using Azure ML to ingest and process 18M+ clinical records, enabling structured retrieval and contextual understanding of patient histories, clinical notes, and healthcare policies for intelligent and accurate query responses
Built robust data ingestion workflows using Azure Data Factory to integrate 12+ healthcare systems, leveraging standardized FHIR APIs to normalize clinical data and ensure interoperability across diverse electronic health record environments
Developed advanced semantic search capabilities using vector search techniques on Azure ML, enabling clinicians to retrieve highly relevant patient information and clinical insights from large-scale unstructured healthcare datasets efficiently
Engineered optimized data transformation pipelines using Synapse, processing 3TB clinical datasets, enabling efficient storage, querying, and feature preparation for downstream GenAI, analytics, and predictive healthcare workloads
Implemented scalable chunking and embedding pipelines generating 120M+ text chunks, enabling efficient indexing, retrieval, and contextual grounding of clinical documents for accurate and context-aware response generation
Built high-performance low-latency inference workflows using Azure ML endpoints, delivering responses within 200 ms, enabling real-time access to clinical knowledge and decision support for healthcare professionals
Designed comprehensive data validation frameworks enforcing 15+ QC rules, ensuring accuracy, completeness, and consistency of clinical data ingested from multiple healthcare systems and improving downstream model reliability
Applied model explainability techniques using SHAP, generating 800+ explanations, enabling transparency in GenAI outputs and supporting compliance with healthcare regulatory and audit requirements
Developed interactive monitoring dashboards using Synapse analytics, tracking 25+ performance metrics, enabling visibility into system usage, query latency, and response accuracy for continuous operational improvements
Conducted extensive model evaluation workflows across 10+ clinical scenarios, validating performance of RAG pipelines and improving response quality for diverse healthcare use cases and real-world clinical queries
Implemented structured experiment tracking systems capturing 400+ model runs, ensuring reproducibility of ML pipelines and enabling systematic experimentation for continuous improvement of model performance
Collaborated with 7+ clinical stakeholders to refine retrieval and response logic, improving usability, aligning outputs with real-world clinical workflows, and enhancing trust in system-generated healthcare insights

ML Engineer | CITI Bank | Texas, USA | May 2024 - Feb 2025
Project: Built a scalable real-time fraud detection platform using GCP Vertex AI, BigQuery, and Cloud Storage to process and score 25M+ daily transactions, enabling low-latency inference, anomaly detection, and secure fraud prevention across enterprise banking systems globally.
Designed real-time data pipelines using BigQuery and Cloud Storage to ingest and process 25M+ daily transactions, enabling continuous fraud detection and supporting high-throughput processing across enterprise-scale banking systems
Built machine learning models using Vertex AI trained on 2TB of historical transaction data, enabling accurate detection of fraudulent activities and improving classification performance across diverse financial transaction patterns
Engineered low-latency inference pipelines using Vertex AI endpoints, delivering fraud risk scores within 150 ms, enabling real-time decisioning for transaction approvals and reducing exposure to high-risk activities
Implemented feature engineering pipelines in BigQuery generating 200+ features, capturing behavioral, transactional, and temporal patterns to enhance fraud detection accuracy and improve model robustness across banking datasets
Integrated streaming and batch processing workflows supporting 2 processing modes, enabling real-time fraud scoring alongside historical analysis pipelines for continuous model improvement and validation
Developed vector similarity search workflows using Vertex Vector Search indexing 50M+ embeddings, enabling detection of suspicious behavioral patterns and improving identification of previously unseen fraud scenarios
Designed data validation and quality checks enforcing 20+ rules, ensuring consistency and accuracy of incoming transaction data while reducing the impact of corrupted or incomplete financial records
Built fraud monitoring dashboards using BigQuery analytics, visualizing 30+ risk indicators, enabling analysts to track anomalies, investigate suspicious transactions, and support operational fraud response workflows
Conducted model evaluation and backtesting workflows across 12 months of historical data, validating fraud detection performance and improving resilience against evolving fraud patterns in dynamic financial environments
Implemented model explainability techniques using SHAP, generating 500+ explanations, enabling transparency in fraud predictions and supporting regulatory compliance requirements across enterprise banking systems
Developed experiment tracking systems capturing 300+ model runs, ensuring reproducibility of training pipelines and supporting structured experimentation for continuous fraud model optimization workflows
Collaborated with 8+ fraud analysts to tune model thresholds, reducing false positives while maintaining detection sensitivity and aligning outputs with real-world fraud investigation workflows

AI/ML Research Engineer | INCOIS | Hyderabad, India | Feb 2021 - Aug 2023
Project: Built a scalable real-time streaming ocean analytics platform using PySpark, Kafka, and deep learning models to detect marine anomalies across 15M+ sensor events/day, enabling low-latency alerting, scientific explainability, and robust model governance for climate monitoring.
Designed real-time streaming pipelines using PySpark to process 15M+ marine sensor events/day, enabling continuous anomaly detection across diverse oceanographic data sources while maintaining low-latency ingestion for mission-critical climate monitoring systems
Engineered Kafka-based streaming architecture to dispatch alerts across 8 downstream systems, enabling reliable low-latency alerting workflows and ensuring timely anomaly notifications for marine operations and operational intelligence
Developed CNN-based anomaly detection models trained on 3TB of historical ocean data, improving detection of complex spatial-temporal patterns across marine climate variables and supporting accurate identification of subtle oceanographic anomalies
Implemented LSTM and hybrid deep learning models across 3 architectures, enabling comparative evaluation to enhance drift-resistant anomaly detection and improve long-term stability of predictive models for evolving ocean climate patterns
Built batch and streaming inference pipelines supporting 2 deployment modes, enabling real-time scoring alongside retrospective analysis workflows for large-scale ocean datasets and ensuring flexible deployment strategies for research and operational needs
Standardized feature engineering and preprocessing pipelines across 20+ sensor types, harmonizing timestamp mismatches and resolution inconsistencies to create unified datasets that improved model training consistency and analytical integrity
Applied SHAP and Grad-CAM explainability techniques generating 1000+ interpretability visualizations, validating deep learning outputs and improving transparency of ocean anomaly predictions for interdisciplinary research teams
Designed data quality validation frameworks enforcing 15+ QC rules, detecting corrupt or partial sensor feeds and ensuring consistent, high-quality inputs for downstream anomaly detection models and analytics pipelines
Built Tableau dashboards visualizing 20+ anomaly indicators, enabling oceanographers to analyze temporal shifts, interpret anomaly signatures, and support data-driven decision-making across multiple climate monitoring scenarios
Conducted model backtesting workflows across 10+ historical climate scenarios, validating anomaly detection performance and improving robustness against seasonal variations and environmental changes observed in long-term ocean datasets
Developed experiment tracking and metadata systems capturing 500+ model runs, ensuring reproducibility of model training workflows and enabling structured tracking of experiments for scientific reporting and validation processes
Collaborated with 6+ oceanographers to tune model thresholds, reducing false-positive alerts while maintaining sensitivity across critical anomaly categories and aligning model outputs with real-world oceanographic interpretations

Certifications:
AWS Certified Machine Learning
Microsoft Certified: Azure AI Engineer Associate
Google Professional Machine Learning Engineer
Microsoft Certified: Azure Data Engineer

Projects:
Compliance-Aware Enterprise RAG Platform | FastAPI, FAISS, Pinecone, LangChain, AWS ECS, API Gateway
Designed and implemented a compliance-aware RAG platform to enable accurate, explainable, and auditable question-answering across enterprise payroll, insurance, and policy documentation environments.
Built ingestion pipelines for PDFs and structured data, implementing document chunking, semantic enrichment, and embedding generation to support scalable retrieval across large document repositories.
Implemented metadata-aware semantic retrieval using FAISS and Pinecone, enabling contextual filtering by policy version, jurisdiction, and document type for precise compliance query resolution.
Integrated hybrid retrieval strategies combining dense and keyword search, improving recall and ensuring coverage for compliance-critical queries requiring both semantic and lexical matching approaches.
Developed citation-backed answer generation ensuring responses are traceable to source documents, supporting auditability and regulatory review requirements in enterprise compliance environments.
Optimized inference cost through context compression and dynamic prompt construction, reducing token usage while maintaining high-quality responses across large-scale enterprise workloads.

Multi-Agent GenAI Workflow System | LangChain, LangGraph, CrewAI, FastAPI, Vector Databases
Architected a multi-agent GenAI system to automate enterprise workflows including case triage, policy reasoning, and escalation handling using coordinated intelligent agents.
Designed specialized agents for intake, reasoning, risk detection, and human review, enabling modular handling of complex multi-step enterprise decision-making workflows.
Implemented agent orchestration using LangChain, LangGraph, and CrewAI, enabling structured task delegation, parallel execution, and coordinated reasoning across multiple agents.
Built shared memory using vector databases to maintain conversation continuity and enable context-aware reasoning across agents in multi-step workflows.
Enforced domain guardrails and safety policies to prevent unsupported reasoning, ensuring compliance, reliability, and controlled outputs in enterprise environments.
Optimized prompt design and execution flow to reduce redundant LLM calls, improving efficiency and minimizing inference overhead across complex multi-agent workflows.

LLM Observability, Drift & Hallucination Monitoring Platform | MLflow, EvidentlyAI, Vector Embeddings, FastAPI
Designed and built a production-grade LLM observability platform to monitor quality, safety, and performance of deployed GenAI systems across enterprise environments.
Implemented hallucination detection metrics measuring semantic deviation between generated responses and source content, enabling continuous validation of model grounding and output accuracy.
Built semantic drift detection pipelines using embedding distance analysis, identifying shifts in data distributions and retrieval relevance over time in production systems.
Integrated prompt regression tracking to compare prompt versions across accuracy, latency, and cost, supporting controlled experimentation and optimization workflows.
Leveraged MLflow to track model, prompt, and retrieval configurations, enabling reproducibility, version control, and rollback capabilities for production GenAI systems.
Developed EvidentlyAI dashboards to visualize grounding quality, response stability, and confidence trends, providing actionable insights into system performance and reliability.

Keywords: cplusplus artificial intelligence machine learning javascript rlang microsoft mississippi |