ARAVIND - AI/ML ENGINEER
Email: sarahp@ganait.com
Location: Tampa, Florida, USA
Relocation: YES |
Visa: H1B |
Professional Summary:
- AI/ML Engineer with 10 years of expertise in Data Science, Machine Learning, Deep Learning, and Large Language Models (LLMs), specializing in building, fine-tuning, and deploying AI-driven solutions at scale.
- Built, fine-tuned, and deployed large language models and generative AI solutions using GPT-4, Gemini, LLaMA, Falcon, and BLOOM.
- Proficient in diffusion models, GANs, VAEs, and parameter-efficient fine-tuning (LoRA, QLoRA, PEFT) to optimize model performance.
- Implemented retrieval-augmented generation (RAG) with vector databases (FAISS, Pinecone, Weaviate, ChromaDB) for enterprise AI solutions.
- Hands-on experience in Python-based deep learning using TensorFlow, PyTorch, and Hugging Face Transformers.
- Developed and optimized NLP models (BERT, GPT, T5, RoBERTa) with advanced tokenization techniques (Word2Vec, FastText, BPE, Sentence Transformers).
- Built AI-powered applications for chatbots, document summarization, and text analytics using NLTK, SpaCy, and OpenAI APIs.
- Deployed AI/ML models on AWS, Azure, and GCP using SageMaker, Vertex AI, and Azure ML.
- Expertise in serverless AI architectures (AWS Lambda, Cloud Run).
- Skilled in distributed model training with PySpark and Databricks, processing large-scale datasets for scalable AI solutions.
- Built and optimized AI-driven data pipelines leveraging Apache Airflow, Kafka, and Snowflake, ensuring seamless data processing and real-time inference.
- Strong experience in processing large-scale structured and unstructured data with Python, PySpark, Snowflake, BigQuery, and Redshift.
- Developed feature stores (Feast, Databricks Feature Store) to enhance real-time model inference and retraining workflows.
- Designed and automated end-to-end MLOps pipelines using GitHub Actions, MLflow, and Terraform, ensuring continuous integration and deployment.
- Experience in containerized deployments with Docker, Kubernetes, and Helm, optimizing model serving with KServe, TensorFlow Serving, and Triton Inference Server.
- Utilized MLflow, Kubeflow, and TensorFlow Extended (TFX) for automated model versioning, tracking, and lifecycle management, ensuring continuous improvement and governance.
- Integrated AWS SageMaker Model Monitor, GCP Vertex AI, and Azure ML to automate model drift detection, performance evaluation, and anomaly detection.
- Built HIPAA- and FHIR-compliant AI/ML models for secure healthcare data processing and EHR integration; optimized predictive analytics to improve patient outcome forecasting.
- Proficient in API development using FastAPI, Flask, and Uvicorn, delivering asynchronous AI inference services and scalable API solutions.
- Proficient in evaluation metrics such as precision, recall, F1-score, and AUC-ROC, achieving 95%+ accuracy and 90% precision in classification and regression tasks.
Skills:
Programming & Scripting Languages: Python, PySpark, Java, SQL, Bash, Perl, YAML, Groovy
Machine Learning: Supervised Learning, Unsupervised Learning, Feature Engineering, Model Optimization, Time Series Analysis, Recommendation Systems, Sentiment Analysis, Evaluation Metrics
Deep Learning: TensorFlow, PyTorch, Keras, CNNs, RNNs, LSTMs, GANs, Transformer Models, ANN, Transfer Learning, Ensemble Models, Computer Vision
Large Language Models (LLMs): GPT, BERT, LLaMA, Falcon, Hugging Face Transformers, LoRA, DeepSpeed
Retrieval-Augmented Generation (RAG): FAISS, ChromaDB, Pinecone, Weaviate, LlamaIndex
Generative AI: Fine-tuning LLMs, LangChain, LlamaIndex, RAG, AI Agents, CrewAI, AutoGen, Prompt Engineering, Hugging Face, Phi, OpenAI, LLaMA, FAISS
Big Data & Distributed Computing: HDFS, MapReduce, PySpark
Model Serving: FastAPI, Flask, Streamlit, KServe, TensorFlow Serving, TorchServe, Cloud Run, Vertex AI Endpoints, SageMaker Endpoints, Azure ML Endpoints
Cloud Platforms: AWS (SageMaker, Bedrock, Lambda, Lex, CloudWatch, CloudTrail, Redshift ML, DynamoDB, CodeBuild, CodeDeploy, S3, EC2, IAM, AMIs); GCP (Vertex AI, AutoML, Cloud Vision API, Dialogflow, NVIDIA GPUs, BigQuery ML, VM Instances, VPC); Azure (Azure ML, Azure AI & OpenAI, Blob Storage, Azure Functions, Azure Cognitive Services)
MLOps & CI/CD: GitHub Actions, GitLab CI/CD, Bitbucket Pipelines, Jenkins, CircleCI, Cloud Build, CodePipeline, Azure Pipelines, MLflow, Kubeflow, DVC, DagsHub
Monitoring & Logging: AWS CloudWatch, GCP Cloud Monitoring, Azure Monitor, CloudTrail Logs, Prometheus, Grafana, MLflow, Weights & Biases, TensorBoard, Evidently AI, Vertex AI Model Monitoring, Cloud Logging
MLOps & Workflow Orchestration: Kubeflow, MLflow, Airflow, TFX Pipelines, Vertex AI Pipelines
Frameworks: PySpark, TensorFlow, Keras, PyTorch, Scikit-Learn, OpenCV, NLTK, Pandas, Transformers, Flask, Celery, FastAPI, Streamlit, Gradio, Pickle, Pydantic, Anaconda, Jupyter Notebook
Version Control: Git, GitHub, Azure Repos, AWS CodeCommit
Data Warehousing: Snowflake, Data Lake Storage, Google BigQuery
Container Orchestration & Infrastructure as Code: Docker, Kubernetes, Terraform
Operating Systems: Windows, Ubuntu, Linux
Certifications:
Azure Certification
Professional Experience:
Client: Motive MI, SFO, CA (Remote) | Jun 2023 - Jan 2025
Role: Senior AI/ML GenAI Engineer
Responsibilities:
- Developed a Generative AI-powered diagnostic assistant using AWS Bedrock and fine-tuned LLMs (GPT, Falcon, T5) on medical datasets, enabling real-time patient symptom analysis and personalized treatment recommendations.
- Implemented Retrieval-Augmented Generation (RAG) using FAISS and AWS OpenSearch to enhance the accuracy of AI-generated medical insights by integrating real-time EHR and clinical trial data.
- Utilized deep learning and NLP (Transformer models such as BERT and T5) with SpaCy for clinical text processing, extracting vital health indicators from unstructured medical documents.
- Designed and deployed custom LLM APIs using TensorFlow Serving and PyTorch with FastAPI, ensuring low-latency inference for real-time medical queries.
- Developed feature engineering pipelines using PySpark and Pandas, optimizing structured and unstructured medical data ingestion from AWS S3 and Snowflake for efficient training.
- Built secure and scalable model deployments using SageMaker Endpoints and Vertex AI Endpoints, containerized with Docker and Kubernetes, ensuring HIPAA-compliant inferencing.
- Integrated AWS CloudWatch, Prometheus, and Grafana for real-time monitoring of GenAI model performance, drift detection, and automatic retraining triggers.
- Built and deployed AI models using AWS SageMaker and AWS Bedrock for high-performance inference.
- Applied TF-IDF, BERT embeddings, and PCA for text feature extraction and dimensionality reduction.
- Developed AI/ML models in Python, leveraging TensorFlow, PyTorch, and NLP libraries for secure healthcare data processing and predictive analytics.
- Optimized data extraction and transformation using SQL, ensuring seamless integration with EHR systems and efficient querying of healthcare records.
- Preprocessed medical images using OpenCV, PIL, and wavelet transforms for noise reduction and augmentation.
- Implemented Named Entity Recognition (NER) using SpaCy and BioBERT for extracting symptoms and medical entities.
- Fine-tuned GPT-4, LLaMA, and BioBERT on AWS Bedrock for clinical text analysis and AI-driven diagnosis.
- Built CNNs, Vision Transformers, and LSTMs for medical image classification and patient risk prediction.
- Optimized model performance using LoRA, quantization, and TensorRT, reducing inference latency by 30%.
- Achieved 95%+ accuracy in disease classification through hyperparameter tuning and cross-validation.
- Deployed models on AWS SageMaker with MLflow for versioning, tracking, and experiment logging.
- Implemented asynchronous API processing using Kafka and Celery to handle large-scale medical queries.
- Monitored data drift, model bias, and performance metrics using AWS SageMaker Model Monitor and MLflow.
- Automated model retraining and deployment using AWS Lambda and SageMaker Pipelines, ensuring continuous improvement.
- Ensured HIPAA and GDPR compliance using AWS KMS encryption, IAM roles, and secure API authentication.
Client: OPTUM, Albany, NY | Dec 2021 - May 2023
Role: ML Engineer
Responsibilities:
- Developed an LLM-powered Virtual Health Assistant for real-time symptom analysis, medical Q&A, and personalized treatment suggestions.
- Leveraged Azure OpenAI, Azure ML, and Speech AI to create an interactive multimodal AI system supporting text, voice, and medical imaging.
- Ingested structured EHR data from Azure Data Factory and Snowflake, alongside unstructured text, voice, and radiology images.
- Preprocessed medical speech using Azure Speech Services for transcription and intent recognition.
- Extracted clinical entities using Med-BERT, UMLS embeddings, and Named Entity Recognition (NER).
- Applied self-supervised learning on medical text and speech data, improving intent recognition accuracy by 30%.
- Implemented dimensionality reduction (PCA, t-SNE) for optimizing large-scale patient history datasets.
- Fine-tuned GPT-4 and BioGPT using LoRA and QLoRA on Azure OpenAI for symptom analysis and medical chatbot responses.
- Developed Transformer-based models (Vision Transformer, Swin Transformer) for radiology image classification.
- Integrated attention mechanisms and reinforcement learning from human feedback (RLHF) to improve virtual assistant recommendations.
- Used contrastive learning to improve text-to-image alignment for medical diagnostics.
- Optimized inference with ONNX, TensorRT, and DeepSpeed, reducing latency by 40%.
- Built real-time APIs using TensorFlow Serving and TorchServe, exposing AI predictions via FastAPI.
- Developed a voice-to-text API with Azure Cognitive Services, enabling real-time medical Q&A.
- Integrated Azure Event Grid for real-time notifications and streaming of patient health updates.
- Monitored model drift and explainability with SHAP, Azure Monitor, Prometheus, and Grafana.
- Ensured HIPAA, GDPR, and SOC 2 compliance by implementing role-based access control (RBAC) and private networking.
- Encrypted sensitive patient conversations and medical reports using Azure Key Vault and Managed HSM.
- Applied federated learning techniques to train AI models without exposing sensitive patient data.
Client: British American Tobacco, Albany, NY | Mar 2020 - Nov 2021
Role: ML Engineer
Responsibilities:
- Designed and implemented a predictive analytics solution using Azure Machine Learning and Databricks to forecast both cost changes and future sales volumes of tobacco products.
- Built ML models using Linear Regression, Gradient Boosting, SVM, ElasticNet, Random Forest, and K-Means clustering to analyze cost trends, demand patterns, and optimal pricing strategies.
- Developed large-scale data pipelines in PySpark within Databricks Notebooks to process the past three years of historical sales data from Snowflake and predict sales volumes and costs for the next two years.
- Implemented feature engineering and data transformation techniques to extract meaningful insights from sales data, customer trends, and external market factors.
- Integrated Snowflake as a scalable data source for both batch and real-time ingestion, enabling automated data pipelines using Azure Data Factory.
- Trained and optimized ML models to minimize forecasting errors and improve cost efficiency and sales planning accuracy by leveraging Azure ML AutoML and hyperparameter tuning.
- Deployed predictive models as endpoints in Azure ML and exposed them via FastAPI and Azure Functions for real-time inference and integration with business applications.
- Implemented CI/CD pipelines with Azure DevOps, GitHub Actions, and MLflow to automate model retraining, deployment, and versioning for continuous improvement.
- Monitored model performance, accuracy, and drift using Azure Monitor, MLflow, Prometheus, and Grafana, ensuring reliability in sales and cost predictions.
- Developed dashboards and visualizations in Power BI to provide stakeholders with insights into future sales forecasts, pricing strategies, and demand planning.
- Reduced forecasting errors by 25% and improved sales planning efficiency, enabling optimized production and pricing decisions for British American Tobacco.
Client: Maveric Systems, Chennai, India | Apr 2018 - Feb 2020
Role: DevOps Engineer
Responsibilities:
- Collaborated with cross-functional teams to identify requirements for automating the machine learning lifecycle, focusing on scalable model deployment and monitoring using Azure ML Studio and Azure Kubernetes Service (AKS).
- Assessed infrastructure requirements and designed solutions leveraging Azure DevOps, Terraform, and Docker to ensure scalable, efficient workflows.
- Designed a modular architecture for end-to-end machine learning workflows using Azure Databricks, Azure ML Studio, and Azure ML Pipelines, ensuring reusability and scalability.
- Developed infrastructure-as-code (IaC) templates using Terraform and ARM Templates for provisioning cloud resources, including AKS, ACI, and Azure Redis Cache.
- Architected graph-based data models in Neo4j for efficient storage and querying of complex relationships in large-scale datasets.
- Built ETL pipelines using Azure Databricks to preprocess and transform raw data from Snowflake and other sources, ensuring high-quality datasets for training and validation.
- Automated data validation workflows with Azure ML Pipelines, integrating feature engineering and anomaly detection processes.
- Utilized Gremlin to query graph databases in Azure Cosmos DB, enabling dynamic feature extraction for predictive modeling.
- Designed and deployed scalable Neo4j graph database solutions on Azure Kubernetes Service (AKS) to manage and query complex data relationships, ensuring high availability and fault tolerance.
- Developed machine learning models using Azure ML Studio, leveraging its experimentation and model registry capabilities to manage iterative improvements.
- Trained and optimized models using Azure Databricks, integrating MLflow for experiment tracking, hyperparameter tuning, and artifact management.
- Integrated TensorFlow and PyTorch frameworks for model development, ensuring compatibility with diverse AI workloads.
- Designed custom evaluation metrics to validate model performance, integrating tests into CI/CD pipelines using Azure DevOps and GitHub Actions.
- Ensured reproducibility and traceability by implementing data and model versioning practices in Azure ML Studio.
- Evaluated feature relationships with graph-based models in Neo4j, enabling insights into complex interactions within health insurance datasets.
- Deployed containerized machine learning models on AKS, ACI, and Azure App Services, optimizing resource allocation for low-latency predictions.
- Packaged models with Docker and published containers to Azure Container Registry (ACR), ensuring seamless integration with CI/CD workflows.
- Automated real-time inference endpoints using Azure ML Endpoints, providing scalable, production-ready solutions.
- Monitored deployed models using Azure Monitor and Azure ML Model Monitoring, detecting data drift and performance degradation in production environments.
- Configured Azure Redis Cache to cache frequently queried data, reducing latency and improving system responsiveness.
- Maintained CI/CD pipelines for Dash applications and ML models, ensuring consistent delivery across staging, testing, and production environments.
- Authored detailed documentation outlining pipeline designs, deployment strategies, and model monitoring practices, enhancing team knowledge and reproducibility.
- Conducted workshops and training sessions on Azure ML Pipelines, Terraform, and Kubeflow to streamline MLOps processes across teams.
Client: ACE Technologies, Hyderabad, India | Jul 2015 - Mar 2018
Role: DevOps Engineer
Responsibilities:
- Installed, configured, and maintained Jenkins to facilitate continuous integration (CI) and end-to-end automation for all build and deployment processes.
- Documented software release management procedures, capturing input decisions and maintaining clarity in project processes.
- Maintained Linux environments for deployments, including Ubuntu, Debian, Red Hat, and CentOS, ensuring compatibility and support for various applications.
- Integrated SageMaker with AWS Lambda, S3, and Step Functions to build automated and serverless machine learning pipelines.
- Configured production administration, monitoring, and reporting using Puppet, designing and implementing Puppet templates for efficiency.
- Developed and maintained Maven and shell scripts to automatically compile, package, deploy, and test J2EE applications across various WebLogic platforms.
- Managed routine system backups and scheduled jobs, including enabling and disabling cron jobs, system logging, and performance tuning.
- Participated in configuring and monitoring distributed servers using Puppet, employing the Puppet server and workstations for effective node management.
- Applied extensive knowledge of configuration management policies and practices throughout the SDLC, focusing on automation through Bash/shell and Perl scripting.
- Created build and deployment scripts using Ant and Maven in Jenkins, facilitating transitions between environments and creating new jobs and branches.
- Deployed and configured automation tools such as Puppet and Chef for effective configuration management.
- Configured Puppet master servers and installed Puppet client software on Linux servers to enhance infrastructure management.
- Deployed Puppet, Puppet Dashboard, and PuppetDB for configuration management within the existing infrastructure.
- Created and updated Bash scripts, modules, files, and packages to support automation and efficiency in system operations.
- Implemented CI/CD processes using Jenkins and shell scripts, automating routine jobs to enhance deployment workflows.
Education:
Bachelor's in Computer Science