Rahul - Data Scientist
[email protected]
Location: Buffalo, New York, USA
Relocation: Open
Visa: OPT
Name: Rahul Contact no: 703-745-8917

Data Scientist with 6 years of overall experience specializing in machine learning and artificial intelligence across various industries.
I hold a master's degree in Computer Science and Engineering from the University at Buffalo and a BTech in Electronics Engineering from IIT (BHU) Varanasi.
My expertise extends to developing and deploying scalable AI solutions, with a significant focus on generative AI, natural language processing, anomaly detection systems, and particularly large language models (LLMs).
I have demonstrated the ability to lead cross-functional teams in designing and implementing comprehensive systems that enhance operational efficiency and decision-making processes.
With multiple patents granted and work published at peer-reviewed conferences, my career reflects a deep commitment to advancing AI research and its practical applications.
I am adept at leveraging advanced AI technologies, including LLMs, to deliver impactful solutions that drive business growth and innovation.
Currently, I am seeking opportunities to bring my technical leadership and strategic insights to a forward-thinking organization.
EDUCATION
University at Buffalo
Buffalo, New York, US
Master of Science in Computer Science and Engineering
Aug. 2022 - Dec. 2023
Coursework: NLP, Computational Linguistics, Pattern Recognition, Machine Learning, Algorithms Design
GPA: 3.82 / 4
Indian Institute of Technology (Banaras Hindu University), Varanasi
Uttar Pradesh, India
Bachelor of Technology in Electronics Engineering
Jul. 2014 - May 2018

SKILLS
Languages :
Python, Java, C++, C, PLSQL, SQL, MongoDB
Technologies :
Retrieval Augmented Generation (RAG), Large Language Models (LLMs), Natural Language Processing (NLP), Machine Learning, Deep Learning, Web Services, Data Structures, Algorithms, Prompt Engineering, Indexing, Quantization
Cloud :
Amazon Web Services (AWS), Azure, Google Cloud Platform (GCP), Oracle, Atlas, OpenAI
Frameworks and Libraries :
PyTorch, TensorFlow, Apache Spark, AWS SageMaker, Azure MLOps, Keras, MLflow, Kafka, Sentence-Transformers (embeddings, re-rankers), Databricks, Snowflake, PySpark, LangChain, LlamaIndex, Vector DBs (Chroma DB, Faiss, Elasticsearch), Git, Jenkins, Big Data, Hadoop, Flask, Docker, Kubernetes, Pandas, Streamlit, Flask-RESTful, FastAPI, YOLO, Django, XGBoost, GAN, ActiveMQ, Spring Boot, Dask, spaCy, NLTK, Knowledge Graph

EXPERIENCE
Hilabs
Washington, D.C.
Feb. 2024 - May 2024
Senior Data Scientist
GenAI Driven Contract Analyzer: Streamlining Claim Processing
Skills: LayoutLM, OCR, AWS Cloud, Hugging Face, Python, OpenSearch, LLM, Mistral AI, AWS SQS, Private Data, Image Processing, AWS EKS, Docker, Generative AI, Retrieval, LangChain, POC, Claim Processing, Insurance
Led development of scalable end-to-end system for processing contract documents of insurance providers, emphasizing entity extraction to facilitate pricing configuration in claim processing applications.
Engineered custom document processing pipeline with LayoutLM, extracting entities from images, tables, and text elements.
Developed Langchain service for indexing extracted elements and managed metadata within OpenSearch to optimize retrieval.
Implemented a robust infrastructure leveraging a self-hosted, fine-tuned Mistral AI LLM and asynchronous processing via SQS to handle requests efficiently while protecting the security and privacy of sensitive legal contracts.
Delivered successful POCs followed by integration for 3 clients, validating the solution's effectiveness and potential for widespread adoption.

Apexanalytix
Remote, New York
May 2023 - Feb. 2024
Data Scientist | ML Engineer Intern
Generative Knowledge Specific Chatbot
Skills: Hugging Face, Python, Azure Cloud, Azure DevOps, Azure OpenAI, Generative AI, Retrieval, LangChain, ChromaDB, Retrieval Augmented Generation (RAG), Production, LLM, Chatbot, Re-Rankers
Developed advanced Retrieval Augmented Generation (RAG) chatbot with LLM for intelligent knowledge access across teams.
Engineered efficient retrieval pipeline with parent-child document indexing using LangChain and Chroma vector DB.
Designed agile RAG pipeline, integrating MMR scoring for chunk retrieval and Azure OpenAI LLM for response generation.
Implemented user feedback collection to monitor chatbot performance and gather data for iterative refinement and fine-tuning.
Achieved 89% approval in human evaluations and integrated technology into 12 internal and 18 external client applications.


Oracle
Bengaluru, India
Sept. 2020 - Aug. 2022
Senior Application Engineer(ML)
Preemptive Anomaly Prediction in Corporate Billing
Skills: Oracle Cloud, Oracle Financial Services, PLSQL, SQL, Anomaly Detection, US Patent, Big Data, Parallel Processing
Implemented in-memory multivariate anomaly prediction system for corporate billing, addressing monthly billing challenges.
Leveraged Oracle in-database ML for semi-supervised classification with both local and global model explainability.
Optimized service with indexing and parallelism, processing 1.2M bills and 5M segments in 20 minutes with 92% precision.
Integrated services with Oracle Revenue Management and Billing (ORMB) product, USPTO patent granted [US17/710745]

Wipro Limited
Bengaluru, India
June 2018 - Sept. 2020
Project Engineer(AI)
Chatbot Services for Employee Helpline Portal's Ticketing System
Skills: PyTorch, Hugging Face, AWS Cloud, Python, Retrieval, BERT, FAISS, Chatbot, User Experience, Automation, Semantic Search
Developed Employee Helpline Portal Chatbot with effective retrieval of historical ticket resolutions for enhanced user support.
Implemented query intent classification and BERT-powered semantic search to deliver precise responses.
Successfully integrated the chatbot into the portal and achieved the target human agent intervention reduction of 70%.
Achieved 79% accuracy score in evaluation and decreased wait times from 18 to 4 minutes, improving overall user experience.

PATENTS and PUBLICATIONS
Paper Published: Virtual Conversation with Real-Time Prediction of Body Moments/Gestures ICMLIP 2019 [Link]
US Patent Granted: Method And System For Multimodal Analysis Based Emotion Recognition US16/795840 [Link]
US Patent Granted: Anomaly Detection for Bill Generation US17/710745 [Link]
US Patent Granted: Technology System For Assisting Financial Institutions In Debt Collection US17/659017 [Link]

PROJECTS

Generative Empathetic Chatbot (BabbleGo) [code] [report] [slides]
Skills: PyTorch, Hugging Face, Python, OpenAI, Generative AI, Retrieval, Haystack, Elasticsearch, Jupyter Notebook
Deployed the application as web service hosted on Streamlit, utilizing vectorstore on ElasticSearch for efficient storage and retrieval of conversation data.
Implemented versatile RAG chatbot capable of delivering information and engaging in emotion-aware casual conversations.
Developed an intelligent dialog management system for effective user interaby 35% while maintaining resource efficiency.
Network-based Intrusion Detection System (NIDS) [code] [results]
Skills: PyTorch, TensorFlow, Data Analysis, Deep Learning, Python, Flask
Research Assistant, Dr. Hongxin Hu
Engineered an Intrusion Detection System using deep neural detectors to efficiently identify and respond to potential security threats in the network, mitigating the risk of data loss and downtime.
Conducted a comprehensive analysis of 12 network attack datasets with 4 deep neural detectors, reporting insights into the effectiveness and limitations of each method.

LLM-Powered SQL DB agent [code] [slides]
Skills: LangChain, OpenAI, LLM, Python, SQL, APIs, SQLAlchemy
Designed and implemented an LLM-powered SQL database agent, enabling intuitive natural language interactions with SQL databases.
Integrated LangChain to extract table descriptions and contextual information directly from SQL databases, then used this context to improve the OpenAI model's query generation.
Introduced a query execution layer within the agent that runs the generated SQL and handles database validation errors to ensure accurate and reliable query results.
Text Chat Application [Code] [Live Demo]
Skills: C++, Networking, Linux, Socket Programming, Protocols, Chat Application, Server, Client
Developed a client-server chat application following the conventional client-server model, providing the capability for numerous clients to log in, establish their identity, and communicate with each other via the central server.
Utilized socket programming to establish robust connections between the server and clients, ensuring reliable message transmission. Implemented a buffering system to manage message storage and retrieval, particularly for clients who were offline at the time of message receipt.
Conducted comprehensive testing of the application in a live production environment to assess its accuracy and efficiency, guaranteeing that it met the performance requirements and functioned reliably under real-world conditions.
Intelligent Search for Offers [Code] [Live Demo]
Skills: Streamlit, Sentence Transformers, Rerankers, Python, Pandas
Independently designed and developed the Offers Search Engine, implementing hybrid semantic and exact search techniques for improved offer retrieval.
Spearheaded the integration of search and reranking pipelines, resulting in a 40% more accurate search and 70% reduction in retrieval times for extensive datasets.
Successfully deployed the service with a user-friendly interface using Streamlit, enhancing user engagement and ensuring a seamless experience for offer searches.