Sai Monica - Sr. Data Engineer / AI Engineer (Azure, Databricks, Snowflake, PySpark, LLMs)
[email protected]
Location: Austin, Texas, USA
Relocation: Yes
Visa: H1B
MONICA RAMINENI
Data Engineer | AI Engineer
https://www.linkedin.com/in/monica-ramineni-440911135/ | Austin, TX
PROFESSIONAL SUMMARY
Data Engineer and AI Engineer with 8+ years of experience building scalable, high-performance data pipelines and analytics solutions that drive enterprise decision-making. Skilled in end-to-end ETL/ELT engineering using SSIS, Informatica PowerCenter/IICS, Azure Data Factory, and Azure Databricks, with deep proficiency in SQL, Python, PySpark, Snowflake, SQL Server, and Azure Synapse Analytics across cloud, on-premises, and hybrid environments. Delivers trusted, analytics-ready data products through Power BI, DAX, and Power Query, backed by a strong focus on data quality frameworks, governance, compliance standards, and end-to-end data lineage. On the AI side, currently engineering LLM evaluation pipelines, agentic workflows, and RAG-grounded AI systems at Hireko.ai, spanning multi-agent orchestration, MCP server integrations, A2A Protocol, and AI safety guardrails, and bridging enterprise data engineering with production-grade AI.
CORE COMPETENCIES
Scalable Data Pipeline Design | Data Quality & Governance
ETL/ELT Architecture & Development | Data Warehousing & Cloud Architecture
High-Volume Data Processing | Pipeline Health & Performance Monitoring
Cloud & On-Premises Data Integration | Compliance & Security Controls
Data Modeling & Schema Design | Cross-Team Collaboration & Influence
Analytics & BI Delivery | Power BI Report Development & Optimization
Azure Synapse Analytics | Agentic AI, RAG & Multi-Agent Systems
LLM Evaluation & AI Guardrails | Enterprise Data Integration

TECHNICAL SKILLS
ETL & Data Integration: SSIS, Informatica PowerCenter, Informatica IICS/CDI, Azure Data Factory (ADF), Azure Databricks
Databases & Data Platforms: Snowflake, Azure SQL Database, SQL Server, Azure Data Lake Storage Gen2, Azure Synapse Analytics
Languages: SQL, T-SQL, Python, PySpark, SparkSQL, DAX
BI & Reporting: Power BI, DAX, Power Query, Power Automate, Pipeline Health Dashboards, KPI Monitoring, Microsoft Excel
Orchestration & Streaming: Apache Airflow, Apache Kafka, Control-M, SQL Server Agent
AI & Agent Frameworks: LangChain, LangGraph, LangSmith, OpenAI Agents SDK, MCP, A2A Protocol, OpenAI, LLMs, Prompt Engineering, Generative AI
RAG & Vector DBs: Qdrant, OpenAI Embeddings, Cohere Embeddings, BM25, Multi-Query Retrieval, Ensemble Retrieval
AI Safety & Evaluation: RAGAS-style metrics, LangSmith evaluators, PII detection, jailbreak prevention, guardrails
Tools & Practices: Git, SSMS, Postman, Jupyter, Visual Studio, ALM Toolkit, Agile/Scrum
PROFESSIONAL EXPERIENCE
AI / Data Engineer | Aug 2025 - Present
Hireko.ai | USA
Built an LLM-based candidate assessment pipeline combining semantic similarity, concept coverage, and factual validation, designing end-to-end data flows from grounding through scoring and structured storage in DynamoDB.
Integrated Google Vertex AI Search grounding to generate real-time reference material, improving scoring accuracy and reducing hallucinations in pipeline outputs.
Implemented skill-level scoring by aggregating question-level evaluations and benchmarking against human ratings using RMSE, MAE, and correlation, preparing data for prescriptive modeling use cases.
Applied modular pipeline architecture (grounding, scoring, semantic analysis, coverage checks) with structured logging across all workflow steps for full observability, auditability, and reliability.
Monitored DynamoDB read/write patterns and tuned pipeline for cost and latency efficiency; partnered with stakeholders to iteratively refine data models and scoring logic across multiple skills.
Architected a multi-agent system with specialized agents handling distinct pipeline stages (transcript processing, grounding retrieval, scoring, and orchestration), enabling modular, independently testable workflows with clean separation of concerns and reliable end-to-end assessment execution.
Designed and deployed Power BI dashboards sourcing structured assessment data from DynamoDB, surfacing pipeline usage trends, assessment quality metrics, and scoring performance so stakeholders can track AI system health and make data-driven decisions on model and rubric improvements.
Environment: Python, LLMs, Prompt Engineering, Google Vertex AI Search, AWS DynamoDB, REST APIs, Embeddings, Evaluation Metrics, Power BI, Git
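Illustrative sketch of the benchmarking step described above: a minimal Python example, assuming question-level scores are averaged into one skill-level score per skill and then compared with human ratings on the same scale. The function names, the averaging rule, and the sample values are hypothetical and are not taken from the Hireko.ai pipeline.

```python
import math


def skill_score(question_scores):
    """Aggregate question-level scores into one skill-level score (simple mean, an assumption)."""
    return sum(question_scores) / len(question_scores)


def rmse(model, human):
    """Root mean squared error between model scores and human ratings."""
    return math.sqrt(sum((m - h) ** 2 for m, h in zip(model, human)) / len(human))


def mae(model, human):
    """Mean absolute error between model scores and human ratings."""
    return sum(abs(m - h) for m, h in zip(model, human)) / len(human)


def pearson(model, human):
    """Pearson correlation coefficient between model scores and human ratings."""
    n = len(human)
    mean_m = sum(model) / n
    mean_h = sum(human) / n
    cov = sum((m - mean_m) * (h - mean_h) for m, h in zip(model, human))
    std_m = math.sqrt(sum((m - mean_m) ** 2 for m in model))
    std_h = math.sqrt(sum((h - mean_h) ** 2 for h in human))
    return cov / (std_m * std_h)


# Hypothetical question-level scores for three skills (illustrative values only).
questions_by_skill = {
    "sql": [2.0, 4.0],
    "python": [4.0, 4.0],
    "spark": [4.0, 6.0],
}
model_scores = [skill_score(q) for q in questions_by_skill.values()]

# Hypothetical human ratings for the same three skills, in the same order.
human_ratings = [2.0, 4.0, 6.0]

print("RMSE:", rmse(model_scores, human_ratings))
print("MAE:", mae(model_scores, human_ratings))
print("Pearson r:", pearson(model_scores, human_ratings))
```

Reporting all three metrics together is a common choice: RMSE penalizes large misses, MAE gives the typical error in score units, and correlation shows whether the model ranks candidates the way human raters do even when the absolute scores are offset.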
Data Engineer / Analyst | Nov 2021 - Dec 2024
Farmers Insurance / Accenture | India
Built and delivered scalable Azure-native ETL/ELT pipelines using ADF, Azure Databricks, Azure Synapse Analytics, Azure SQL Database, and ADLS Gen2, processing high-volume data from ingestion through transformation to analytics-ready outputs.
Built scalable PySpark pipelines in Azure Databricks to process high-volume enterprise data, optimized using partitioning, caching, and broadcast joins for consistent performance and reliable downstream delivery.
Managed end-to-end data integration workflows across ingestion, transformation, and publishing layers, maintaining data lineage, transformation logic, and pipeline documentation for full traceability.
Enforced data quality through validation, profiling, reconciliation, and cleansing; established governance standards, stewardship practices, and compliance controls including Row-Level Security and role-based access.
Built Power BI semantic models, KPI dashboards, and pipeline health views with DAX, Power Query, and RLS; automated report refresh and distribution using Power Automate, and implemented Incremental Refresh to reduce refresh times and improve performance.
Conducted ad hoc SQL and Power BI analysis on Snowflake to uncover trends, data anomalies, and operational inefficiencies, translating findings into actionable insights for business stakeholders.
Drove SIT/UAT/Production validation, root cause analysis, and Dev/Test/Prod pipeline promotions using Git and ALM Toolkit within Agile/Scrum delivery.
Environment: Azure Data Factory, Azure Data Lake Storage Gen2, Azure Databricks, Azure Synapse Analytics, Azure SQL Database, PySpark, Snowflake, Power BI, DAX, Power Query, Power Automate, Git, ALM Toolkit, Rally
Data Engineer | May 2018 - Nov 2021
XPO Logistics / Infosys | India
Designed and maintained end-to-end SSIS ETL workflows across heterogeneous sources (flat files, CSV, Excel, and relational databases), ensuring reliable, repeatable data movement with strong source-to-target integrity.
Scheduled, monitored, and troubleshot SSIS packages to ensure timely data delivery; proactively identified and resolved pipeline failures, data anomalies, and performance bottlenecks.
Developed and maintained Informatica PowerCenter assets (mappings, mapplets, sessions, and workflows), leveraging core transformations (Joiner, Lookup, Aggregator, Expression, Router) to implement complex business rules; contributed to the PowerCenter-to-IICS/CDI migration, covering mapping redesign, validation, and production cutover.
Authored and optimized SQL stored procedures, views, and complex queries to support ETL processing, data reconciliation, and analytics-ready datasets.
Orchestrated and monitored ETL jobs using Control-M and SQL Server Agent; performed root cause analysis on failures, implemented corrective actions, and tuned workflows to improve throughput and SLA adherence.
Built Power BI dashboards and reports for operational data analysis; investigated and resolved data discrepancies, traced issues back to source systems, and validated fixes end-to-end.
Applied data validation, profiling, unit testing, and cleansing across pipeline stages to ensure accuracy, completeness, and end-to-end data integrity from source to target; delivered within Agile/Scrum sprints.
Environment: SSIS, Informatica PowerCenter, Informatica IICS/CDI, SQL Server, T-SQL, Control-M, SQL Server Agent, SSMS, Power BI, Git, Agile/Scrum
EDUCATION
Bachelor's Degree in Computer Science and Engineering
Sri Venkateswara University College of Engineering
CERTIFICATIONS & AWARDS
Microsoft Certified: Azure Fundamentals (AZ-900)
Accenture Celebrates Excellence (ACE): The Extra Mile Award (Individual, FY22 Q2)
Accenture Celebrates Excellence (ACE): Shared Success Catalyst Award (Team, FY23 Q1)
Anthropic Academy Certified (17 courses, 2025): Claude, MCP, Agentic AI, Generative AI, Cloud Integrations
AI Makerspace Certified AI Engineer (AIE7, 2025)
KEY PROJECTS
CareBridge Healthcare AI Capstone (AI Makerspace AIE7): AI-powered health assistant delivering evidence-based medical guidance grounded in trusted sources (Mayo Clinic, FDA, CDC, NIH, WHO); features triage indicators, lab report ingestion, privacy-safe architecture (no PHI stored), and an enterprise edition with provider/patient dashboards and auditable point-of-care summaries.
Stack: LangGraph, LLMs, Multi-Agent Systems, Qdrant, Advanced RAG Retrieval | GitHub: https://github.com/Monica-Ramineni/CareBridge