Lavanya Esireddy - Data Engineer
[email protected]
Location: Queens, New York, USA
Relocation: Yes
Visa: H1B
PROFESSIONAL SUMMARY
- Over 9 years of professional experience in information technology as an AWS Data Engineer, with expertise in database development, ETL development, data modelling, report development, and big data technologies.
- Experience in branching, merging, and maintaining versions using tools like Bitbucket; performed and deployed builds for SIT, UAT, and production environments.
- Experience migrating SQL databases to Azure Data Lake (ADL), Azure Data Lake Analytics, Azure SQL Database, Databricks, and Azure SQL Data Warehouse, and controlling and granting database access.
- Experience in OLTP/OLAP system study, analysis, and modelling; developed data warehouse schemas such as Star and Snowflake schemas used in dimensional and multi-dimensional modelling.
- Expertise in object-oriented programming (OOP) concepts in C# (abstraction, encapsulation, inheritance, and polymorphism); used attributes, delegates, generics, collections, multithreading, and exception handling in C#.
- Experience with all phases of the software development lifecycle (SDLC) and project methodologies; performed Tableau type-conversion functions when connected to relational data sources.
- Expert-level understanding of application design, development, and testing in mainframe environments using PL/1, COBOL, EGL, Easytrieve, DB2, JCL, QC, and VAG.
- Expertise in data analysis and development using SQL Server Reporting Services (SSRS), SQL Server Integration Services (SSIS), and SQL Server Analysis Services (SSAS); developed drill-down and drill-through reports from multidimensional objects such as star and snowflake schemas using SSRS and PerformancePoint Server.
- Experienced in automating, configuring, and deploying instances on AWS, Azure, and data center environments; familiar with EC2, CloudWatch, CloudFormation, and managing security groups on AWS.
- Experience in VB, C#.NET, ASP.NET, and ADO.NET; worked on implementation, roll-off, upgrade, and maintenance projects.
- Experience in SQL BI, developing interactive drill-down reports using slicers and dicers in Power BI and Power Pivot; hands-on experience embedding Tableau reports and dashboards in external websites using JavaScript API scripting.
- Experience modelling and manipulating data using Python 3.12/3.11, Alteryx, and Informatica; Tableau reporting and administrative knowledge of ALM workflows, maintenance, and customization.
- Single-handedly built and designed an information-extraction bot POC for KYC extraction; the bot uses adaptive learning techniques and custom supervised classifiers for entity and relation extraction.
- Extensively worked with Python 3.12/3.11 (NumPy 1.26, Pandas, Matplotlib, TensorFlow, NLTK, and scikit-learn); Hadoop/big data experience in storage, querying, processing, and analysis of data.
- Experience building solutions for enterprises, context awareness, pervasive computing, and applications of machine learning.
- Experience with the Requests, ReportLab, NumPy, SciPy, PyTables, cv2, imageio, Python-Twitter, Matplotlib, HTTPLib2, Urllib2, Beautiful Soup, DataFrame, and Pandas Python libraries during the development lifecycle.
- Experience in machine learning and statistics, including linear, logistic, Poisson, and binomial regression, applied to building AI bots that assist or replace humans in various business domains.
- Good experience designing and scheduling complex SSIS 16.0/15.0/14.0 packages for data migration from sources such as MS SQL Server 2022/2019/2017/2016/2014, Oracle Database, and Excel.
- Hands-on experience with data mining algorithms and approaches; comfortable presenting to senior management, business stakeholders, and external partners.
- Strong in algorithms and design techniques, with fluency in modern programming languages such as Java; architecture and design of reusable server components for web as well as mobile applications; strong programming expertise in Python 3.12/3.11 and strong database SQL skills.
- Proficient in Python 3.12, with experience building and productionizing end-to-end systems; solid coding and engineering skills in machine learning; design and develop solutions using C#, ASP.NET Core, Web API, and Microsoft Azure.
- Experience with file systems, server architectures, databases, SQL, and data movement (ETL); knowledge of information extraction and NLP algorithms coupled with deep learning.
- Experience in logical and physical data modelling using ER Studio and Microsoft Visio; proficient in requirement analysis, system design, and object-oriented design (UML).
- Proficient in database performance optimization, debugging, and tuning using Query Analyzer, SQL Profiler, the MS SQL Server 2022/2019/2017/2016/2014 debugger, and data viewers, and in performance tuning of ETL data flows.
- Experience using Jira Service Desk as a ticketing system to resolve end-user tickets and using Jira within an Agile framework to manage team projects.
TECHNICAL SKILLS
ETL: Informatica PowerCenter 10.x/9.6/9.1, AWS Glue, Talend 5.6, SQL Server Integration Services (SSIS)
Databases & Tools: MS SQL Server 2014/2012/2008, Teradata 15/14, Oracle 11g/10g, SQL Assistant, Erwin 8/9, ER Studio
Cloud Environment: AWS Snowflake, AWS RDS, AWS Aurora, Redshift, EC2, EMR, S3, Lambda, Glue, Data Pipeline, Athena, Data Migration Services, SQS, SNS, ELB, VPC, EBS, RDS, Route53, CloudWatch, AWS Auto Scaling, Git, AWS CLI, Jenkins, Microsoft Azure, Google Cloud Platform (GCP)
Reporting Tools: Tableau, Power BI
Big Data Ecosystem: HDFS, MapReduce, Hive/Impala, Pig, Sqoop, HBase, Spark, Scala, Kafka
Programming Languages: Unix Shell Scripting, SQL, PL/SQL, C#, Perl, Python, T-SQL
Data Warehousing & BI: Star schema, Snowflake schema, fact and dimension tables, SAS, SSIS, Splunk
Languages: SQL, PL/SQL, JavaScript, HTML 5, XML, C, T-SQL, Python 3.12/3.11
Operating Systems: Windows, UNIX (HP-UX, Solaris, AIX), Linux
PROFESSIONAL EXPERIENCE
Discovery Financial, Silver Spring, MD    Apr 2022 - Till date
AWS Data Engineer
Responsibilities:
- A highly immersive data science program involving data manipulation and visualization, web scraping, machine learning, Python 3.12 programming, SQL, Git, Unix commands, NoSQL, MongoDB, AWS, and Hadoop 3.3.
- Developing a voice bot using AI (IVR), improving interaction between humans and the virtual assistant across data formats such as JSON and XML, and applying machine learning algorithms in Python 3.12.
- Creating various database objects (tables, indexes, views, stored procedures, and triggers) and implementing referential integrity constraints to enforce data integrity and business rules.
- Advanced proficiency in SQL, including complex query optimization and performance tuning within Snowflake environments.
- Implementing ETL pipelines for ingesting and transforming large volumes of data from various sources into Snowflake, using a combination of Scala, Python scripts, Snowflake SnowSQL, and Snowpipe.
- Experience in deploying and managing data models in cloud environments, ensuring high availability and scalability.
- Writing back-end logic in C# with ASP.NET Core Web API.
- Familiarity with functional programming paradigms and object-oriented programming in Scala.
- Performing T-SQL tuning and optimization of long-running queries using MS SQL Profiler and Database Engine Tuning Advisor.
- In-depth knowledge of Snowflake's architecture and features, with hands-on experience implementing best practices for optimal performance and cost efficiency.
- Built and maintained scalable ETL pipelines using Scala for data ingestion, transformation, and storage.
- Designing and developing Power BI graphical and visualization solutions with Python 3.12 from business requirement documents and plans for creating interactive dashboards.
- Conducting performance tuning and optimization of Snowflake database objects, resulting in improved query performance and reduced latency.
- Developed scalable data solutions for processing terabytes of data daily using Spark with Scala and PySpark.
- Extensively worked on ETL processes to load data from and into Oracle, Teradata 17.2, DB2, flat files, etc.
- Experience in OLTP/OLAP system study, analysis, and modelling; developed data warehouse schemas such as Star and Snowflake schemas used in dimensional and multi-dimensional modelling.
- Successfully designed and implemented data solutions using Snowflake, leveraging its capabilities to manage large-scale data environments.
- Expertise in schema design, normalization, and optimization for Snowflake databases, ensuring efficient data storage and retrieval.
- Designed and developed a scalable data warehouse solution using Snowflake, enabling efficient data storage and retrieval for analytics purposes.
- Involved in designing, developing, and testing the ETL strategy to populate data from various source systems using SSIS 16.0, Java, and AWS.
- Designing streaming data flows and developing StreamSets pipelines to consume JSON data from a Kafka origin and ingest it into the HDFS data lake and Snowflake database of the MD THINK cloud data repository.
- Ensured data solutions are secure, scalable, and compliant with industry standards.
- Performing data cleaning, feature scaling, and feature engineering using the pandas and NumPy packages in Python.
- Architected and implemented end-to-end ETL pipelines using Scala and PySpark to handle large volumes of streaming and batch data.
- Working with tabular, matrix, gauge and chart, parameterized, sub, and ad-hoc reports in a SQL Server Reporting Services environment; SQL BI, SSIS 16.0, SSRS 16.0, and ETL processes.
- Performing data mapping between source and local systems and logical data modelling; created class diagrams and ER diagrams and used SQL queries to filter data.
- Conducted regular performance assessments and implemented improvements based on analysis.
- Involved and worked on Python OpenStack APIs and used several Python libraries such as wxPython and NumPy.
- Reverse-engineered the back-end database to modify T-SQL scripts and create views, stored procedures, triggers, and functions to drastically improve performance.
- Participating in all phases of data mining on AWS: data collection, data cleaning, developing models, validation, and visualization; performed gap analysis.
- Worked collaboratively with cross-functional teams, including data analysts, data scientists, and business stakeholders, to deliver high-quality data solutions.
- Implemented complex data cleaning, transformation, and aggregation tasks using PySpark.
- Implementing Agile methodology for building an internal application.
- Data manipulation and aggregation from different sources using Nexus, Toad, Business Objects, Power BI, and Smart View.
- Expert in using business intelligence tools: Microsoft SSRS 16.0, SSIS 16.0, SSAS 16.0, Visual Studio, and Informatica PowerCenter.
- Developing MS SQL Server 2022 and Oracle 12c databases using tables, triggers, views, packages, and stored procedures in PL/SQL and PostgreSQL; strong RDBMS fundamentals.
- Experienced working in Agile methodology; used data viewers and performance tuning of ETL data flows; involved in creating VB.NET scripts for data flow and error handling using the Script component in SSIS 16.0.
Environment: NumPy 1.26, SSRS 16.0, SSIS 16.0, SSAS 16.0, MDM, Scala, GIT, Unix, C#, Python 3.12, SAS, Hadoop 3.3, Teradata 17.2, NoSQL, OLTP, OLAP, HDFS, ODS, NLTK, SVM, JSON, XML, ER Studio 9.7, Tableau, AWS, PySpark, Snowflake 7, Oracle 12c, MS SQL Server 2022
Humana, Louisville, KY    Nov 2020 - Mar 2022
Data Engineer - Tableau Developer
Responsibilities:
- Independently coded new programs and designed tables to load and test them effectively for the given POCs using Big Data/Hadoop 3.2.
- Experience with cloud databases and data warehouses (SQL Azure and Confidential Redshift/RDS), along with version control tools such as Git, Subversion (SVN), Perforce, TFS, and UCM ClearCase.
- Designed data models and data flow diagrams using Erwin 10 and MS Visio; researched improving the IVR used internally in J&J; performed data cleaning and imputation of missing values.
- Used various Python ETL modules and packages.
- Developed several classes using C# and experienced in creating assemblies and namespaces.
- Worked extensively on modifying and updating existing Oracle 10g code, including object types, views, PL/SQL stored procedures and packages, functions, and triggers based on business requirements.
- Documented the installation process for setup and recovery in the new environment for the technical support team; created MS SQL Server 2019 reports and developed PostgreSQL queries for generating drill-down reports using SSRS 15.0.
- Configured object-level security using system-defined roles and custom roles.
- Worked on the migration process to move SSIS 15.0 packages for SQL Server 2019 from one data center to another.
- Involved in the complete software development life cycle (SDLC) and Azure process, analyzing business requirements and understanding the functional workflow of information from source systems to destination systems.
- Designed and implemented complex SSIS 15.0 packages to migrate data from multiple data sources for data analysis, deploying and dynamically configuring SSIS packages.
- Developed IVR for clinics so that callers can receive anonymous access to test results; worked with the Hadoop ecosystem covering HDFS, HBase, YARN, and MapReduce; handled requests from different departments and locations.
- Utilized Spark, Scala, Hadoop 3.2, HBase, Kafka, Spark Streaming, MLlib, and Python 3.11, along with a broad variety of machine learning methods including classification, regression, dimensionality reduction, etc.
- Created S3 buckets and Snowpipe for data loading to load CSV and JSON.
- Used data joining, blending, and other advanced features in Tableau 8.0 on various data sources such as MySQL tables and flat files; monitored, tuned, and analyzed database performance and allocated server resources.
- Developed Python ETL jobs to migrate data from one database to another.
- Provided input and recommendations on technical issues to BI engineers, business and data analysts, and data scientists.
- Developed, implemented, and maintained conceptual, logical, and physical data models using Erwin 10 for forward/reverse-engineered databases.
- Created and configured roles and access management in Snowflake.
- Developed and presented a data analytics data-hub prototype with the help of the other members of the emerging solutions team.
- Worked with data governance, data quality, data lineage, and data architects to design various models and processes; performed EDA with univariate and bivariate analysis to understand intrinsic and combined effects.
- Created streams, tasks, procedures, stages, and file formats.
- Developed and maintained scalable scheduled jobs to extract, load, and transform data, in addition to troubleshooting and fixing problems with existing processes.
- Developed complex stored procedures and functions and incorporated them in Crystal Reports to enable report generation on the fly (SSRS).
- Empowered decision makers with data analysis dashboards using Tableau 8.0 and Power BI; interfaced with other technology teams to extract, transform, and load (ETL) data from a wide variety of data sources.
- Designed, developed, and configured Snowflake databases.
Environment: QlikView, MLlib, SSAS 15.0, PL/SQL, HDFS, SSIS 15.0, JSON, Hadoop 3.2, C#, MapReduce, Pig, Spark, R Studio, Mahout, Java, Hive, Erwin 10, Tableau 8.0, Oracle 10g, SSRS 15.0, Snowflake, Azure, Oracle, Teradata, Tableau (Desktop/Prep/Server) 2019, MS SQL Server 2019, Python 3.11, MDM
Verizon Communications, New York, NY    Aug 2019 - Oct 2020
Tableau Developer
Responsibilities:
- Developed Python APIs to dump array structures in the processor at the failure point for debugging the breakpoint tool, using Perl and a Java user interface.
- Performed end-to-end Informatica ETL testing for custom tables by writing complex SQL queries on the source database and comparing the results against the target database.
- Worked on Data Pipeline to configure data loads from S3 into Redshift; debugged using breakpoints and data viewers and performed performance tuning of ETL data flows.
- Created various SSIS 14.0 packages to populate data from flat files, Excel, and Access into the ODS (SQL Server); performed full loads for current data and incremental loads for Erwin 9 historical data (transaction-based data).
- Designed SSAS 14.0 tabular models and created dashboards and reports using SQL BI views and deployed them to SharePoint Server 2013.
- Extracted source data from Oracle tables, MS SQL Server 2017, sequential files, and Excel sheets; performed predictive modelling using state-of-the-art methods.
- Migrated Informatica mappings from SQL Server to Netezza.
- Fostered a culture of continuous engineering improvement through mentoring, feedback, and metrics.
- Implemented an Event Task to execute applications automatically; broad knowledge of programming and scripting (especially in R/Java/Python).
- Involved in defining source-to-target data mappings, business rules, and data definitions; performed data profiling on various source systems required for transferring data to ECH.
- Worked on DTS packages and DTS Import/Export for transferring data between MS SQL Server 2017 instances; involved in upgrading DTS packages to SSIS 14.0 packages (ETL).
- Designed a multi-dimensional analysis (configured OLAP cubes, dimensions, measures, and MDX queries); met with business users, gathered business requirements, and prepared the documentation for requirement analysis.
- Created a centralized data warehouse (ODS) and developed de-normalized structures based on the requirements to improve query performance for end users.
- Developed various types of reports (drill-down, drill-through, matrix, and sub reports) using SSRS 14.0; created multiple dashboards with calculations and KPIs using Tableau and deployed them to Tableau Server.
Environment: Python, ODS, OLAP, OLTP, SSAS 14.0, Informatica PowerCenter 8.1, Erwin 9, Teradata 17, SQL Server 2017, Oracle 9i, SSRS 14.0, Mahout, SSIS 14.0, Hadoop 3.0, SQL*Loader, PL/SQL
Tibco Software, India    Dec 2013 - Dec 2018
Software Developer
Responsibilities:
- Involved in defining source-to-target data mappings, business rules, and data definitions.
- Performed forward and reverse engineering for database development and design using Erwin.
- Determined regression model predictors using a correlation matrix for factor analysis in R; built a regression model to understand an order-fulfilment time-lag issue using scikit-learn in Python.
- Managed project requirements, documents, and use cases with IBM Rational RequisitePro; assisted in building an integrated logical data design and proposed a physical database design for building the data mart.
- Documented all data mapping and transformation processes in the functional design documents based on the business requirements; established standards of procedure; generated weekly and monthly asset inventory reports.
- Calculated and analyzed claims data for provider incentive and supplemental benefit analysis using Microsoft Access and Oracle SQL.
- Designed and created distributed reports in multiple formats such as Excel, PDF, XML, HTML, and CSV using SQL Server Reporting Services (SSRS).
- Analyzed business requirements and built logical and physical data models describing all the data and the relationships between the data using Erwin.
- Responsible for defining the key identifiers for each mapping/interface; gathered and reviewed customer information requirements for OLAP and building the data mart; maintained the enterprise metadata library with any changes or updates.
- Analyzed business process workflows and assisted in the development of ETL procedures for mapping data from source to target systems.
- Expertise in applying data mining and optimization techniques with Snowflake in B2B and B2C industries; proficient in machine learning, data/text mining, statistical analysis, and predictive modelling.
- Developed drill-down and drill-through reports from multidimensional objects such as star and snowflake schemas using SSRS and PerformancePoint Server.
- Used SSIS and T-SQL stored procedures to transfer data from OLTP databases to the staging area and finally into data marts; worked in Power BI; performed actions in XML and performance tuning of ETL data flows.
- Created various reports for analysis of revenue, claims, customer interactions, and shipping data using SQL Server Reporting Services and Report Builder.
Environment: Teradata, SQL Server, Hadoop, SAP, ETL operations, OpenStack, Data Warehousing, Data Modelling, Cassandra, EC2, S3, Advanced SQL methods, NiFi, Python, Linux, Scala, Spark-SQL, HBase
EDUCATION
Bachelors in Computer Science, Loyola Academy, India, 2013
Masters in Computer Science, St. Francis College, USA, 2020