PREETHI AMBATI
Data Engineer
(689) 208-7912 | [email protected]
Orlando, Florida, USA | Relocation: opt | Visa: OPT
PROFESSIONAL SUMMARY:
Skilled Data Engineer with around 4 years of experience designing, developing, and maintaining data infrastructure. Proficient in data warehousing, ETL processes, and big data technologies. Adept at collaborating with cross-functional teams to deliver data-driven solutions, and experienced in leveraging cloud platforms, data lakes, and business intelligence tools to design and implement robust data models.

EDUCATION
Webster University - Master of Science in Information Technology and Management
JNTU - Bachelor's in Electronics and Communication Engineering

TECHNICAL SKILLS
Programming Languages: Python, SQL, Java
Big Data Technologies: Apache Kafka, Apache Spark, Hadoop
Data Warehousing: AWS Redshift, Snowflake
ETL Tools: Apache Airflow, Talend
Data Visualization: Power BI, Tableau
Cloud Platforms: AWS, Azure
Version Control: Git

EXPERIENCE:

MYRIAD GENETICS, Salt Lake City, UT
Data Engineer and Analyst | Dec 2022 - Present

Responsibilities:
- Took initiative and ownership as a Data Engineer to deliver business solutions on time.
- Created high-level technical design documents and application design documents per requirements, delivering clear, well-communicated, and complete designs.
- Implemented Azure Data Factory (ADF) extensively to ingest relational and unstructured data from different source systems to meet business functional requirements.
- Designed and developed batch and real-time processing solutions using ADF, Databricks clusters, and Stream Analytics.
- Created pipelines in Azure Data Factory v2 to pull data from disparate source systems using activities such as Move & Transform, Copy, Filter, ForEach, and Databricks notebooks (a minimal job sketch follows this section).
- Maintained and supported optimized pipelines, data flows, and complex data transformations and manipulations using ADF and PySpark on Databricks.
- Automated jobs in ADF using event, schedule, and tumbling-window triggers.
- Created and provisioned Databricks clusters, notebooks, and jobs with autoscaling.
- Built Databricks Spark jobs in PySpark to perform table-to-table operations.
- Performed data-flow transformations using the Data Flow activity.
- Implemented both Azure and self-hosted integration runtimes in ADF.
- Developed Spark code in Python in Databricks notebooks.
- Improved performance by optimizing cluster runtime to cut the compute time needed to process streaming data.
- Developed Power BI reports using Power Query, fed from SQL Server and other data sources.
- Ensured data accuracy, integrity, and reliability in both back-end data and Power BI reports.
- Created linked services to connect external resources to ADF.
- Wrote complex SQL queries including joins, correlated subqueries, scalar subqueries, views, stored procedures, and triggers.
- Ensured developed solutions were formally documented and signed off by the business.
- Worked with team members on troubleshooting, resolving technical issues, and identifying and managing project risks and issues.
- Worked on cost estimation, billing, and implementation of services on the cloud.

Environment: Azure Cloud, Azure Data Factory (ADF v2), Azure Functions, Azure Data Lake, Blob Storage, SQL Server, Windows Remote Desktop, Azure PowerShell, Databricks, Python, Azure SQL Server, Azure SQL Data Warehouse.
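Illustrative only: a minimal PySpark sketch of the kind of table-to-table Databricks job described above, of the sort an ADF pipeline would invoke through a notebook activity. All table and column names (raw.orders, curated.orders_daily, etc.) are hypothetical placeholders, not actual Myriad Genetics schemas.

# Minimal PySpark sketch of a table-to-table Databricks job of the kind
# orchestrated from an ADF pipeline. All names are hypothetical examples.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("curate_orders").getOrCreate()

# Read a raw source table (hypothetical name).
raw = spark.table("raw.orders")

# Typical transformations: filter bad rows, derive columns, aggregate.
curated = (
    raw.filter(F.col("order_status").isNotNull())
       .withColumn("order_date", F.to_date("order_ts"))
       .groupBy("order_date", "region")
       .agg(F.sum("amount").alias("total_amount"),
            F.countDistinct("order_id").alias("order_count"))
)

# Write the result back as a managed table, overwriting each run.
curated.write.mode("overwrite").saveAsTable("curated.orders_daily")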
AON CONSULTING LTD, Hyderabad, India
Jr. Data Engineer | April 2021 - July 2022

Responsibilities:
- Worked as a Data Analyst: analyzed data quality issues against source systems and prepared a data quality document confirming the quality of all source data.
- Worked with the application and Business Analyst teams to develop requirements.
- Performed data profiling and data analysis based on the requirements, which helped catch many sourcing issues upfront.
- Used SQL tools to run queries and validate the data loaded into target tables.
- Created tables, views, sequences, indexes, and constraints, and generated SQL scripts to implement the physical data model.
- Used AWS EC2 and S3 for processing and storing small data sets.
- Designed and developed the architecture for a data services ecosystem spanning relational, NoSQL, and big data technologies.
- Performed extensive data validation by writing several complex SQL queries; involved in back-end testing and resolving data quality issues.
- Created a dimensional model for the reporting system by identifying the required dimensions and facts using Erwin.
- Extracted data from various sources such as XML and flat files into DataStage.
- Consumed XML messages from Kafka and processed them with Spark Streaming to capture UI updates (see the streaming sketch after this section).
- Evaluated data profiling, cleansing, integration, and extraction tools.
- Responsible for data architecture design delivery, data model development, review, approval, and data warehouse implementation.
- Designed and developed data models and data marts supporting the business intelligence data warehouse.
- Created SQL tables with referential integrity constraints and developed queries using SQL, SQL*Plus, and PL/SQL.
- Performed gap analysis of the current state versus the desired state and documented requirements to address the gaps identified.
- Managed indexing, debugging, and query optimization techniques for performance tuning using T-SQL.
- Generated comprehensive analytical reports by running SQL queries against current databases for data analysis.
- Wrote T-SQL statements for data retrieval and tuned the performance of T-SQL queries and stored procedures.
- Developed a pre-processing job using Spark DataFrames to flatten JSON documents into flat files (see the flattening sketch after this section).
- Developed dashboard reports in Tableau.

Environment: SQL, PL/SQL, SAS, Microsoft Excel 2010, T-SQL, triggers, stored procedures.
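Illustrative only: a minimal Spark Structured Streaming sketch of consuming XML messages from Kafka, as described in the role above. The broker address, topic name, and XML field names are hypothetical, and the Kafka source assumes the spark-sql-kafka connector is on the classpath.

# Minimal sketch of consuming XML messages from Kafka with Spark Structured
# Streaming. Broker, topic, and field names are hypothetical examples.
import xml.etree.ElementTree as ET
from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql.types import StringType

spark = SparkSession.builder.appName("xml_stream").getOrCreate()

# Read the raw Kafka stream; the 'value' column arrives as bytes.
stream = (spark.readStream
               .format("kafka")
               .option("kafka.bootstrap.servers", "broker:9092")  # hypothetical
               .option("subscribe", "ui-updates")                 # hypothetical
               .load())

def extract_field(xml_text, tag):
    """Pull one element's text out of an XML payload; None if absent."""
    try:
        node = ET.fromstring(xml_text).find(tag)
        return node.text if node is not None else None
    except ET.ParseError:
        return None

extract = F.udf(extract_field, StringType())

# Decode the payload and lift a couple of (hypothetical) fields to columns.
updates = (stream
           .select(F.col("value").cast("string").alias("xml"))
           .withColumn("event_id", extract("xml", F.lit("eventId")))
           .withColumn("status", extract("xml", F.lit("status"))))

# Write parsed updates out; the console sink is for illustration only.
query = (updates.writeStream
                .format("console")
                .outputMode("append")
                .start())
query.awaitTermination()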
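Illustrative only: a minimal sketch of a Spark DataFrame pre-processing job that flattens nested JSON documents into a flat file, as described in the role above. The input/output paths and field names are hypothetical placeholders.

# Minimal sketch of flattening nested JSON into a flat CSV with Spark
# DataFrames. Paths and field names are hypothetical examples.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("flatten_json").getOrCreate()

# Nested input, e.g. {"id":1,"customer":{"name":...,"city":...},"items":[...]}
docs = spark.read.json("s3://bucket/input/docs/")  # hypothetical path

flat = (docs
        # Lift nested struct fields to top-level columns.
        .withColumn("customer_name", F.col("customer.name"))
        .withColumn("customer_city", F.col("customer.city"))
        # Explode the array so each element becomes its own row.
        .withColumn("item", F.explode("items"))
        .select("id", "customer_name", "customer_city",
                F.col("item.sku").alias("item_sku"),
                F.col("item.qty").alias("item_qty")))

# Write the flattened result as a single flat file (CSV with header).
(flat.coalesce(1)
     .write.mode("overwrite")
     .option("header", True)
     .csv("s3://bucket/output/docs_flat/"))  # hypothetical path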