Haritha D - Sr. Data Engineer |
[email protected] |
Location: Argonia, Kansas, USA |
Relocation: yes |
Visa: GC |
Haritha
Data Engineer | 423-203-1314 | [email protected]

PROFESSIONAL SUMMARY
IT professional with 8+ years of experience in data engineering, development, maintenance, and testing in RDBMS, data warehouse, and cloud data environments.
Worked on business requirements gathering, analysis, system study, preparation of functional and technical specifications, design (logical and physical models), coding, testing, code migration, implementation, system maintenance, support, and documentation.
Worked with VARIANT data types, parsing and loading XML and JSON data from AWS S3 into Snowflake.
Worked on Snowpipe for continuous data loads and on Streams & Tasks for implementing CDC and running schedules.
Experience with data sharing, stored procedures, and materialized views in Snowflake.
Experience in PL/SQL programming, including stored procedures and functions, and working with cursors, REF cursors, and dynamic SQL.
Designed and developed Azure Data Factory (ADF) pipelines extensively for ingesting data from relational and non-relational source systems to meet business functional requirements.
Designed and developed event-driven architectures using blob triggers and Data Factory.
Created pipelines, data flows, and complex data transformations and manipulations using ADF and PySpark with Databricks.
Automated jobs using different triggers such as event, schedule, and tumbling-window triggers in ADF.
Created and provisioned different Databricks clusters, notebooks, jobs, and autoscaling.
Ingested a huge volume and variety of data from disparate source systems into Azure Data Lake Gen2 using Azure Data Factory V2.
Created several Databricks Spark jobs with PySpark to perform table-to-table operations.
Performed data flow transformations using the data flow activity.
Implemented Azure and self-hosted integration runtimes in ADF.
Wrote complex SQL queries for analyzing data in different databases such as Oracle, Teradata, and SQL Server.
Developed Python scripts for change detection and automated database changes for one of the applications.
Worked on Python scripting for accessing AWS services such as Parameter Store, Secrets Manager, and S3 (a minimal sketch follows this summary).
Experience with shell scripting, creating wrapper scripts for orchestration.
Experience working with Docker and AWS EKS (Elastic Kubernetes Service) and accessing their services.
Expertise in troubleshooting ETL workflows, SQL issues, access issues, and server issues; monitoring batch cycles, analyzing failures, and fixing or escalating problems.
Assisted in data management deliverables such as business needs analysis, data source identification and analysis, and data modeling techniques.
Worked on Jenkins, Bitbucket, GitHub, and Jira.
Experience in implementing scalable designs and software engineering excellence practices.
Assessed and researched current platform implementations and defined the course of action for modernization.
Wrote complex SQL queries and handled projects in an Agile environment.
Solved technical issues related to implementation.
Identified and recommended appropriate continuous improvement opportunities.
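The Python work with AWS services mentioned above generally comes down to a small access layer around boto3. Below is a minimal, illustrative sketch (not code from any of the projects listed here) of how such a script might read a parameter from Parameter Store, a credential from Secrets Manager, and a file from S3; the region, parameter, secret, bucket, and key names are all hypothetical placeholders.

```python
# Minimal illustrative sketch (not project code): reading configuration and
# data from AWS Parameter Store, Secrets Manager, and S3 with boto3.
# All names below (region, parameter, secret, bucket, key) are hypothetical.
import json

import boto3

REGION = "us-east-1"  # placeholder region

ssm = boto3.client("ssm", region_name=REGION)
secrets = boto3.client("secretsmanager", region_name=REGION)
s3 = boto3.client("s3", region_name=REGION)


def load_db_config(param_name: str, secret_id: str) -> dict:
    """Combine a non-secret parameter with a credential stored as a secret."""
    host = ssm.get_parameter(Name=param_name, WithDecryption=True)["Parameter"]["Value"]
    cred = json.loads(secrets.get_secret_value(SecretId=secret_id)["SecretString"])
    return {"host": host, "user": cred["username"], "password": cred["password"]}


def download_source_file(bucket: str, key: str, local_path: str) -> str:
    """Pull a raw source file from S3 for downstream loading."""
    s3.download_file(bucket, key, local_path)
    return local_path


if __name__ == "__main__":
    cfg = load_db_config("/etl/oracle/host", "etl/oracle/credentials")  # placeholder names
    path = download_source_file("raw-landing-bucket", "daily/orders.json", "/tmp/orders.json")
    print(cfg["host"], path)
```

Keeping credentials in Secrets Manager and non-secret configuration in Parameter Store keeps wrapper and orchestration scripts free of hard-coded values.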
TECHNICAL SKILLS
Programming Languages & Scripting: SQL, PL/SQL, JavaScript, Python, Unix Shell Scripting
Databases: Oracle, Teradata, SQL Server, Snowflake
Cloud Technologies: AWS (EC2, S3, IAM, Parameter Store, Secrets Manager, Redshift, Glue, Lambda, CloudWatch); Azure (Data Factory, Data Lake, Event Hub, Stream Analytics, Application Gateway, Storage, Traffic Manager, Active Directory); GCP
Data Engineering & ETL Tools: Azure Data Factory (ADF), Databricks (Spark SQL, PySpark), Apache Spark, Informatica
Containerization & Orchestration: Docker, AWS EKS (Elastic Kubernetes Service)
Version Control & CI/CD: GitHub, Bitbucket, Jenkins
Project Management & Issue Tracking: Jira
Other Tools & Technologies: Control-M (job scheduling), Snowpipe (continuous data loading), Streams & Tasks (CDC implementation), data sharing, procedures, and materialized views in Snowflake
Areas of Expertise: Data Warehousing, Data Lakes, Data Modeling (Logical and Physical), Data Migration (On-premises to Cloud), Event-Driven Architectures, Complex Data Transformations and Data Flow Management, ETL/ELT Processes, Data Quality and Governance, Troubleshooting and Performance Tuning, Agile and Waterfall Methodologies, On-Call Production Support

PROFESSIONAL EXPERIENCE

Client: T-Mobile, Bellevue, WA | APR 2023 - Present
Role: Senior Data Engineer
Responsibilities:
Met with business and user groups to understand business processes; gathered requirements, analyzed, designed, developed, and implemented solutions according to client requirements.
Designed and developed Azure Data Factory (ADF) pipelines extensively for ingesting data from relational and non-relational source systems to meet business functional requirements.
Designed and developed event-driven architectures using blob triggers and Data Factory.
Created pipelines, data flows, and complex data transformations and manipulations using ADF and PySpark with Databricks.
Automated jobs using different triggers such as event, schedule, and tumbling-window triggers in ADF.
Created and provisioned different Databricks clusters, notebooks, jobs, and autoscaling.
Ingested a huge volume and variety of data from disparate source systems into Azure Data Lake Gen2 using Azure Data Factory V2.
Created several Databricks Spark jobs with PySpark to perform table-to-table operations.
Performed data flow transformations using the data flow activity.
Implemented Azure and self-hosted integration runtimes in ADF.
Developed streaming pipelines using Apache Spark with Python.
Created and provisioned multiple Databricks clusters for batch and continuous streaming data processing and installed the required libraries for the clusters.
Improved performance by optimizing compute time for streaming data and saved costs by optimizing cluster run time.
Performed ongoing monitoring, automation, and refinement of data engineering solutions.
Designed and developed a new solution to process near-real-time (NRT) data using Azure Stream Analytics, Azure Event Hub, and Service Bus queues.
Created a linked service to land data from an SFTP location into Azure Data Lake.
Extensively used the SQL Server Import and Export Data tool.
Worked with complex SQL views, stored procedures, triggers, and packages in large databases across various servers.
Experience working with both Agile and Waterfall methods in a fast-paced manner.
Generated alerts on daily event metrics for the product team.
Extensively used SQL queries to verify and validate database updates.
Suggested fixes to complex issues by performing thorough root-cause and impact analysis of defects.
Provided 24/7 on-call production support for various applications, provided resolutions for nighttime production jobs, and attended conference calls with business operations and system managers to resolve issues.

Client: Walmart, Bentonville, Arkansas | MAY 2022 - APR 2023
Role: Data Engineer
Responsibilities:
Designed, deployed, maintained, and led the implementation of cloud solutions using Microsoft Azure and its underlying technologies.
Migrated services from on-premises to Azure cloud environments.
Collaborated with development and QA teams to maintain high-quality deployments.
Deployed initial Azure components such as Azure Virtual Networks, Azure Application Gateway, Azure Storage, and affinity groups.
Conducted capacity planning and architecture of storage accounts.
Configured Azure Traffic Manager to build routing for user traffic.
Infrastructure migrations: drove operational efforts to migrate all legacy services to a fully virtualized infrastructure.
Prepared capacity and architecture plans to create the Azure cloud environment to host migrated IaaS VMs and PaaS role instances for refactored applications and databases.
Installed the Application Insights tool on web services, configured an Application Insights workspace in Azure, and configured Application Insights to perform web tests and alerts.
Built transformations using Databricks, Spark SQL, and Scala/Python, storing results in the ADLS refined area and Azure blobs (a minimal PySpark sketch follows this section).
Implemented HA deployment models with Azure Classic and Azure Resource Manager.
Configured Azure Active Directory and managed users and groups.
Configured continuous integration from source control, set up build definitions within Visual Studio Team Services (VSTS), and configured continuous delivery to automate the deployment of ASP.NET MVC applications to Azure Web Apps.
Deployed DSC from Azure Automation to on-premises and cloud environments.
Installed IIS and ASP.NET modules, configured servers, and added ASP.NET applications via the command line.
Configured Azure Automation DSC configuration management to assign permissions through RBAC, assign nodes to the proper automation accounts and DSC configurations, and alert on any changes made to nodes and their configurations.
Configured Stream Analytics and Event Hubs and worked on managing IoT solutions with Azure.
Used PowerShell DSC to configure VMs and physical hosts, used PowerShell DSC within Azure Automation runbooks to configure machines, and tested and deployed DSC configurations to all servers.
Developed automation solutions using PowerShell scripting for day-to-day operational tasks such as modification of certificates, passwords, and storage accounts on the cloud platform, and setting up and administering service accounts.
Managed deployment workflows, including provisioning and load balancing.
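The Databricks transformation work described in the two roles above typically takes the shape of a PySpark job that reads raw data landed by ADF, applies table-to-table transformations, and writes curated output back to ADLS Gen2. The following is a minimal, illustrative sketch assuming Delta output; the storage account, container, paths, and column names are hypothetical, and a real job would also configure authentication to the storage account.

```python
# Minimal illustrative sketch of a Databricks PySpark table-to-table job that
# writes curated output to ADLS Gen2. The storage account, container, and
# column names are hypothetical; a real job would also configure storage auth.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("orders_curation").getOrCreate()

RAW_PATH = "abfss://raw@examplestorage.dfs.core.windows.net/orders/"                 # placeholder
CURATED_PATH = "abfss://curated@examplestorage.dfs.core.windows.net/orders_daily/"   # placeholder

# Read raw JSON landed by ADF, standardize columns, and aggregate.
raw = spark.read.json(RAW_PATH)
curated = (
    raw.withColumn("order_ts", F.to_timestamp("order_ts"))
       .withColumn("order_date", F.to_date("order_ts"))
       .filter(F.col("status").isNotNull())
       .groupBy("order_date", "status")
       .agg(F.count("*").alias("order_count"), F.sum("amount").alias("total_amount"))
)

# Write partitioned Delta output (Delta is available on Databricks clusters).
(curated.write
        .format("delta")
        .mode("overwrite")
        .partitionBy("order_date")
        .save(CURATED_PATH))
```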
Client: YASH Technologies, India | APR 2019 - OCT 2021
Role: Data Engineer / Snowflake Data Engineer
Responsibilities:
Participated in daily scrum discussions; collaborated with the team to define stories and discuss status and blockers.
Built a new data lake environment by leveraging cloud technologies such as AWS and Snowflake.
Found application-level optimization opportunities, performed POCs, and evaluated the fitment of new tools and technologies.
Analyzed and defined the approach for migrating data from on-premises Oracle and SQL Server databases to the Snowflake cloud data platform.
Developed reusable stored procedures on Snowflake for loading Type I and Type II tables.
Implemented the framework for the ingestion process from Oracle to Snowflake.
Configured entries for the ingestion process to extract data from Oracle and ingest it into Snowflake using a Python framework.
Utilized AWS services and built Docker images using Jenkins that run on an EKS cluster.
Designed and documented the steps for validating data migrated from Oracle and SQL Server to Snowflake (cloud data platform).
Scripted the movement of data between multiple sources and databases using Python and UNIX.
Provided technical vision and leadership, and hands-on technology solution implementation to meet business requirements.
Collaborated with peers and the leadership team on process improvement ideas, policy and procedure enhancements, and opportunities to improve the customer service experience.
Provided continuous support to different teams and vendor partners for support issues on the pipeline or build delivery.
Worked with Infrastructure and Network Security teams to implement security and authentication between Snowflake and GCP/AWS.
Worked closely with Data Quality and Governance teams to provide AWS CloudWatch logs and query usage by users.
Worked with GCP and with AWS Redshift, Glue, and Lambda for ETL processes.
Developed scripts, stored procedures, views, functions, Streams, Tasks, workflows, and complex queries (a minimal Streams & Tasks sketch follows this section).
Wrote analysis documents on existing ETL/ELT load processes to help with effort estimation for the migration.
Provided level-of-effort estimates, wrote detailed design/solution documents, and developed ETL loading strategies along with offshore developers.
Crafted complex, high-performance SQL in RDBMS (SQL Server, DB2, Oracle, Teradata), distributed query engines (Snowflake, Redshift, Spark SQL), and NoSQL systems.
Validated data loads from source to stage and then to the ODS and data mart layers.
Built an automated process to perform database- and ETL-level object dependency analysis.
Used troubleshooting skills and worked closely with operations, support, engineering, and other functions to ensure successful migrations.
Supported end-to-end data flow in the new cloud database/data platform.
Worked with data engineers on development efforts including designing, coding, testing, and debugging during each cycle.
Accountable for enhancements and testing of new changes to the Snowflake cloud platform.
Performed database-level dependency analysis on existing and new database objects and identified objects that could be decommissioned or migrated to another existing database.
Identified risks, actions, and issues through proactive communication and collaboration with stakeholders and various teams, ensuring all stakeholders were on board.
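The Streams & Tasks CDC pattern referenced in this role can be illustrated with a short sketch. The snippet below, driven from Python with snowflake-connector-python (in the spirit of the ingestion framework described above), creates a stream on a staging table and a scheduled task that merges captured changes into a target table. Every object name, schedule, and credential here is a hypothetical placeholder, not a value from the actual project, and deletes are ignored to keep the sketch short.

```python
# Minimal illustrative sketch (not project code) of the Snowflake Streams &
# Tasks CDC pattern, driven from Python with snowflake-connector-python.
# Object names, schedule, and credentials are hypothetical placeholders;
# deletes and soft-deletes are ignored to keep the sketch short.
import snowflake.connector

DDL_STATEMENTS = [
    # Capture changes on the staging table.
    "CREATE STREAM IF NOT EXISTS stg.orders_stream ON TABLE stg.orders",
    # Scheduled task that merges captured changes into the target table.
    """
    CREATE TASK IF NOT EXISTS stg.orders_cdc_task
      WAREHOUSE = etl_wh
      SCHEDULE = '15 MINUTE'
      WHEN SYSTEM$STREAM_HAS_DATA('stg.orders_stream')
    AS
      MERGE INTO ods.orders AS tgt
      USING stg.orders_stream AS src
        ON tgt.order_id = src.order_id
      WHEN MATCHED THEN UPDATE SET status = src.status, amount = src.amount
      WHEN NOT MATCHED THEN INSERT (order_id, status, amount)
        VALUES (src.order_id, src.status, src.amount)
    """,
    # Tasks are created suspended; resume to start the schedule.
    "ALTER TASK stg.orders_cdc_task RESUME",
]


def main() -> None:
    # Placeholder credentials; in practice these would come from a secure store.
    conn = snowflake.connector.connect(
        account="example_account",
        user="etl_user",
        password="********",
        warehouse="etl_wh",
        database="analytics",
    )
    try:
        cur = conn.cursor()
        for stmt in DDL_STATEMENTS:
            cur.execute(stmt)
    finally:
        conn.close()


if __name__ == "__main__":
    main()
```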
Client: Infinite Locus, India | JUN 2016 - MAR 2019
Role: Data Engineer / ETL Developer
Responsibilities:
Participated in discussions with business users, gathered requirements for future enhancements, supported existing features, and fixed fallouts as needed.
Participated in daily scrum discussions; collaborated with the team to define stories and discuss status and blockers.
Assessed the current architecture and created a detailed migration plan using a work breakdown structure to establish the critical path, agreed-upon milestones/deliverables, and resource plans according to the cloud migration strategy.
Assisted in data management deliverables such as business needs analysis and high-level data modeling techniques.
Worked with technical and business peers on development efforts including designing, analyzing, assessing, and reporting.
Developed scripts, stored procedures, views, functions, Streams, Tasks, workflows, and complex queries.
Wrote analysis documents on existing ETL/ELT load processes to help with effort estimation for the migration.
Provided level-of-effort estimates, wrote detailed design/solution documents, and developed ETL loading strategies along with offshore developers.
Worked with Infrastructure and Network Security teams to implement security and authentication between Snowflake and GCP/AWS.
Worked closely with Data Quality and Governance teams to provide AWS CloudWatch logs and query usage by users.
Worked with GCP and with AWS Redshift, Glue, and Lambda for ETL processes.
Crafted complex, high-performance SQL in RDBMS (SQL Server, DB2, Oracle) and distributed query engines.
Validated data loads from source to stage and then to the ODS and data mart layers.
Built an automated process to perform database- and ETL-level object dependency analysis.
Used troubleshooting skills and worked closely with operations, support, engineering, and other functions to ensure successful migrations.
Developed reusable stored procedures on Snowflake for loading Type I and Type II tables.
Implemented the framework for the ingestion process from Oracle to Snowflake.
Supported end-to-end data flow in the new cloud database/data platform.
Worked with data engineers on development efforts including designing, coding, testing, and debugging during each cycle.
Accountable for enhancements and testing of new changes to the Snowflake cloud platform.
Performed database-level dependency analysis on existing and new database objects and identified objects that could be decommissioned or migrated to another existing database.
Identified risks, actions, and issues through proactive communication and collaboration with stakeholders and various teams, ensuring all stakeholders were on board.

Client: VARITE INDIA PRIVATE LIMITED, India | AUG 2015 - MAY 2016
Role: ETL Developer
Responsibilities:
Worked closely with business teams and product owners to understand business needs.
Understood business needs and documented requirements and mappings per the reporting needs.
Met with source teams and business analysts to fill gaps and document data elements per reporting requirements.
Created mappings using Mapping Designer to load data from various sources using transformations such as Aggregator, Expression, Stored Procedure, External Procedure, Filter, Joiner, Lookup, Router, Sequence Generator, Source Qualifier, and Update Strategy.
Designed and developed business intelligence data warehousing solutions using Informatica for heterogeneous data sources such as flat files, XML files, and relational sources.
Tested, debugged, validated, and performed performance tuning on the developed solutions.
Deployed code after successful review and sign-off from the customer on the agreed and developed results.
Worked closely with Data Quality and Governance teams to provide AWS CloudWatch logs and query usage by users.
Coordinated with different test and development teams to resolve data gaps between different modules.
Performed impact analysis for changes in the system and provided feasible solutions.
Wrote SQL scripts for data being extracted and moved through different systems.
Analyzed data and technical system problems to design and implement effective, flexible solutions.
Supported the implementation, testing, and validation of applications/systems.
Assisted with ongoing data architecture processes and governance.
Performed data audits to identify problem areas and implemented strategies to optimize data integrity and process improvements (a minimal reconciliation sketch follows this section).
Worked with Infrastructure and Network Security teams to implement security and authentication between Snowflake and GCP/AWS.
Developed Python scripts for change detection and automated database changes.
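The load-validation and data-audit work in the ETL roles above usually starts with simple source-to-target reconciliation queries. Below is a minimal, illustrative Python sketch of a row-count reconciliation; sqlite3 stands in for the real source and target connections (for example Oracle or SQL Server drivers) only so the sketch is self-contained, and the table list is hypothetical.

```python
# Minimal illustrative sketch (not project code) of a source-to-stage
# row-count reconciliation audit. sqlite3 stands in for the real source and
# target connections only so the sketch runs as-is; the table list is
# hypothetical.
import sqlite3

TABLES_TO_AUDIT = ["orders", "customers"]  # placeholder table list


def row_count(conn, table: str) -> int:
    """Run a simple COUNT(*) audit query against one system."""
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]


def reconcile(source_conn, target_conn, tables):
    """Compare counts table by table and report any gaps."""
    mismatches = []
    for table in tables:
        src, tgt = row_count(source_conn, table), row_count(target_conn, table)
        status = "OK" if src == tgt else "MISMATCH"
        print(f"{table}: source={src} target={tgt} [{status}]")
        if src != tgt:
            mismatches.append((table, src, tgt))
    return mismatches


if __name__ == "__main__":
    # Stand-in databases so the sketch is self-contained.
    source, target = sqlite3.connect(":memory:"), sqlite3.connect(":memory:")
    for conn, rows in ((source, 3), (target, 2)):
        conn.execute("CREATE TABLE orders (id INTEGER)")
        conn.execute("CREATE TABLE customers (id INTEGER)")
        conn.executemany("INSERT INTO orders VALUES (?)", [(i,) for i in range(rows)])
    issues = reconcile(source, target, TABLES_TO_AUDIT)
    print("audit passed" if not issues else f"{len(issues)} table(s) need investigation")
```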