
Aravind Goud Mandapaka - Sr. Data Engineer / Azure Cloud Engineer / Azure Solutions Architect
[email protected]
Location: Remote, USA
Relocation: no
Visa: H1B

PROFESSIONAL SUMMARY:
Driven professional with more than 9 years of experience in data architecture, engineering, automation, pipelines, integration, lineage, quality, governance, and orchestration for large-scale end-to-end business scenarios.
Proven ability to interact effectively with stakeholders, understand their unique business requirements, and deliver tailored solutions that align with organizational objectives.
Strong knowledge of implementing data warehousing and integration solutions using Azure cloud services, Snowflake, Python, Airflow, DBT, Azure Data Factory, Databricks, Docker, and Kubernetes.
Implemented Star and Snowflake schemas and Slowly Changing Dimensions during dimensional modeling for numerous scenarios.

EDUCATION:
Master of Science in Information Systems Aug 2017 -- Aug 2019
Northeastern University, Boston

Bachelor of Engineering in Information Technology July 2011 -- May 2015
Osmania University, Hyderabad, India.

TECHNICAL SKILLS:
Languages: Java, Python, PySpark, SQL
Azure Cloud: Storage, Monitoring, Security, Networking, Compute & IAM
Databases: Azure Synapse, PostgreSQL, Snowflake, MSSQL, MySQL, MongoDB
ETL: DBT, Apache Airflow, Spark, Azure Data Factory, Databricks, SSIS, SSAS, Talend
BI Reporting: Power BI, Tableau
Others: Docker, Kubernetes, NumPy, Pandas, AWS, Linux, Postman, Flyway, Git

WORK EXPERIENCE
Ocean Spray Cranberries, Data Engineer Oct. 2019 -- Present
Architected and implemented medium- to large-scale BI solutions on Azure Data Platform services (App Service, Azure Data Lake, Data Factory, HDInsight, Databricks).
Migrated existing pipelines and databases from on-prem systems to Azure, reducing development/maintenance costs by 40% and improving efficiency.
Collaborated with internal customers to gather requirements, delivered cloud solutions via POCs, and conducted post-POC analysis.
Designed and implemented a Data Mesh; automated templated setup of data products and domains, generating their respective API, Airflow, DBT, and Snowflake artifacts with a single API call using Python and Kubernetes.
Built, and continue to support, an enterprise-level data warehouse in Azure PostgreSQL, and supported creation of Finance, Consumption, Commercial, Trade, and Supply Chain models in Power BI on top of the EDW.
Worked on live streaming data and a batch FTP process for the Trade Promotion Management system and Consumption data, involving ETL, archive setup, and QC anomaly detection.
Led the implementation of a high-impact use case leveraging Apache Spark to optimize large-scale data processing, improving processing speed by 30% and reducing overall resource consumption by 20%. Developed and fine-tuned complex Spark jobs to handle massive datasets within stringent performance requirements, and implemented data partitioning and caching strategies that improved job efficiency and reduced execution times (see the Spark sketch after this role's highlights).
Conducted performance tuning and optimization, identifying bottlenecks and implementing fixes that improved Spark job execution and the overall data processing pipeline.
Served as the single point of contact to set up and manage Azure database servers, roles, rules, and CI/CD pipelines for the Data Platform and Advanced Analytics teams.
Designed and deployed Spark-based workflows on Azure Databricks, enabling scalable data processing and analytics. Utilized Databricks notebooks to develop and optimize Spark jobs, achieving a 25% improvement in overall processing speed and enhancing the responsiveness of data-driven applications.
Collaborated with cross-functional teams to integrate Azure Databricks into the broader data ecosystem, facilitating seamless data flow and enhancing the overall data architecture.
Created an alternate disaster recovery mechanism by loading data from a restored server to current PROD/DEV servers using Azure Data Factory and Blob Storage.
Enabled real-time analysis by creating a single data pipeline to ingest all of North America's supply chain data from numerous API endpoints, thereby improving manufacturing plant efficiency by 28%.
Automated DBT and Flyway to set up Snowflake artifacts, reducing persistent storage costs by 15%.
Scheduled CronJobs to run data pipelines at regular intervals on Azure Kubernetes Service and set up an alerting mechanism (see the CronJob sketch below).
Set up workflows in Apache Airflow, creating DAGs to run Python scripts, DBT macros, and Data Mesh jobs in DEV/QA/PROD environments (see the DAG sketch below).
Designed and documented best practices to support continuous process automation for data modeling, ingestion, deployment, and pipeline workflows.
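
As an illustration of the Spark partitioning and caching strategies mentioned above, here is a minimal PySpark sketch; the paths, table, and column names are hypothetical placeholders, not the actual production datasets.

    from pyspark.sql import SparkSession

    # Minimal sketch: repartition on the aggregation key and cache a frame
    # that is reused downstream. All paths and columns are placeholders.
    spark = (
        SparkSession.builder
        .appName("sales-aggregation")
        .config("spark.sql.shuffle.partitions", "200")  # tune to cluster size
        .getOrCreate()
    )

    raw = spark.read.parquet("/mnt/datalake/sales_raw")

    # Repartitioning on the grouping key reduces shuffle skew; caching avoids
    # recomputing the frame for each downstream aggregation.
    sales = raw.repartition(200, "region").cache()

    daily = sales.groupBy("region", "order_date").sum("amount")
    daily.write.mode("overwrite").partitionBy("region").parquet("/mnt/datalake/sales_daily")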
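
A minimal sketch of scheduling a pipeline as a Kubernetes CronJob on AKS, using the official Python client; the image, namespace, schedule, and command are illustrative assumptions, not the production configuration.

    from kubernetes import client, config

    # Sketch: create a CronJob that runs a pipeline container nightly.
    # Image, namespace, and schedule below are placeholders.
    config.load_kube_config()  # use config.load_incluster_config() inside the cluster

    container = client.V1Container(
        name="pipeline",
        image="myregistry.azurecr.io/pipelines:latest",  # hypothetical image
        command=["python", "/app/run_pipeline.py"],
    )

    cron_job = client.V1CronJob(
        metadata=client.V1ObjectMeta(name="nightly-pipeline"),
        spec=client.V1CronJobSpec(
            schedule="0 2 * * *",  # nightly at 02:00
            job_template=client.V1JobTemplateSpec(
                spec=client.V1JobSpec(
                    template=client.V1PodTemplateSpec(
                        spec=client.V1PodSpec(restart_policy="OnFailure", containers=[container])
                    )
                )
            ),
        ),
    )

    client.BatchV1Api().create_namespaced_cron_job(namespace="data-pipelines", body=cron_job)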
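
And a minimal sketch of the kind of Airflow DAG described above, chaining an ingestion script into a dbt run; the DAG id, schedule, and commands are assumptions for illustration.

    from datetime import datetime

    from airflow import DAG
    from airflow.operators.bash import BashOperator

    # Illustrative DAG: run an ingestion script, then dbt models.
    # Names, schedule, and paths are placeholders, not the production setup.
    with DAG(
        dag_id="daily_ingest_and_transform",
        start_date=datetime(2023, 1, 1),
        schedule_interval="0 6 * * *",  # daily at 06:00 UTC
        catchup=False,
    ) as dag:
        ingest = BashOperator(
            task_id="ingest",
            bash_command="python /opt/pipelines/ingest.py",
        )
        dbt_run = BashOperator(
            task_id="dbt_run",
            bash_command="cd /opt/dbt_project && dbt run --target prod",
        )
        ingest >> dbt_run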

Nationwide Insurance, Data Engineer Sep. 2017 -- Sep. 2019
Developed transformation logic to integrate data from a Hadoop data lake (HiveQL), ODBC cubes, data marts, and Excel spreadsheets, storing the results in SQL Server using Azure Data Factory (a loading sketch follows this role's highlights).
Designed and implemented a robust use case utilizing PowerShell scripting to automate and streamline daily ETL (Extract, Transform, Load) processes within the data pipeline.
Performed analysis of system data to identify patterns, exceptions, and erroneous data.
Created high-level process models (activity hierarchy diagrams, data flow diagrams, workflow diagrams, system interface diagrams) to represent business processes for the requirements.
Worked with Power BI Service and Power BI Desktop.
Created advanced DAX functions to display data analysis in sales and inventory reports.
Worked in an Agile development environment with frequently changing requirements.
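
A minimal sketch of the spreadsheet-to-SQL Server loading step referenced above, using pandas and SQLAlchemy in place of the actual Data Factory pipelines; the file, table, and connection string are hypothetical.

    import pandas as pd
    from sqlalchemy import create_engine

    # Sketch: load an Excel extract, apply light cleanup, and stage it in
    # SQL Server. Every name here is a placeholder.
    engine = create_engine(
        "mssql+pyodbc://user:password@server/reporting"
        "?driver=ODBC+Driver+17+for+SQL+Server"
    )

    df = pd.read_excel("monthly_sales.xlsx", sheet_name="data")
    df["load_date"] = pd.Timestamp.now(tz="UTC")   # audit column
    df = df.dropna(subset=["policy_id"])           # drop incomplete rows

    df.to_sql("stg_monthly_sales", engine, if_exists="replace", index=False)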

Accenture Technology, Software Engineer Jun. 2014 -- Aug. 2017
Developed an in-depth understanding of the corporation's business processes and data, and applied that knowledge to develop and maintain the Corporate Data Model and a Metadata Portal.
Implemented cloud services on Azure, including Azure Storage, Azure Data Factory, Azure Containers, Databricks, and network security groups.
Used Azure Data Factory to extract, transform, and load data (ETL/ELT).
Implemented data pipelines across data stores using Talend.
Used Azure Data Factory for data transformation, validation, and cleaning.
Wrote and implemented SQL queries and stored procedures.
Ingested data into Azure services (Azure Storage, Azure SQL) and processed data in Azure Databricks.
Worked with cloud databases and data warehouses (Azure SQL, Amazon RDS).
Created on-demand tables over Azure Blob files using ADF.
Created stored procedures, triggers, functions, and subqueries/joins to aggregate and validate data between source and target systems (a validation sketch follows this role's highlights).
Enhanced (or created, for new systems) the system logical data model to meet requirements, applying the Data Modeling Standards and the logical Data Naming Conventions, and made the same enhancements in the physical data model using the physical Data Naming Conventions appropriate for the specific back-end database.
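
As a minimal illustration of the source-to-target validation described above, here is a Python/pyodbc sketch (the original work used stored procedures; connection strings and table names below are placeholders).

    import pyodbc

    # Sketch: compare row counts and a sum-based aggregate between a source
    # and a target table. All connection details and names are placeholders.
    SRC = "DRIVER={ODBC Driver 17 for SQL Server};SERVER=src-server;DATABASE=staging;Trusted_Connection=yes"
    TGT = "DRIVER={ODBC Driver 17 for SQL Server};SERVER=tgt-server;DATABASE=warehouse;Trusted_Connection=yes"

    QUERY = "SELECT COUNT(*) AS row_count, COALESCE(SUM(amount), 0) AS amount_total FROM {table}"

    def profile(conn_str, table):
        # Return (row_count, amount_total) for the given table.
        with pyodbc.connect(conn_str) as conn:
            row = conn.cursor().execute(QUERY.format(table=table)).fetchone()
            return row.row_count, row.amount_total

    src_profile = profile(SRC, "dbo.orders")
    tgt_profile = profile(TGT, "dbo.fact_orders")

    if src_profile != tgt_profile:
        raise ValueError(f"Source/target mismatch: {src_profile} vs {tgt_profile}")
    print("Validation passed:", src_profile)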