Sridhar Purapati
Snowflake Data Engineer
Phone: (732) 813-3639 (Ext: 104)
Email: [email protected]
Location: Dallas, Texas, USA
Work Preference: Remote / Hybrid
Visa: H1B
Experience: 10+ years in IT
Summary:
- 10+ years of IT experience in the data field, with expertise in designing data-intensive applications using Snowflake, Informatica, Denodo, and Databricks for cloud data warehouse / data mart, data visualization, data virtualization, and data quality solutions.
- Experience in architecting, designing, and operationalizing large-scale data and analytics solutions on the Snowflake Cloud Data Warehouse.
- Functional knowledge of GSC, MFG, and Sales & Marketing business module processes.
- Worked as ETL Lead and Architect handling small-sized projects.
- Experience migrating on-premises ETL code to the cloud.
- Worked on Snowflake cost optimization, resizing virtual warehouses based on workloads.
- Expertise in data management best practices such as data integration, data security, data warehousing, data analytics, metadata management, and data quality.
- Strong IT background in ETL tools such as Informatica PowerCenter 9 and Pentaho 8, and in AWS S3, Snowflake, and Denodo technologies.
- Experience building data pipelines using Azure Data Factory and Azure Databricks, loading data into Azure Data Lake and Delta Lake tables, and granting database access.
- Developed and maintained data pipelines and worked with large data sets in ADF.
- Performed various transformations/actions using PySpark, saving the results back to Delta Lake and then loading them into the target Snowflake database (a sketch follows this summary).
- Good knowledge of Amazon Web Services such as EMR, S3 buckets, EC2 instances, RDS, CloudWatch, and Redshift, and of data ingestion with AWS Lambda functions.
- Experience with Snowflake utilities, SnowSQL, Snowpipe, and big data modeling techniques using Python.
- Experience with Snowflake virtual warehouses, Time Travel, and database, schema, and table structures.
- Worked on the Denodo data virtualization layer for near-real-time data sources.
- Worked on creating Pentaho transformations and jobs for data integration with heterogeneous sources.
- Defined virtual warehouse sizing in Snowflake for different types of workloads.
- Excellent understanding of relational and dimensional data models.
- Worked on the Fivetran ELT tool for data ingestion into Snowflake.
- Worked with Python pandas and PySpark for data manipulation.
- Experience building and migrating large-scale data platforms using Snowflake.
- Strong knowledge of DW concepts, dimensional modeling, SCD2, data modeling, and ETL/ELT.
- Strong hands-on experience tuning SQL queries and ETL mappings.
- Good knowledge of big data file formats such as Parquet, Avro, and ORC.
- Good experience extracting data from various file formats such as XML, JSON, and flat file sources.
- Created ETL mappings using Informatica PowerCenter to transfer data from multiple sources, such as flat files, demand and supply planning systems, and Oracle ERP, into a common target area.
- Experience in performance tuning of mappings and sessions and in eliminating source and target bottlenecks.
- Implemented SCD Type 2 for historical loads of the customer dimension.
- Involved in test plans, error-handling strategy, and performance tuning for jobs and interfaces.
- Created extensive documentation on the development, implementation, and daily loads of ETL jobs and interfaces.
- Strong knowledge of the Ralph Kimball methodology: star schema, snowflake schema, fact tables, and dimensions.
- Good knowledge of integration and replication tools such as HVR, ODI, Matillion, Talend, and DBT.
- Good knowledge of Oracle SQL and PL/SQL.
- Good knowledge of UNIX and shell scripting.
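The PySpark-to-Delta-Lake-to-Snowflake flow mentioned above, as a minimal sketch assuming a Databricks cluster with the Snowflake Spark connector installed; the paths, credentials, column names, and the POS_SALES table are illustrative placeholders, not actual project objects.

```python
from pyspark.sql import SparkSession, functions as F

SNOWFLAKE_SOURCE = "net.snowflake.spark.snowflake"  # requires the Snowflake Spark connector

spark = SparkSession.builder.appName("pos_delta_to_snowflake").getOrCreate()

# Illustrative paths and connection options -- placeholders only.
raw_pos_path = "dbfs:/mnt/raw/pos/"            # raw POS extracts
delta_pos_path = "dbfs:/mnt/delta/pos_sales"   # curated Delta table
sf_options = {
    "sfURL": "<account>.snowflakecomputing.com",
    "sfUser": "<user>",
    "sfPassword": "<password>",
    "sfDatabase": "ANALYTICS",
    "sfSchema": "POS",
    "sfWarehouse": "LOAD_WH",
}

# Read raw files, apply simple transformations, and persist to Delta Lake.
raw_df = spark.read.option("header", "true").csv(raw_pos_path)
curated_df = (
    raw_df
    .withColumn("sale_date", F.to_date("sale_date", "yyyy-MM-dd"))
    .withColumn("quantity", F.col("quantity").cast("int"))
    .dropDuplicates(["order_id", "line_id"])
)
curated_df.write.format("delta").mode("overwrite").save(delta_pos_path)

# Reload the Delta table and push it to the target Snowflake table.
(
    spark.read.format("delta").load(delta_pos_path)
    .write.format(SNOWFLAKE_SOURCE)
    .options(**sf_options)
    .option("dbtable", "POS_SALES")
    .mode("overwrite")
    .save()
)
```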
Technical Skills:
ETL & ELT Tools: Informatica PowerCenter 9, Pentaho 8, ODI, Matillion, Talend
Replication Tools: HVR, Fivetran, DBT
AWS Cloud Services: S3, Amazon Athena, CloudWatch, SQS, SNS
Azure Cloud Services: Azure Data Factory, Databricks, ADLS Gen2
RDBMS: Oracle, MySQL, SQL Server
Cloud DW: Snowflake, Redshift
Operating Systems: Windows 7/8/XP/2008/2012, Ubuntu Linux
Programming & Scripting: Python, PySpark, SQL, PL/SQL, Unix shell scripting
Database Modeling: Dimensional Modeling, Star Schema Modeling, Snowflake Modeling
Data Virtualization: Denodo
Visualization/Reporting: Tableau, Power BI
Version Control Tools: GitHub
SDLC Process: Agile Software Development, Waterfall Methodology

Education:
Bachelor of Technology (B.Tech) from JNTU, 2007.

Certifications:
SnowPro Core Certification: https://www.credly.com/badges/3a852ac8-445c-4a44-baa2-019b85fc51d5
Azure Data Engineer Certification: https://www.credly.com/badges/09e99597-086f-4af9-8702-8b28f2cfdf77/

Professional Experience:
- Worked as a Tech Lead at Logitech from Aug 2018 to Jun 2023.
- Worked as a Senior Software Engineer at Zensar from Feb 2014 to Aug 2018.
- Worked as an ETL Developer at J2LOGIX from Apr 2013 to Jan 2014.

Project 1:
Project: Logitech Portfolio Projects on MFG, GSC & S&M Modules
Client: Logitech, USA
Role: Tech Lead
Duration: Aug 2018 to Jun 2023
Technologies: Snowflake, Denodo, Informatica, Pentaho, AWS S3, Databricks, Matillion

Project Description:
Logitech principally deals in hardware and computer peripherals such as mice, keyboards, and PC-compatible cameras. Logitech upgraded its DWH environment from on-premises to the cloud to cater to its day-to-day business needs, and as part of the data warehouse R12 conversion decided to convert the Informatica code to Pentaho to cater to its business intelligence (BI) reporting requirements. These BI analytics mainly cover the Sales, Finance, MFG, GSC, Order Entry, and Forecasting modules. The data warehouse is built on the Snowflake cloud database. Data from Oracle EBS R12 is loaded into the data warehouse using the ETL tool Pentaho, with AWS S3 as the staging area (this staged-load pattern is sketched below). In between, Denodo is used as a virtual data layer to connect the reporting tools. Tableau, OBIEE, and Hyperion are primarily used as reporting and web publishing tools.
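A minimal sketch of the S3-staging load pattern described above, using the snowflake-connector-python client to run the stage and COPY INTO statements; the stage, storage integration, table, bucket, and credentials are illustrative placeholders rather than actual project objects. Snowpipe wraps the same COPY INTO statement for continuous loading.

```python
import snowflake.connector

# Placeholder connection details -- replace with real account settings.
conn = snowflake.connector.connect(
    account="<account_identifier>",
    user="<user>",
    password="<password>",
    warehouse="LOAD_WH",
    database="EDW",
    schema="STAGING",
)

try:
    cur = conn.cursor()

    # External stage over the S3 landing area (storage integration assumed to exist).
    cur.execute("""
        CREATE STAGE IF NOT EXISTS S3_SALES_STAGE
          URL = 's3://example-bucket/landing/sales/'
          STORAGE_INTEGRATION = S3_INT
          FILE_FORMAT = (TYPE = CSV SKIP_HEADER = 1 FIELD_OPTIONALLY_ENCLOSED_BY = '"')
    """)

    # Bulk-load the staged files into the staging table.
    cur.execute("""
        COPY INTO SALES_STG
        FROM @S3_SALES_STAGE
        ON_ERROR = 'CONTINUE'
    """)
finally:
    conn.close()
```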
Responsibilities:
- Worked as data lead to design the ETL pipeline per the requirements.
- Created database objects such as tables, views, stages, and file formats in Snowflake.
- Worked on Fivetran to load data into Snowflake from different data sources.
- Worked on Pentaho to create transformations and jobs to load data into the target Snowflake database.
- Retrofitted enterprise data warehouse Informatica ETL code to the Snowflake/AWS cloud environment.
- Worked on Snowpipe for Logitech IoT event streaming data from AWS CloudWatch.
- Used Python pandas packages for file conversions, such as Excel to CSV format.
- Involved in migrating on-premises historical data to the Snowflake cloud data warehouse.
- Used Talend and Pentaho ETL strategies, tools, and technologies to build enterprise data sources.
- Extracted data from various file formats such as XML, JSON, and flat file sources.
- Involved in creating mappings using the Pentaho 8 ETL tool.
- Involved in creating Denodo base views and real-time data sources from heterogeneous sources.
- Worked on Denodo to create business views for business stakeholders.
- Wrote complex SQL scripts in the Snowflake cloud data warehouse for business analysis and reporting.
- Created test cases and performed unit testing.
- Handled, loaded, and implemented delta loads for huge volumes of data.
- Worked on HVR replication tasks for Oracle EBS with the DBA team.
- Worked as ETL architect handling small-sized projects.
- Built Snowpipe for continuous data loads and used Clone and Time Travel.
- Used temporary and transient tables on different data sets along with auto-scale functionality.
- Involved in creating complex cloud-based queries in Snowflake.
- Coordinated with onsite counterparts and communicated project status and concerns daily.
- Planned and executed ETL tool migrations working with different teams such as the DBA and BI-OPS teams.
- Used PySpark to perform various transformations/actions in Databricks, saving the result data back to Delta Lake and then to the target Snowflake database.
- Moved large amounts of data between other AWS data stores, such as Amazon Simple Storage Service (Amazon S3), and Snowflake using the COPY command.
- Implemented data streaming with Snowpipe and storage integrations.
- Performed performance tuning and cost optimization by implementing dynamic warehouse selection based on data volumes (see the warehouse-sizing sketch after this list).
- Worked on Databricks PySpark to develop DataFrames for POS data.
- Developed and maintained end-to-end ETL data pipelines and worked with large data sets using Azure Databricks and Data Factory.
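A rough sketch of the dynamic warehouse-sizing idea referenced in the cost-optimization bullet above, again via snowflake-connector-python; the thresholds, warehouse name, and connection details are illustrative assumptions, not the actual Logitech logic.

```python
import snowflake.connector


def resize_warehouse_for_load(conn, warehouse: str, pending_rows: int) -> None:
    """Pick a warehouse size from a simple row-count rule and apply it.

    The thresholds are illustrative; real sizing would be driven by observed
    workload metrics such as queue times, credit usage, and data volumes.
    """
    if pending_rows < 1_000_000:
        size = "XSMALL"
    elif pending_rows < 50_000_000:
        size = "MEDIUM"
    else:
        size = "LARGE"

    cur = conn.cursor()
    # Identifiers cannot be bound as parameters; the size comes from the fixed list above.
    cur.execute(f"ALTER WAREHOUSE {warehouse} SET WAREHOUSE_SIZE = '{size}'")


if __name__ == "__main__":
    # Placeholder credentials -- replace with real account settings.
    conn = snowflake.connector.connect(
        account="<account_identifier>",
        user="<user>",
        password="<password>",
    )
    try:
        resize_warehouse_for_load(conn, "LOAD_WH", pending_rows=75_000_000)
    finally:
        conn.close()
```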
Project 2:
Project: EBS-R12 Conversion / Cloud Implementation
Client: Logitech, USA
Role: ETL Developer
Duration: Feb 2014 to Aug 2018
Technologies: Informatica, Snowflake, Pentaho, AWS services, Denodo, Oracle

Project Description:
Logitech principally deals in hardware and computer peripherals such as mice, keyboards, and PC-compatible cameras. Logitech upgraded Oracle 11i ERP to EBS R12 to cater to its day-to-day business needs and decided to convert the 11i data warehouse to R12 to cater to its business intelligence (BI) reporting requirements. These BI analytics mainly cover the Sales, Finance, Order Entry, and Forecasting modules. The data warehouse is built on the Oracle 11g database, and data from Oracle EBS R12 is loaded into it using the ETL tool Informatica. Business Objects, OBIEE, and Hyperion are primarily used as reporting and web publishing tools. Logitech subsequently started a cloud implementation in Snowflake using the Pentaho ETL tool, AWS S3, and Denodo.

Responsibilities:
- Retrofitted existing mappings per R12 business logic.
- Retrofitted existing 11i tables per R12 business logic.
- Migrated on-premises databases to the Snowflake cloud data warehouse.
- Involved in creating new mappings using Lookup, Update Strategy, Expression, and Router transformations.
- Involved in creating sessions and workflows.
- Analyzed 11i data and involved in updating the conversion data.
- Involved in creating bridge tables to update the R12 conversion data.
- Created test cases and performed unit testing.
- Involved in creating mappings using the Pentaho 8 ETL tool.
- Worked with AWS services: S3, Redshift, and Snowflake.
- Involved in creating cloud database objects in Redshift and Snowflake.
- Created different types of Snowflake tables during the cloud migration.
- Planned and executed ETL tool migrations working with different teams such as the DBA and BI-OPS teams.
- Involved in ingesting data from different sources into Snowflake via AWS S3.
- Fixed data type issues while retrofitting code from on-premises to the cloud.
- Involved in creating Denodo base/business view objects as per the requirements.
- Coordinated with onsite counterparts and communicated project status and concerns daily.

Project 3:
Project: Calix SFDC-Oracle Integration
Client: Calix, USA
Role: ETL Developer
Duration: Apr 2013 to Jan 2014
Technologies: Informatica PowerCenter 9, Windows, UNIX, Oracle (EBS), Salesforce (PSA)

Project Description:
Calix is a North American provider of broadband communications access systems. Calix has shipped over eight million ports of its Unified Access portfolio to more than 900 North American and international customers, whose networks serve over 40 million subscriber lines in total. Calix has a Professional Services Workbench application, in which project managers maintain project-related information, and an EBS application. The Calix PSA-EBS integration application was developed to integrate the PSA and EBS data bidirectionally. The proposed process is to implement the FinancialForce PSA module to replace the existing PMW workbench and integrate with Oracle EBS at the project level.

Responsibilities:
- Created mappings to load source data into target tables using transformations such as Expression, Filter, Router, Lookup, and Update Strategy.
- Utilized Lookup transformations to look up values from different tables.
- Created workflows and non-reusable sessions in Workflow Manager.
- Created session parameters to parameterize the database and application connections in ETL.
- Created a parameter file on the Informatica UNIX server to hold the connection object values and added execution permission on the parameter file.
- Worked on performance tuning using the Informatica Source Qualifier in ETL.
- Involved in unit testing and verifying the data in PSA and EBS.
- Used SQL queries to check the expected output data in Oracle (see the sketch after this list).
- Involved in verifying the expected output data in Salesforce (PSA).
- Used the Informatica Scheduler to schedule the workflow jobs every 5 minutes.
- Fixed bugs during QA and UAT testing.
- Involved in monitoring all workflow jobs in QA.
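A small sketch of the kind of SQL check used to verify expected output data in Oracle, shown here with the python-oracledb client as an assumption; the DSN, table, and column names are illustrative, not actual Calix objects.

```python
import oracledb

# Placeholder connection details -- replace with the real EBS database DSN.
conn = oracledb.connect(
    user="<user>",
    password="<password>",
    dsn="<host>:1521/<service_name>",
)

try:
    cur = conn.cursor()
    # Count projects loaded from PSA into the EBS-side staging table for the
    # current run date, grouped by status; table/column names are illustrative.
    cur.execute(
        """
        SELECT status, COUNT(*) AS project_count
        FROM xx_psa_projects_stg
        WHERE load_date = TRUNC(SYSDATE)
        GROUP BY status
        ORDER BY status
        """
    )
    for status, project_count in cur:
        print(status, project_count)
finally:
    conn.close()
```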