RAJESH MIRIAM - BI Lead Developer/ ETL Dev Lead |
[email protected] |
Location: Houston, Texas, USA |
Relocation: No |
Visa: H1B |
RAJESH MIRIAM
BI Lead Developer / ETL Dev Lead | [email protected] | Mobile: 5122561851

PROFESSIONAL SUMMARY:
- Over 19 years of IT experience in the analysis, design, development, and implementation of data warehousing solutions using ETL/Snowflake, BI, and Tidal/Control-M tools with Oracle, SQL Server, and Teradata databases on Windows/Unix platforms.
- Highly skilled Snowflake developer with 6+ years of experience designing, developing, and optimizing scalable, end-to-end data integration solutions in Snowflake.
- Experienced in all phases of the software development life cycle, with a forte for successful project implementations: analyzing systems, understanding and defining requirements, development, technical design and reviews, and project planning.
- Extensive experience in ETL processes covering data sourcing, mapping, transformation, conversion, and loading.
- Skilled in leveraging Snowflake's cloud data platform and ETL tools to extract, transform, and load data from diverse sources into Snowflake data warehouses, turning raw data from disparate sources into actionable insights with speed and accuracy.
- Proven track record of implementing scalable and efficient ETL pipelines in Snowflake, optimizing performance and resource utilization for enhanced data processing.
- Proficient in designing and developing the processes required to extract, transform, and load data into the data warehouse using the Snowflake cloud database and AWS S3.
- Expertise in building and migrating data warehouses on the Snowflake cloud database; played a key role in migrating Oracle objects into the Snowflake environment using AWS services.
- Expertise in working with Snowflake Snowpipe, internal/external stages, cloning, Tasks, and Streams.
- Used zero-copy cloning to clone databases for Dev and QA environments; familiar with storage considerations for staging and permanent databases/tables.
- Experience in Agile methodology.
- Created internal and external stages and transformed data during load; created integration objects, file formats, and stages, and used COPY INTO/Snowpipe to continuously ingest CSV/TXT/JSON data from AWS S3 buckets (a minimal sketch of this pattern follows this summary).
- Experienced in loading and unloading structured and semi-structured data in the Snowflake data warehouse.
- Experienced with the Continuous Data Protection lifecycle (Time Travel, Fail-safe); queried historical results/data by timestamp, offset, and query ID.
- Experienced in data migration from traditional on-premises systems to the cloud.
- Worked on Streams, secure views, and materialized views.
- Deep understanding of Snowflake architecture and processing, with performance tuning experience using the Query Profiler, caching, and virtual warehouse scaling.
- Hands-on with Snowpipe, zero-copy cloning of account objects, Time Travel, data retention settings, data masking, and cluster design and tuning.
- Experienced in user management, service account settings, and data sharing from Prod to Stage and Dev environments.
- Proficient in writing complex SQL scripts leveraging statistical aggregate and analytical functions to support ETL processes within the Snowflake cloud data warehouse.
- Established DBT processes to improve performance, scalability, and reliability; migrated legacy transformation code into modular DBT data models.
- Wrote and optimized SQL queries within DBT to enhance data transformation processes; used DBT to convert raw datasets into models that enable efficient analysis and reporting.
- Experience in analyzing large datasets to identify business opportunities and provide actionable insights to sales and marketing teams.
- Proficient knowledge and hands-on experience with extraction, transformation, and loading (ETL) using various versions of Informatica PowerCenter.
- Experience in dimensional data modeling, Slowly Changing Dimensions Types I, II, and III, ODS, star/snowflake schema modeling, fact and dimension tables, and OLAP and OLTP systems.
- Experience in dimensional layer modeling (facts, dimensions, business measures, grain of the fact), entity-relationship modeling, and in-depth knowledge of complex relational, star, and snowflake schema data models.
- Knowledgeable in Informatica administrative tasks, deployment to production, and scheduling with Informatica.
- Technically proficient in identifying and resolving performance bottlenecks; proficient in performance analysis, monitoring, and SQL query tuning in Teradata as well as Oracle.
- Involved in migration projects moving data from Oracle/DB2 data warehouses to Teradata.
- Working experience in UNIX shell scripting and Perl scripting.
- Strong analytical, logical, and conceptual programming skills; highly motivated, self-driven, result-oriented, customer-focused, and able to work in line with organizational objectives.
- Excellent at problem solving: identifying and researching errors or data issues, performing root cause analysis, and developing robust solution frameworks.
- Developed, designed, analyzed, and reviewed database objects: functions, stored procedures, cursors, triggers, and views.
- Drove and facilitated system testing and UAT, and conducted knowledge-sharing sessions for business users.
- Good understanding of API security models.
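Illustrative sketch of the stage / COPY INTO / Snowpipe pattern referenced above (a minimal outline only, not code from any engagement; the database, schema, stage, table, and pipe names and the S3 URL are hypothetical, and an existing storage integration named s3_int is assumed):

    -- Reusable CSV file format and an external stage over the S3 bucket
    CREATE OR REPLACE FILE FORMAT raw_db.public.csv_fmt
      TYPE = CSV FIELD_DELIMITER = ',' SKIP_HEADER = 1 NULL_IF = ('', 'NULL');

    CREATE OR REPLACE STAGE raw_db.public.s3_stage
      URL = 's3://example-bucket/orders/'
      STORAGE_INTEGRATION = s3_int
      FILE_FORMAT = (FORMAT_NAME = 'raw_db.public.csv_fmt');

    -- One-off bulk load from the stage into a raw table
    COPY INTO raw_db.public.orders_raw
      FROM @raw_db.public.s3_stage
      PATTERN = '.*orders_.*[.]csv';

    -- Continuous ingestion: Snowpipe auto-loads new files as they land in S3
    CREATE OR REPLACE PIPE raw_db.public.orders_pipe AUTO_INGEST = TRUE AS
      COPY INTO raw_db.public.orders_raw
      FROM @raw_db.public.s3_stage;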
PROFESSIONAL EXPERIENCE:

ETL Dev Lead | Shell | Sep 2022 to Present
Project Details: As an outcome of the 2008 collapse of the financial markets, the Dodd-Frank Act became law in the US in July 2010, establishing the Commodity Futures Trading Commission (CFTC) as a major new energy regulator. Energy companies that transact in energy commodities must reevaluate their practices because CFTC rules affect hedging, risk management, and derivatives transacting activities. The Dodd-Frank Repository (DFR) was put in place as a tactical solution to support Shell Trading's larger business initiative to comply with the CFTC mandates derived from the Dodd-Frank law. Failure to comply with the reporting timelines specified by the CFTC can result in a halt in trading activities, fines, and potential criminal charges against the company and the individuals concerned. Under regulation 17 CFR 45.4(c)(2)(ii), any Shell trading entity that is a registered swap dealer or major swap participant shall report collateral data electronically to a swap data repository.

Roles & Responsibilities:
- Bulk loaded data from the external stage (AWS S3) and internal stage into Snowflake using the COPY command; loaded data into Snowflake tables from the internal stage using SnowSQL.
- Played a key role in migrating Oracle database objects into the Snowflake environment using AWS services.
- Used COPY, LIST, PUT, and GET commands to validate internal stage files; imported and exported data between the internal stage (Snowflake) and the external stage (AWS S3).
- Wrote complex SnowSQL scripts in the Snowflake cloud data warehouse for business analysis and reporting.
- Used Snowpipe for continuous data ingestion from the S3 bucket.
- Created clone objects to maintain zero-copy cloning; cloned production data for code modifications and testing.
- Performed data validations through INFORMATION_SCHEMA and analyzed data quality issues using SnowSQL by building analytical warehouses on Snowflake.
- Experience with AWS cloud services: S3, IAM, roles, and SQS.
- Involved in source system analysis, mapping analysis, end-to-end data lineage analysis, data profiling, and data quality checks.
- Extracted, transformed, and ingested JSON data into Snowflake.
- Used Time Travel and data retention settings for crucial tables, which helped analyze data during testing.
- Wrote complex SQL scripts with joins, subqueries, correlated subqueries, and window functions, plus UDFs, UDTFs, and procedures for transformations into data warehouse fact and dimension tables.
- Worked with the DevOps, Snowflake admin, infrastructure, and project teams to ensure continuous operation of build and test systems.
- Worked with the business in UAT and PROD phases to assist with data deliverables, and with the support team on knowledge transfer as part of post-production support; involved in analysis, troubleshooting, and fixing issues.
- Wrote stored procedures to meet business requirements, implementing SCD Type 1 and Type 2 loads (a minimal Type 2 sketch follows this role).
- Used CloudWatch logs to monitor jobs and analyze failures.
- Experience with dimensional modeling using star and snowflake models.
- Developed and maintained DBT models for data transformation and analytics, including transformations, macros, and tests, ensuring data quality, consistency, scalability, and maintainability of data pipelines.
- Developed ETL/ELT pipelines using Snowflake features such as Snowpipe and Tasks for automated data loading and processing.
- Identified and addressed performance bottlenecks in Snowflake and DBT environments through query optimization, index tuning, and infrastructure scaling.
- Provided technical guidance and support to team members on Snowflake architecture and best practices.
- Conducted comprehensive assessments of source databases and applications to identify migration complexities, data dependencies, and potential risks; analyzed schema structures, data types, and stored procedures to determine compatibility and migration feasibility, providing recommendations for remediation and optimization.
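Illustrative sketch of the SCD Type 2 logic mentioned in this role (a simplified outline only; stg_customer, dim_customer, and their columns are hypothetical, and real loads also handled surrogate keys and Type 1 attributes):

    -- Step 1: close out current dimension rows whose tracked attributes changed
    UPDATE dim_customer
       SET effective_to = CURRENT_TIMESTAMP(),
           is_current   = FALSE
      FROM stg_customer s
     WHERE dim_customer.customer_id = s.customer_id
       AND dim_customer.is_current  = TRUE
       AND (dim_customer.customer_name <> s.customer_name
            OR dim_customer.segment <> s.segment);

    -- Step 2: insert a new current version for changed keys and brand-new keys
    INSERT INTO dim_customer
        (customer_id, customer_name, segment, effective_from, effective_to, is_current)
    SELECT s.customer_id, s.customer_name, s.segment,
           CURRENT_TIMESTAMP(), NULL, TRUE
      FROM stg_customer s
      LEFT JOIN dim_customer d
        ON d.customer_id = s.customer_id AND d.is_current = TRUE
     WHERE d.customer_id IS NULL;   -- no open row: either a new key or one closed in step 1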
ETL Lead | Wynn Las Vegas, NV | Mar 2018 to Sep 2022
Roles Performed:
- Created database objects such as tables, simple views, and sequences per business requirements.
- Hands-on experience with DDL, DML, TCL, and DRL statements, and with aggregate and analytical functions.
- Used global temporary tables (GTT) and external tables to load data into stage tables; created backup tables regularly using CTAS.
- Involved in all phases of the software development life cycle; analyzed existing code and performed impact analysis with the help of senior developers.
- Handled nulls using NVL, NVL2, and COALESCE.
- Created indexes on tables to improve performance by eliminating full table scans.
- Worked on complex queries using CASE statements and string and date functions.
- Involved in source system analysis, mapping analysis, and end-to-end data lineage analysis.
- Established DBT processes to improve performance, scalability, and reliability; migrated legacy transformation code into modular DBT data models.
- Wrote and optimized SQL queries within DBT to enhance data transformation processes; used DBT to convert raw datasets into models that enable efficient analysis and reporting.
- Involved in data profiling and data quality checks.
- Extracted, transformed, and ingested JSON data into Snowflake.
- Used Time Travel and data retention settings for crucial tables, which helped analyze data during testing.
- Worked with the support team on knowledge transfer as part of post-production support; involved in analysis, troubleshooting, and fixing issues.
- Used various materializations to load data into the target DWH system.
- Tuned query performance by clustering tables and creating materialized views; utilized clustering keys, partitioning, and materialized views to optimize data storage and query performance (a minimal tuning sketch follows this role).
- Created Airflow workflows and coordinators to automate daily, weekly, and monthly data pipelines.
- Worked on source-to-target mapping and technical design documents.
- Designed and implemented Snowflake data warehouses from scratch, including schema design, table structures, and optimization for performance and scalability.
- Developed and maintained dimensional models (star schemas, snowflake schemas) to support complex reporting and analytics requirements.
- Built robust ETL/ELT pipelines within Snowflake using SQL, stored procedures, and tasks for seamless data ingestion, transformation, and loading from various sources.
- Conducted performance tuning of Snowflake databases and queries to improve overall system performance and reduce query execution times.
- Leveraged Snowflake's native capabilities such as COPY INTO, INSERT INTO, and MERGE statements for efficient and parallelized data loading.
- Integrated Snowflake with external systems and data sources using Snowflake connectors, APIs, and external functions for seamless data exchange and interoperability.
- Established data pipelines and workflows for real-time and batch data ingestion from sources such as AWS S3, Azure Blob Storage, and on-premises databases.
- Collaborated with cross-functional teams, including data engineers, analysts, and business stakeholders, to understand requirements and deliver data solutions aligned with business objectives.
- Documented data warehouse design, data models, and ETL processes to ensure transparency, maintainability, and knowledge transfer within the team.
- Implemented high availability (HA) and disaster recovery (DR) solutions for Snowflake environments, including multi-cluster configurations, data replication, and failover mechanisms to ensure business continuity and data resilience.
- Implemented custom alerts and notifications for critical events such as resource contention, query failures, and data ingestion issues, enabling timely intervention and resolution.
- Utilized Snowflake's migration tools and utilities for schema migration, data loading, and validation to ensure a smooth transition to the Snowflake platform.
- Implemented continuous integration/continuous deployment (CI/CD) pipelines to streamline the deployment of Snowflake objects and configurations using tools like Jenkins or GitLab CI/CD.
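Illustrative sketch of the clustering and materialized-view tuning pattern mentioned in this role (hypothetical table, column, and view names; in practice, clustering choices were driven by the Query Profiler and actual filter predicates):

    -- Cluster the large fact table on the columns most queries filter on
    ALTER TABLE sales_fact CLUSTER BY (sale_date, region);

    -- Inspect how well micro-partitions line up with the clustering key
    SELECT SYSTEM$CLUSTERING_INFORMATION('sales_fact', '(sale_date, region)');

    -- Pre-aggregate a hot reporting path as a materialized view
    CREATE OR REPLACE MATERIALIZED VIEW mv_daily_sales AS
    SELECT sale_date, region,
           SUM(amount) AS total_amount,
           COUNT(*)    AS txn_count
      FROM sales_fact
     GROUP BY sale_date, region;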
Sr BI Dev Lead / Architect | T-Mobile, Atlanta, GA | Jan 2013 to Feb 2018
Roles Performed:
- Architected and implemented complex data integration solutions using Informatica PowerCenter, orchestrating data movement and transformation across heterogeneous systems and platforms.
- Designed end-to-end ETL architecture frameworks to facilitate seamless data integration, transformation, and loading across diverse data sources and destinations.
- Developed scalable and extensible ETL architectures that accommodate current data requirements while anticipating future growth and evolving business needs.
- Defined data integration patterns (batch, real-time, micro-batch) based on business requirements, data volume, latency, and frequency of updates.
- Led the design and development of ETL workflows, mappings, and sessions to extract, transform, and load large volumes of data from various source systems into the data warehouse.
- Implemented data quality processes and standards to ensure the accuracy, completeness, and consistency of enterprise data assets, reducing data errors.
- Spearheaded performance tuning initiatives to optimize ETL processes, improving data processing times and resource utilization.
- Implemented data validation, cleansing, and enrichment at each stage of the ETL pipeline to ensure data integrity and accuracy.
- Conducted data profiling and analysis to identify data quality issues and performance bottlenecks, implementing corrective measures to enhance data reliability and efficiency.
- Integrated Informatica PowerCenter with external systems and applications using APIs, web services, and custom connectors to facilitate seamless data exchange and interoperability.
- Designed and implemented data archiving and purging strategies to manage data growth and improve database performance, reducing storage costs.
- Participated in project planning, estimation, and risk management activities, ensuring on-time delivery and adherence to project timelines and milestones.
- Worked on source-to-target mapping and technical design documents.
- Used pre- and post-session SQL commands (DDL and DML) to drop and recreate indexes on the data warehouse (a minimal sketch follows this role); involved in migrating code between environments and maintaining it.
- Created deployment groups and labels to migrate Informatica objects to the production repository during deployments.
- Used the version control tools AccuRev and GitHub to deploy scripts into pre-prod and prod environments.
- Worked on different workflow tasks, including sessions, event wait, decision, email, command, worklets, assignment, and workflow scheduling.
- Prepared migration documents to move Informatica objects from lower environments to higher environments.
- Extensively used the Add Currently Processed Flat File Name port to load the flat file name.
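Illustrative sketch of the pre-/post-session SQL pattern mentioned in this role (hypothetical Oracle index and table names; the actual commands were attached to Informatica session properties):

    -- Pre-session SQL: drop the index so the bulk load skips index maintenance
    DROP INDEX idx_sales_fact_dt;

    -- Post-session SQL: rebuild the index once the load completes
    CREATE INDEX idx_sales_fact_dt ON sales_fact (sale_date);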
Technical Developer | VWRI, Pennsylvania | Oct-14-2010 to Dec-20-2012
Roles Performed:
- Involved in requirements gathering from business users, working closely with architects and business analysts to understand system and business requirements.
- Responsible for designing and developing various systems adhering to relational and dimensional modeling standards.
- Worked on source-to-target mapping and technical design documents.
- Extracted data stored in Oracle 11g/10g and flat files and loaded it into Teradata data warehouse models that capture current-state/future-state data elements and data flows, using ERWIN.
- Involved in data transformations between different Informatica folders and repositories along with the DBA team.
- Tuned Oracle performance to achieve inserts/updates in multiple tables through different procedures.
- Responsible for monitoring the production night cycle and providing production support in case of failures.

ETL Developer | Genentech, US | Jan 2007 to Oct 2010
Roles Performed:
- Extensively used mapping variables, mapping parameters, and workflow variables for full and incremental loads.
- Extensively used Informatica transformations such as Router, Filter, Lookup, Joiner, Aggregator, Sorter, Rank, Stored Procedure, Java, and Mapplets, along with all transformation properties.
- Improved performance using Informatica partitions and other performance tuning techniques.
- Developed reusable components in Informatica and UNIX.
- Implemented CDC ETL logic on Oracle and SQL Server using Informatica 7.1.3.

Software Engineer | British Airways, London | Aug-01-2005 to Dec-30-2006
Roles Performed:
- Studied and analyzed the mapping document indicating the source tables, columns, data types, required transformations, business rules to apply, and target tables, columns, and data types.
- Involved in designing and developing the ETL (extract, transform, and load) strategy to populate the data warehouse from various source system feeds using Informatica.
- Wrote PL/SQL procedures, functions, triggers, and packages to process business logic in the database.
- Designed coding specifications and documentation.

ETL Developer | American Skyline Insurance Company, US | Dec-12-2004 to Jul-22-2005
Roles Performed:
- Involved in designing and developing the ETL (extract, transform, and load) strategy to populate the data warehouse from various source system feeds using Informatica.
- Used the Transformation Developer to create Joiner, Filter, Lookup, Expression, and Aggregator transformations used in the mappings.
- Worked with Informatica PowerCenter tools: Source Analyzer, Warehouse Designer, Mapping Designer, Mapplets, and Transformation Developer.
- Extensively used ETL to load data from different databases and flat files into Oracle.
- Tested for data integrity and consistency after loads.

ETL Developer | Copic Insurance, US | Feb-03-2004 to Nov-20-2004
Roles Performed:
- Studied and analyzed the user requirements document.
- Extensively used the Source Analyzer and Warehouse Designer to import the source and target database schemas, and the Mapping Designer to map sources to targets.
- Used the Transformation Developer to create Joiner, Filter, Lookup, Expression, and Aggregator transformations used in the mappings.
- Optimized SQL queries by creating (and dropping) indexes on tables for fast access to records, even in tables containing a large number of rows.
- Loaded data from production to development to generate test data.

TECHNICAL SKILLS:
ETL Tools: Informatica PowerCenter 8.1.1/10.2/10.5, Informatica Analyst, PowerExchange 9.1, Snowflake data loading tools (SnowSQL, Snowpipe)
Data Modeling Tools: Erwin 8.0, MS Visio
Programming Languages: SQL, Oracle PL/SQL, JavaScript, basic UNIX shell scripting, HTML, Java
Databases: Oracle 10g/9i/8, MySQL, MS SQL Server, Netezza, Teradata, DB2, SOQL
BI Tools: Informatica Scheduler
Reporting Tools: Tableau, OBIEE, MS BI, Cognos BI (Framework Manager, Query Studio, Report Studio)
Scheduler Tools: AutoSys, Tidal, Control-M
Protocols: TCP/IP, HTTP, HTTPS
Operating Systems: Linux, Mac OS, Windows 98/2000/NT/XP, UNIX