Ugandher Palamasi - Sr. Data Engineer
[email protected]
Location: Portland, Oregon, USA
Relocation: Yes
Visa: GC EAD
UGANDHAR P
860-787-5080 | [email protected]
DATASTAGE, BIG DATA, DATA WAREHOUSE, CLOUD SERVICES

PROFESSIONAL SUMMARY
- Highly motivated, solutions-driven Senior Developer with over 17 years of experience in integration architecture, design, and development.
- Involved in the complete Software Development Life Cycle (SDLC) of various projects, including requirements gathering, system design, data modeling, Workday Prism, ETL design, development, production enhancements, support, and maintenance.
- Excellent interpersonal and communication skills, with the ability to remain highly focused and to articulate technical problems and solutions in a mixed team environment.
- Good understanding of and hands-on experience with Big Data solutions, including Hadoop, HDFS, Hive, HBase, Spark, and Sqoop.
- Implemented a reusable framework in Spark, Python, and Sqoop to handle dynamic metadata changes and load data into RDBMS databases.
- Designed and implemented a reusable ETL framework for easy maintenance.
- Designed and developed multiple applications in Spark using Python and Scala.
- Good knowledge of and hands-on experience with Jenkins and GitHub.
- Executed multiple end-to-end enterprise data warehousing projects.
- Excellent ETL/BI SDLC experience, including development and unit testing, with a solid understanding of data analysis, data warehouse architectures, and database concepts.
- Excellent knowledge of development across the Big Data Hadoop ecosystem: MapReduce, Impala, Hive, Presto, Oozie workflows, and Sqoop.
- Strong knowledge of big data management using Hadoop, HDFS, Pig Latin, Impala, Hive, PySpark, HBase, Presto, Sqoop, Kafka, Linux, and Tableau.
- Practical exposure to implementing batch and real-time analytics on large data sets using Scala.
- Experience designing and developing Hadoop applications for automated business analytics in retail banking.
- Experience loading large datasets into HDFS and processing them using Hive.
- Wrote PySpark and SQL code to process data from source systems and load it into Hive tables, inferring the schema according to business requirements (a minimal sketch follows this summary).
- Experience manipulating and analyzing large datasets and finding patterns and insights within structured data.
- Exposure to implementing and deploying web applications on AWS EC2 instances.
- Exposure to working with AWS S3 object storage from Spark.
- Exposure to configuring EMR clusters on AWS.
- Designed and developed ETL processes in AWS Glue to migrate campaign data from external sources such as S3, ORC, and Parquet files into AWS Redshift.
- Used AWS Glue for data transformation, validation, and cleansing.
- Good understanding of Spark's RDD and DataFrame APIs.
- Good knowledge of Python scripting and implementation of data validation rules using the Pandas and NumPy libraries.
- Knowledgeable Oracle developer skilled in data collection, analysis, and management.
- Experience with data flow diagrams, data dictionaries, database normalization techniques, and entity-relationship model design.
- Effectively used table functions, table partitioning, collections, analytical functions, materialized views, and indexes (B-tree, bitmap, and function-based).
- Strong exposure to finance product implementations (Moody's, Axiom), including setup for financial products such as swaps, futures, and securities finance.
- Strong knowledge of the OBIEE and Cognos reporting tools, including development and testing of RPD changes such as creating new subject areas, adding measures, and creating alias tables using the Administration Tool.
- Excellent communication, interpersonal, intuitive, technical, analytical, and problem-solving skills.
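A minimal, illustrative PySpark sketch of the source-to-Hive load pattern described above. The application name, landing path, and target table are hypothetical, and schema inference is shown in its simplest form.

```python
# Illustrative sketch only; paths and table names are hypothetical.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("source_to_hive_load")   # hypothetical application name
    .enableHiveSupport()              # needed to write managed Hive tables
    .getOrCreate()
)

# Infer the schema from the source extract instead of hard-coding it,
# so routine metadata changes in the feed do not require code changes.
src_df = (
    spark.read
    .option("header", "true")
    .option("inferSchema", "true")
    .csv("hdfs:///landing/transactions/")   # hypothetical landing path
)

# Apply a light business transformation before persisting; here we simply
# drop rows that are entirely null.
clean_df = src_df.dropna(how="all")

# Persist to the staging Hive table (a plain overwrite for brevity;
# production loads would typically be partitioned and appended).
clean_df.write.mode("overwrite").saveAsTable("staging.transactions")
```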
DOMAIN EXPERTISE: Insurance, Retail, Utility, and Banking

TECHNICAL SUMMARY
Big Data Tools: Hadoop, Spark, Hive, Sqoop, YARN, MapReduce, HBase, Presto
Programming Languages: PySpark, Spark SQL, Python
Databases: Oracle 11g/12c, Teradata, Hive, MySQL Server 2016, DB2, PL/SQL
ETL Tools: DataStage, DMExpress, IBM DataStage v11.5/v9.1/v8.5/v7.5, SSIS, Talend, Matillion, dbt
Operating Systems: UNIX, Linux, Windows
Others: AWS, CA7/Autosys scheduling, database architecture, performance tuning, Workday Prism

EDUCATION
Master's degree in Electrical Engineering

CERTIFICATIONS
IBM Certified Solution Developer - InfoSphere DataStage v11.3
AWS Certified Cloud Practitioner

PROFESSIONAL EXPERIENCE

Client: USAA, Plano, TX                                      Mar 2021 to Present
Lead Data Engineer
Projects: DataMod for Insurance and AML Data Team
Responsibilities:
- Responsible for loading data from legacy databases (Teradata, Oracle, SQL Server) into HDFS as Hive tables using Sqoop jobs.
- Developed and maintained ETL pipelines to extract data from various sources and transform it into usable formats for Workday Prism.
- Created transformations using PySpark on transactional tables such as DPI, ATB, and Teller for COR posting systems.
- Worked extensively with Spark Python libraries to manipulate data using broadcast joins and sort-merge joins.
- Implemented Sqoop for large dataset transfers between Hadoop and RDBMS.
- Experienced in optimizing Hive queries by tuning configuration parameters.
- Created Python scripts for a centralized framework covering all big data applications across the bank.
- Designed and implemented data models to support reporting and analytics needs using Workday Prism.
- Handled code integration using GitHub and deployment to edge nodes using uDeploy.
- Prepared CA7 DOC05 and Agent CA7 job files in PNC standard format to run the scripts periodically.
- Prepared the low-level design document for every transformation I was responsible for building.
- Performed data ingestion into the Hadoop system from legacy source systems such as Teradata, Oracle, and SQL Server, and processed datasets using Hive, Impala, and Python scripts under the CA7 scheduler.
- Ensured Sqoop job workflows and Oozie workflow configurations work as expected.
- Used the Parquet file format to utilize cluster space effectively and retrieve data faster while running jobs; as a columnar format, Parquet compresses well and is efficient for read-heavy operations.
- Peer reviewed and approved code in GitHub for production deployment as defined in change requests.
- Created data comparison scripts and validation rules using Python libraries such as Pandas, NumPy, and NLTK to match source and target data (see the sketch after this section).
- Tested legacy data (Oracle 12c) thoroughly and performed unit, regression, and user acceptance testing.
- Participated in brainstorming sessions with business users to understand requirements thoroughly and perform validations.
- Participated in scrum calls, coordinated with the offshore team, and ensured deliverables were not impacted.
Technical Environment: Spark, Scala, Hive, HBase, Airflow, Oracle, MySQL, Unix, Python, Jenkins, GitHub, Bitbucket.
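A minimal, illustrative Pandas sketch of the kind of source-vs-target comparison described above. The file names and key columns are hypothetical.

```python
# Illustrative sketch only; file names and key columns are hypothetical.
import pandas as pd


def find_unmatched_rows(source_csv: str, target_csv: str, key_cols: list) -> pd.DataFrame:
    """Return rows that appear in only one of the two extracts, keyed on key_cols."""
    src = pd.read_csv(source_csv)
    tgt = pd.read_csv(target_csv)

    # An outer merge with an indicator column flags rows missing on either side.
    merged = src.merge(
        tgt, on=key_cols, how="outer", suffixes=("_src", "_tgt"), indicator=True
    )
    return merged[merged["_merge"] != "both"]


if __name__ == "__main__":
    diffs = find_unmatched_rows(
        "source_extract.csv", "target_extract.csv",
        key_cols=["account_id", "posting_date"],
    )
    print(f"{len(diffs)} rows found in only one of the two extracts")
```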
Client: Portland General Electric, Portland, OR        April 2018 to Jan 2019 & April 2020 to Feb 2021
Lead Data Engineer
Projects: HR Optimization (Workday Implementation), PeopleSoft Support and Development, WebSphere Upgrade
Responsibilities:
- Analyzed the requirements for CCAR regulatory report data issued by the federal government.
- Designed ETL scripts and performed legacy data loads from upstream systems to the financial data warehouse, ensuring the ETL flow was maintained across applications (Oracle 11g/12c and Exalytics) for assets, liabilities, and counterparty information using Moody's financial analytics tool.
- Developed in Scala in a fast-paced agile environment, building features planned as part of a roadmap.
- Stayed up to date with new features and functionality of Workday Prism and incorporated them into existing workflows to improve efficiency and accuracy.
- Designed complex algorithms and wrote concise code that runs at scale to build data-driven features used within the application.
- Participated in the technical design of solutions and in code reviews.
- Built a Spark/Scala framework that dynamically adds columns to the target database when metadata changes at the source system, without manual intervention, for different sources (a minimal sketch of this pattern follows this section).
- Worked with Sqoop to ingest and retrieve data from databases such as Oracle and MySQL.
- Experience in UNIX scripting.
- Experience with Git, Jenkins, CI/CD pipelines, data warehouses, and data lake formations.
- Created Autosys JIL jobs to ensure smooth data loads from source to target.
- Performed ETL validations (unit testing) to ensure that credit risk measures (EAD, RWA) for all financial products are calculated exactly, without any deviations.
- Prepared and executed Oracle procedures/functions and SQL queries to fetch data using Oracle 11g, including Oracle stored procedures where needed.
- Used explain plans, Oracle hints, and new indexes to improve SQL performance; worked on performance tuning.
- Involved in all existing releases and helped the team resolve critical issues.
Technical Environment: Spark, Scala, Hive, Oracle, MySQL, Unix, Python, Jenkins, GitHub, Moody's financial analytics.
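A minimal PySpark sketch of the dynamic-metadata pattern described above; the production framework was written in Spark/Scala, and the landing path and target table here are hypothetical.

```python
# Illustrative sketch only; paths and table names are hypothetical.
# The production framework described above was written in Spark/Scala;
# PySpark is used here for brevity.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.enableHiveSupport().getOrCreate()

# Read today's source extract; its column set may have drifted upstream.
src_df = spark.read.parquet("hdfs:///landing/counterparty/")   # hypothetical path
target_table = "fin_dw.counterparty"                           # hypothetical table

# 1. Widen the target table for any brand-new source columns, so schema
#    drift does not require a manual change request.
target_cols = {f.name.lower() for f in spark.table(target_table).schema.fields}
for field in src_df.schema.fields:
    if field.name.lower() not in target_cols:
        spark.sql(
            f"ALTER TABLE {target_table} "
            f"ADD COLUMNS ({field.name} {field.dataType.simpleString()})"
        )

# 2. Project the source onto the (possibly widened) target layout, padding
#    columns the source does not send with nulls, then append.
src_cols = {c.lower() for c in src_df.columns}
projection = [
    F.col(f.name) if f.name.lower() in src_cols
    else F.lit(None).cast(f.dataType).alias(f.name)
    for f in spark.table(target_table).schema.fields
]
src_df.select(projection).write.insertInto(target_table)
```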
Client: Gap Inc., San Francisco, CA                    September 2015 to March 2018
Data Engineer
Projects: PFI Ship from Store, Item Master, SII Strategic Inventory Initiative
Role: Lead DataStage Developer
Responsibilities:
- Migrated Insight reports (Crystal Reports) to OBIEE using the Oracle Business Intelligence tool and Oracle BI Publisher.
- Understood the client's requirements, business scenarios, and the application from the functional specification document.
- Developed various ad hoc reports per user-interface requirements, such as Table, Chart, Pivot, Narrative, and View Selector reports, using the Answers component for testing purposes.
- Implemented prompts to provide dynamic filter conditions to end users.
- Responsible for creating and maintaining the RPD and creating dimensional hierarchies and level-based measures.
- Responsible for designing customized interactive dashboards in OBIEE using drill-downs, guided navigation, prompts, filters, and variables.
- Responsible for creating RTF templates in BI Publisher and integrating reports with OBIEE dashboards.
- Performed unit testing of the presentation layer and organized information into related subject areas that end users can easily use as a basis for reporting.
- Tested roles by granting users appropriate access to subject areas and reports.
- Good experience customizing the OBIEE dashboard user interface, using CSS styles and image appearance by configuring skins and styles.
- Worked on dashboard customization with radio buttons, hidden prompts, saved analytics requests, and dashboard objects such as text, images, and folders.
Technical Environment: OBIEE 11g/12c, OBI Publisher, Toad, Oracle 11g, DMExpress, Bugzilla, ALM

Client: Portland General Electric, Portland, OR        February 2014 to August 2015
ETL Developer / DataStage Developer
Projects: HR Optimization (Workday Implementation), PeopleSoft Support and Development, WebSphere Upgrade
Responsibilities:
- Analyzed the requirements for CCAR regulatory report data issued by the federal government.
- Designed ETL scripts and performed legacy data loads from upstream systems to the financial data warehouse using DataStage, ensuring the ETL flow was maintained across applications (Oracle 11g/12c and Exalytics) for assets, liabilities, and counterparty information using Moody's financial analytics tool.
- Participated in the technical design of solutions and in code reviews.
- Experience in UNIX scripting.
- Experience with Git, Jenkins, CI/CD pipelines, data warehouses, and data lake formations.
- Created Autosys JIL jobs to ensure smooth data loads from source to target.
- Performed ETL validations (unit testing) to ensure that credit risk measures (EAD, RWA) for all financial products are calculated exactly, without any deviations.
- Prepared and executed Oracle procedures/functions and SQL queries to fetch data using Oracle 11g, including Oracle stored procedures where needed.
- Used explain plans, Oracle hints, and new indexes to improve SQL performance; worked on performance tuning.
- Involved in all existing releases and helped the team resolve critical issues.
Technical Environment: DataStage, Oracle, MySQL, Unix, Python, SVN, Moody's financial analytics.

Client: Gap Inc., San Francisco, CA                    February 2009 to January 2014
Projects: PFI Ship from Store, Item Master, SII Strategic Inventory Initiative
Role: Lead DataStage Developer
Responsibilities:
- Worked with business teams to understand the business process flows and gather requirements at a high level.
- Worked with architects to create the high-level design documents and the system requirements.
- Analyzed the data flow between systems to identify mappings and business rules, and created the data mapping documents.
- Worked on the proposal to migrate the DataStage jobs from v7.5/v8.5 to v9.1 and presented it to the client.
- Developed DataStage interfaces on v9.1 to extract data from DB2, Oracle, and MQs, apply business logic, and load staging tables on Teradata.
- Worked on ESB routing jobs to deliver files to multiple systems based on routing parameters.
- Involved in writing test plans, test scenarios, test cases, and test scripts, and performed unit, integration, system, and user acceptance testing.
- Created UNIX shell scripts and CAWA schedules to trigger the DataStage jobs.
Client: Bank of America, Charlotte, NC                 May 2003 to January 2009
Projects: Conversion of Military Bank, Fleet Bank Merger, Customer Information System, KTC (Know The Customer)
Role: DataStage Developer
Responsibilities:
- Gathered business requirements to create the initial logical models.
- Developed mappings to extract master data from various applications.
- Reverse engineered the existing legacy systems, analyzed the existing code, and created the functional and technical documents.
- Collaborated with client teams to create technical requirement documents.
- Created DataStage jobs to extract data from large DB2 tables, then validate, transform, and load it into target Oracle tables.
- Worked with COBOL, CICS, JCL, and DB2 stored procedures.
- Performed unit testing, integration testing, and stress testing.
- Prepared test plans (system, integration, and performance) and handled defect management.