Nandini Patnaik - Data Analyst/Engineer
[email protected]
Phone: 650-660-8693
Location: San Jose, California, USA
Relocation:
Visa: H4 EAD
Objective

Data Engineer with 6 years of experience in big data using Apache Spark and the Hadoop ecosystem technologies. Looking to work with an esteemed organization that offers an opportunity to apply my analytical skills and help drive better business decisions.

Professional Summary

- Experience in the design and development of big data solutions using Hadoop ecosystem technologies (HDFS, Hive, Apache Spark, AWS).
- Strong database experience with MS SQL Server, PostgreSQL, and NoSQL databases (Cassandra, MongoDB).
- Understanding of containers and container orchestration (Docker/Kubernetes).
- Proficient in all aspects of the software lifecycle, including build/release/deploy (CI/CD), with a specialization in cloud automation.
- Hands-on experience in multi-cloud environments (Azure/AWS/GCP).
- Proficient in administering Azure IaaS/PaaS services such as Azure Virtual Machines, web and worker roles, Key Vault, Azure DevOps, network services, and PowerShell automation.
- Working knowledge of AWS components such as RDS, EMR clusters, CodeBuild, CloudFormation, CloudWatch, CodeWhisperer, Lambda, SNS, SQS, Athena, S3, Data Pipeline, and AWS Glue.
- Applied quantitative analysis to large data sets to support data-driven decisions such as product insights and marketing channel optimization.
- Implemented Spark jobs using Python and Spark SQL for faster ETL/ELT processing.
- Worked with Avro and Parquet file formats to make efficient use of storage in HDFS.

Skills

Data Management: data quality management, data analysis, visualization of data insights, cloud engineering, CI/CD
Programming Languages: Python, SQL, shell scripting (Linux/Bash)
Business Intelligence Tools: Datadog, MS Excel, Tableau
Data Ecosystem: Hadoop 2.7.0, Hive 1.2, Apache Spark 3.2.1, AWS
Databases: Aurora DB, MS SQL Server, PostgreSQL, MongoDB, Cassandra

Work Experience

KPMG, Data Engineer, Sunnyvale, CA (March 2021 - Feb 2024)
Project: Client Data Delivery (Capital One)
Technologies: AWS S3, Airflow, Spark, Hive, Aurora DB, Tableau, Redshift
Responsibilities:
- Developed and implemented data sourcing from a variety of sources and applied Spark optimization techniques to keep the production environment running smoothly.
- Owned the development of a configuration automation tool to accelerate the quality control process for a banking client.
- Reduced the latency of Spark jobs by tuning Spark configurations and applying other performance optimization techniques (illustrated in the sketch after this section).
- Worked extensively on relational database modeling and on writing, creating, and updating multiple complex SQL queries.
- Analyzed ETF data to understand its entities and relationships.
- Gained hands-on experience with SQL databases such as Aurora DB and PostgreSQL.
- Wrote Hive queries to extract the data to be processed in ETL jobs.
- Created a Tableau dashboard to present pipeline metrics and provided baseline performance analysis for real-time BI reporting.
- Worked with GitHub, creating Git tags and deploying them to all environments.
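For illustration, below is a minimal PySpark sketch of the kind of Spark SQL ETL and configuration tuning described in this role. The bucket paths, table and column names, and config values are hypothetical, and the Avro reader assumes the spark-avro package is on the classpath; this is a sketch of the approach, not the production code.

    from pyspark.sql import SparkSession

    # Hypothetical tuning values; real settings were workload-specific.
    spark = (
        SparkSession.builder
        .appName("client-data-delivery-etl")
        .config("spark.sql.shuffle.partitions", "400")   # match shuffle width to data volume
        .config("spark.sql.adaptive.enabled", "true")    # let AQE coalesce small partitions
        .getOrCreate()
    )

    # Read raw Avro input (requires the spark-avro package) and stage it
    # as a temp view for Spark SQL.
    raw = spark.read.format("avro").load("s3a://example-bucket/raw/transactions/")
    raw.createOrReplaceTempView("transactions")

    # Spark SQL transformation step of the ETL.
    daily = spark.sql("""
        SELECT account_id, to_date(event_ts) AS event_date, SUM(amount) AS total
        FROM transactions
        GROUP BY account_id, to_date(event_ts)
    """)

    # Write the result as Parquet, partitioned by date for efficient downstream reads.
    (daily.write
          .mode("overwrite")
          .partitionBy("event_date")
          .parquet("s3a://example-bucket/curated/daily_totals/"))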
Amazon, Data Analyst, Bangalore, India (Jan 2016 - 2019)
Project: Catalog Data Flow Setup
Technologies: S3, Spark, Hive, Aurora DB, EMR, AWS
Responsibilities:
- Worked on a prime project to automate the flow for setting up logics and analyzing their reports to determine the percentage success rate; also handled documentation for launch analysis.
- Designed and created Hive external tables using a shared metastore with partitioning, dynamic partitioning, and bucketing, storing the data in HDFS (illustrated in the sketch below).
- Wrote SQL queries to provide baseline performance analysis on the data pipelines.
- Created a post-launch analysis using PySpark DataFrames.

Education

B.Tech, Parala Maharaja Govt. Engineering College (2011 - 2015)
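For illustration, a minimal PySpark sketch of the partitioned, bucketed Hive external table and DataFrame-based post-launch analysis described in the Amazon role above. The table, column names, path, and bucket count are hypothetical; it assumes a Spark session with Hive support against a shared metastore.

    from pyspark.sql import SparkSession, functions as F

    # Assumes a reachable Hive metastore; all names below are hypothetical.
    spark = (SparkSession.builder
             .appName("catalog-data-flow")
             .enableHiveSupport()
             .getOrCreate())

    # External Hive table over HDFS with partitioning and bucketing,
    # registered in the shared metastore.
    spark.sql("""
        CREATE EXTERNAL TABLE IF NOT EXISTS launch_events (
            item_id STRING,
            rule_id STRING,
            passed  BOOLEAN
        )
        PARTITIONED BY (launch_date STRING)
        CLUSTERED BY (item_id) INTO 16 BUCKETS
        STORED AS PARQUET
        LOCATION 'hdfs:///data/catalog/launch_events'
    """)

    # Allow dynamic partition inserts (Hive's default requires a static partition).
    spark.sql("SET hive.exec.dynamic.partition.mode=nonstrict")

    # Post-launch analysis with the DataFrame API: success rate per launch date.
    events = spark.table("launch_events")
    success_rate = (events.groupBy("launch_date")
                          .agg((F.avg(F.col("passed").cast("int")) * 100)
                               .alias("pct_success")))
    success_rate.show()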