Need ($60/hr on C2C) Azure Databricks (Onsite) | Auburn Hills, MI
Email: [email protected] |
From: Suresh Babu, Teamware Solutions [email protected]
Reply to: [email protected]
Position: Azure Databricks (Onsite)
Location: Auburn Hills, MI

Technical/Functional Skills:
- Develop a deep understanding of the data sources, implement data standards, and maintain data quality and master data management.
- Expert in building Databricks notebooks that extract data from source systems such as DB2 and Teradata and perform data cleansing, data wrangling, and ETL processing before loading to Azure SQL DB.
- Expert in building ephemeral notebooks in Databricks (wrapper, driver, and config) for processing data and back-feeding it to DB2 using a multiprocessing thread pool.
- Expert in developing JSON scripts for deploying data-processing pipelines in Azure Data Factory (ADF).
- Expert in using Databricks with Azure Data Factory (ADF) to compute large volumes of data.
- Performed ETL operations in Azure Databricks by connecting to different relational database source systems using JDBC connectors.
- Developed Python scripts for file validations in Databricks and automated the process using ADF.
- Analyzed SQL scripts and redesigned them in PySpark SQL for faster performance.
- Worked on reading and writing multiple data formats, such as JSON, Parquet, and Delta, from various sources using PySpark.
- Developed an automated process in Azure that ingests data daily from a web service and loads it into Azure SQL DB.
- Expert in optimizing PySpark jobs to run on different clusters for faster data processing.
- Developed Spark applications in Python (PySpark) on a distributed environment to load large numbers of CSV files with differing schemas into PySpark DataFrames, process them, and reload them into Azure SQL DB tables.
- Analyzed data where it lives by mounting Azure Data Lake and Blob Storage to Databricks.
- Used Logic Apps to take decision-based actions in workflows and developed custom alerts using Azure Data Factory, SQL DB, and Logic Apps.
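The back-feed requirement above (pushing processed data to DB2 from a thread pool) can be sketched with Python's standard `concurrent.futures`. This is a minimal illustration only: `write_batch` is a hypothetical stub standing in for the real JDBC write to DB2, and the batch size and worker count are assumptions, not values from the posting.

```python
from concurrent.futures import ThreadPoolExecutor


def write_batch(batch):
    # Stub: in a real pipeline this would write the rows to DB2
    # (e.g. via a JDBC or ibm_db connection). Here it just reports
    # how many rows it "wrote".
    return len(batch)


def backfeed(rows, batch_size=1000, workers=4):
    # Split the rows into fixed-size batches and write them
    # concurrently on a thread pool; return total rows written.
    batches = [rows[i:i + batch_size] for i in range(0, len(rows), batch_size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(write_batch, batches))
```

For example, `backfeed(list(range(2500)), batch_size=1000, workers=2)` splits the input into batches of 1000, 1000, and 500 and returns 2500. Threads (rather than processes) fit this pattern because a database write is I/O-bound, so the GIL is not the bottleneck.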
- Developed Databricks ETL pipelines using notebooks, Spark DataFrames, Spark SQL, and Python scripting.
- Developed Spark applications using PySpark and Spark SQL for data extraction, transformation, and aggregation from multiple file formats, analyzing and transforming the data to uncover insights into customer usage patterns.
- Good knowledge of and exposure to the Spark architecture, including Spark Core, Spark SQL, DataFrames, Spark Streaming, driver and worker nodes, stages, executors, and tasks.
- Involved in performance tuning of Spark applications: setting the right batch interval, choosing the correct level of parallelism, and tuning memory.
- Expert in understanding the current production state of an application and determining the impact of a new implementation on existing business processes.
- Involved in migration of data from on-prem servers to cloud databases (Azure Synapse Analytics (DW) and Azure SQL DB).
- Hands-on experience setting up Azure infrastructure (storage accounts, integration runtimes, service principal IDs, and app registrations) to enable scalable, optimized support for business users' analytical requirements in Azure.
- Expert in ingesting streaming data with Databricks Delta tables and Delta Lake to enable ACID transaction logging.
- Expert in building a Delta Lake on top of a data lake and performing transformations in Delta Lake.
- Expert in implementing a distributed stream-processing platform with low latency and seamless integration with data and analytics services inside and outside Azure, to build a complete big-data pipeline.
- Expert in Delta Lake performance tuning (optimize, rollback, cloning, time travel).
- Developed complex SQL queries using stored procedures, common table expressions (CTEs), and temporary tables to support Power BI reports.
- Development experience in Microsoft Azure providing data movement and scheduling functionality to cloud-based technologies such as Azure Blob Storage and Azure SQL Database.
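The CTE-backed reporting queries mentioned above can be illustrated with a minimal, self-contained sketch. The table name (`usage`), columns, and data are hypothetical; Python's built-in sqlite3 stands in for the Azure SQL DB the posting describes, since the CTE syntax is the same shape.

```python
import sqlite3

# In-memory database with a small, made-up usage table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE usage (user_id INTEGER, events INTEGER)")
conn.executemany("INSERT INTO usage VALUES (?, ?)", [(1, 5), (1, 7), (2, 3)])

# A CTE that pre-aggregates per-user totals, the kind of shaped
# result set a Power BI report would consume.
query = """
WITH totals AS (
    SELECT user_id, SUM(events) AS total_events
    FROM usage
    GROUP BY user_id
)
SELECT user_id, total_events FROM totals ORDER BY user_id
"""
rows = conn.execute(query).fetchall()  # [(1, 12), (2, 3)]
```

A CTE keeps the aggregation step named and readable inside one statement; on SQL Server / Azure SQL DB the same pattern is often wrapped in a stored procedure so the report binds to a stable interface rather than raw tables.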
- Independently manage development of ETL processes, from development through delivery.

Email is the best way to reach me.

Thanks & Regards,
Suresh Babu | Talent Acquisition
Teamware Solutions Inc | 8951 Cypress Waters Blvd, Suite #1092, Dallas, Texas 75019
Email: [email protected]
LinkedIn: https://www.linkedin.com/in/lorvensureshbabu/
[email protected] View all |
Thu Sep 07 21:45:00 UTC 2023