Pavan - Data Engineer |
[email protected] |
Location: Chicago, Illinois, USA |
Relocation: Yes |
Visa: H1B |
PROFESSIONAL SUMMARY
Dynamic and motivated IT professional with 7+ years of experience as a Big Data Engineer, with expertise in designing data-intensive applications using Cloud Data Engineering, Data Warehouse, Hadoop Ecosystem, Big Data Analytics, Data Visualization, Reporting, and Data Quality solutions. Hands-on experience across the Hadoop ecosystem and related Big Data technologies, including HDFS, MapReduce, YARN, Apache Cassandra, NoSQL, Spark, Python, Scala, Sqoop, HBase, Hive, Oozie, Impala, Pig, ZooKeeper, and Flume. Built real-time data pipelines by developing Kafka producers and Spark Streaming applications to consume the data. Utilized Flume to collect log files and write them into HDFS. Experienced in using Spark to improve the performance of existing Hadoop algorithms with SparkContext, Spark SQL, the DataFrame API, Spark Streaming, and pair RDDs, working extensively with PySpark. Developed a framework for converting existing PowerCenter mappings to PySpark (Python and Spark) jobs. Hands-on experience in setting up workflows using Apache Airflow and the Oozie workflow engine to manage and schedule Hadoop jobs. Keywords: information technology |