JENNIFER - DATA ENGINEER |
[email protected] |
Location: Bear, Delaware, USA |
Relocation: |
Visa: H1B |
SUMMARY:
Data Engineering and SQL/ETL Developer with 11+ years of experience designing, implementing, and optimizing advanced data solutions that enhance business operations across the financial, retail, and telecom sectors.
Extensive experience in designing, deploying, and managing Apache Kafka-based messaging systems; adept at real-time data streaming, distributed systems, and building high-throughput, low-latency data pipelines.
Proven ability to integrate Kafka with enterprise-level applications to enhance data flow and operational efficiency.
Developed and maintained scalable web applications using Python and Flask.
Developed and implemented ETL pipelines using Snowflake and Apache Airflow to ingest data from various sources, reducing data processing time (a minimal sketch follows this list).
Built and deployed microservices in Docker and Kubernetes, streamlining deployment pipelines and enhancing system reliability.
Experienced in creating complex T-SQL queries for data analysis and data manipulation.
Organize analytical data, ensure data quality through data profiling, aggregate data for strategic reporting, and optimize performance of database solutions.
Designed and built a data warehouse using Azure Synapse Analytics and Azure Data Factory for various use cases.
Strong ETL/ELT expertise with a proven history of designing and implementing complex ETL processes for large-scale data integration and transformation.
Expert in creating Power BI reports, dashboards, and Power BI Apps using Power BI Desktop and Power BI Service.
Participate in delivery of a broad range of data, analytics, visualization, and data engineering work.
Designed and maintained scalable ETL pipelines using Apache Spark, Python, and SQL.
Conducted performance tuning for SQL and NoSQL databases, cutting query response times.
Implemented workflows with AWS Glue and Amazon Redshift, optimizing data transformation and reducing processing time.
Skilled in designing efficient data models and database structures to support data warehousing and analytics needs.
Solid experience in database design and development, building efficient DDL objects and crafting DML SQL statements to streamline data management and ensure accuracy.
Strong background in Transact-SQL (T-SQL): developed and optimized stored procedures, functions, triggers, and jobs, automating key processes and improving database performance over time.
Experienced with ETL processes using SSIS and similar tools to move, transform, and load data efficiently, ensuring it is ready for analysis and use across different systems.
Focused on SQL performance tuning: optimizing queries, refining indexing strategies, and fine-tuning databases to boost data retrieval speed and efficiency.
Supported SSRS reports, resolving issues and ensuring reports are accurate, responsive, and meet the specific needs of business users.
Proficient in cloud-based data solutions, utilizing AWS services to build scalable and cost-effective data pipelines.
Expertise in automating data workflows using tools like Apache Airflow for scheduling and orchestrating data processes.
Proven ability to manage end-to-end data projects, from requirements gathering to deployment and maintenance.
Knowledgeable about data privacy regulations and best practices, ensuring data security and compliance.
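As a hedged illustration of the Airflow-to-Snowflake ingestion pattern referenced above, the minimal sketch below defines a daily DAG with two tasks; the DAG id, connection id, stage, table, and stored-procedure names are hypothetical placeholders rather than details from any specific engagement.

    from datetime import datetime

    from airflow import DAG
    from airflow.providers.snowflake.operators.snowflake import SnowflakeOperator

    # All object names below (stage, tables, procedure) are illustrative only.
    with DAG(
        dag_id="daily_sales_ingest",
        start_date=datetime(2024, 1, 1),
        schedule_interval="@daily",
        catchup=False,
    ) as dag:
        # Land raw files from an external stage into a raw table.
        copy_raw = SnowflakeOperator(
            task_id="copy_into_raw_sales",
            snowflake_conn_id="snowflake_default",
            sql="""
                COPY INTO raw.sales
                FROM @raw.sales_stage
                FILE_FORMAT = (TYPE = CSV SKIP_HEADER = 1);
            """,
        )

        # Fold the newly landed rows into the reporting table via a stored procedure.
        transform = SnowflakeOperator(
            task_id="merge_into_analytics_sales",
            snowflake_conn_id="snowflake_default",
            sql="CALL analytics.sp_merge_sales();",
        )

        copy_raw >> transform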
Profound experience in feature design and in the flexible use of tabular charts, histograms, and maps to express business solutions comprehensively with visualization tools including Power BI and Excel.
Proficient in developing predictive models, performing advanced statistical analysis, and creating interactive visualizations.
Skilled in interpreting complex datasets, developing predictive models, and delivering actionable insights for strategic decision-making.
Identified and addressed data inconsistencies using various Python libraries for preprocessing, improving data quality.
Expertise in data collection, processing, exploratory data analysis, machine learning, and predictive modeling.
Experience in using Jira for project management, issue tracking, and agile development.
Proficient in documenting project specifications, code documentation, and knowledge sharing using Confluence.
Expertise in GitOps methodologies, leveraging tools such as GitLab to streamline CI/CD processes and enhance deployment efficiency.
Experience working with Agile development methodologies, including Scrum and Kanban.
Enthusiastic learner, consistently staying updated with the latest data engineering trends and technologies.
Experience in the entire software development lifecycle, including requirements gathering, design, coding, testing, debugging, and maintenance.
Experience integrating with multiple RESTful web services and parsing XML and JSON responses.
Good knowledge of software configuration management, version control, and defect tracking tools.
Experience in the Test-Driven Development process of software development.
Proficient in handling multiple projects simultaneously.
Experience in bug fixing, enhancements, and providing 24/7 production support.

TECHNICAL EXPERTISE:
Languages: Python, SQL, and PL/SQL
Frameworks & Tools: NumPy, Pandas, Matplotlib, Seaborn, scikit-learn, Flask, Apache Airflow, Apache Kafka, Snowflake, Apache Spark, Docker, and MS Excel
Data Visualization and Reporting: SQL Server Integration Services (SSIS), SQL Server Reporting Services (SSRS), SQL Server Analysis Services (SSAS), Power BI, Tableau
Cloud Platforms: AWS S3, Redshift, Snowflake, and Azure Data Factory
Version Control and Tools: JIRA, Git, GitLab, Jenkins, Bitbucket, and SourceTree
Web Technologies: HTML5, CSS, JavaScript, Java, C#
Web Services: REST and SOAP
Databases: MS SQL Server 2019/2017, MS Access, Oracle

EDUCATION:
Master of Engineering in Information Technology

CERTIFICATION:
AWS Certified Solutions Architect - Associate
Professional Scrum Product Owner
Professional Scrum Master

PROFESSIONAL EXPERIENCE:

Verizon, NJ
Senior Data Engineer, August 2020 - Present
B360 App: B360 is a retail app that supports adding a line or multiple lines for a customer, adding a new plan for new or existing customers, creating quotes for new customers, and similar flows. For each plan there are different contract types, such as 1-, 2-, and 3-year contracts, 30-month DP, and 24-month DP.
Scanner App: The Scanner app is Verizon's comprehensive inventory management application. It offers a range of features to streamline inventory processes, including:
Physical Inventory: Conducting physical inventory checks to ensure accuracy and update stock levels.
Pending Transfers: Managing pending transfers of goods between locations or departments within the organization.
Store-to-Store/Store-from-Store Transfers: Facilitating transfers of goods between different stores or from a central location to individual stores.
Goods Receipts: Recording and managing goods receipts, ensuring that inventory levels are updated accurately upon receipt of new stock.

Responsibilities:
Gathered and documented project requirements from stakeholders, translated them into functional and non-functional specifications for BI reports and applications, and turned them into deliverable solutions.
Developed and maintained real-time streaming applications using Kafka Streams and Connectors (a streaming sketch follows this list).
Integrated Kafka with microservices and external databases such as Cassandra and Elasticsearch, ensuring seamless data flow.
Integrated data from on-premises SQL Server to Azure Data Lake Storage, improving data accessibility and reducing processing time.
Used GitHub Actions to help automate testing and deployment, ensuring smooth integration and continuous delivery in production environments.
Developed CI/CD pipelines using Azure DevOps for automated deployment of Azure Data Factory workflows, reducing deployment time.
Designed and maintained a Docker-based microservices architecture, enabling easy scaling and deployment of applications.
Designed RESTful APIs for client-facing and internal services in Python and Flask, reducing latency through optimized endpoint structures.
Optimized Kafka broker configurations, ensuring fault tolerance, scalability, and data retention.
Responsible for extracting data from internal systems and external sources using T-SQL and PL/SQL, implementing business requirements through stored procedures, user-defined functions (UDFs), and views.
Developed and implemented ETL pipelines using Snowflake and Apache Airflow to ingest data from various sources, reducing data processing time.
Built and deployed microservices in Docker and Kubernetes, streamlining deployment pipelines and enhancing system reliability.
Designed and developed ETL packages for full and incremental loads to convert and migrate data, transforming data using lookups, merge joins, and derived columns, and built the target data warehouse and data marts based on a star schema.
Developed and optimized ETL pipelines using Python to process and transform large datasets, improving data ingestion performance.
Developed Power BI reports and visualizations to present complex data analysis results in an easy-to-understand format.
Developed scripts for extracting, transforming, and loading (ETL) mass data from diverse sources and formats.
Conducted data extraction, transformation, and loading using SQL, and developed complex queries for retail data analysis.
Developed and deployed database scripts for data quality and stability enhancements.
Developed reporting solutions in Power BI for different stakeholders, from mock-up demos through deployment, to analyze dimensions such as claims, transactions, appointments, supply, assets, and properties.
Cleaned and validated data in the staging area, corrected and unified formats, handled missing values, transformed data, and split or combined columns based on specifications.
Created data visualizations and reports using Power BI to inform and support management decisions.
Troubleshot and corrected production ETL issues and provided ongoing support.
Worked in an Agile environment, collaborating with cross-functional teams for project delivery.
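Kafka Streams itself is a JVM library; as a hedged sketch of the consume-transform-produce pattern behind the streaming work above, the Python snippet below uses the confluent-kafka client instead. The broker address, topic names, and the enrichment step are assumptions for illustration only.

    import json

    from confluent_kafka import Consumer, Producer

    # Broker address and topic names are placeholders, not real cluster details.
    consumer = Consumer({
        "bootstrap.servers": "localhost:9092",
        "group.id": "order-enricher",
        "auto.offset.reset": "earliest",
    })
    producer = Producer({"bootstrap.servers": "localhost:9092"})
    consumer.subscribe(["orders.raw"])

    try:
        while True:
            msg = consumer.poll(1.0)  # wait up to one second for a record
            if msg is None or msg.error():
                continue
            order = json.loads(msg.value())
            order["total"] = order["quantity"] * order["unit_price"]  # simple enrichment
            producer.produce("orders.enriched", json.dumps(order).encode("utf-8"))
            producer.poll(0)  # serve delivery callbacks
    finally:
        consumer.close()
        producer.flush()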
AMERICAN WATER, Camden, NJ
Data Engineer, July 2019 - September 2020
HqO: Facilities management application for American Water employees. It includes features such as Indoor Navigation, Submit an Idea, Dining, Evacuation, Parking Pass Registration, and Employee Details.

Responsibilities:
Optimized data warehouse processes using SQL and Python to import, retrieve, and analyze data.
Automated the deployment process with Docker Compose and Docker Swarm, reducing deployment times.
Led Kafka upgrades and ensured ZooKeeper's high availability and data integrity.
Migrated on-premises data to the cloud using Azure Data Factory and Azure Blob Storage, ensuring minimal downtime during the migration process (a staging sketch follows this list).
Created Python scripts for data analysis and automation tasks, improving team productivity.
Responsible for development and maintenance of the data warehouse and complex reports.
Led the development and maintenance of the data warehouse and reporting systems, enhancing data availability for strategic decision-making.
Contributed to the design and implementation of data warehousing solutions for performance analysis, bolstering analytical functions and strategic planning.
Utilized PowerShell scripts to automate deployment processes, significantly reducing deployment times across environments.
Created robust datasets and stored procedures using T-SQL and PL/SQL, streamlining data processing and reporting workflows.
Enhanced report efficiency by tuning embedded datasets and replacing queries with more efficient stored procedures.
Collaborated with quality control teams to develop test plans and cases.
Created visualizations and reports using Tableau to support management decisions.
Leveraged Git for version control and project coordination, enhancing team collaboration and project management.
Developed and maintained database objects such as tables, indexed views, indexes, constraints, triggers, and procedures, enhancing database functionality and integrity.
Wrote calculated columns and measures in Power BI Desktop to support sound data analysis.
Responsible for the design and development of several RPA processes using Automation Anywhere, which was switched to Power BI toward the end of my tenure.
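Azure Data Factory pipelines are authored in the ADF service itself, so only the Blob Storage staging step from the migration item above is sketched here, using the azure-storage-blob SDK; the connection string, container, and file paths are hypothetical.

    import os

    from azure.storage.blob import BlobServiceClient

    # Connection string, container, and paths are illustrative placeholders.
    service = BlobServiceClient.from_connection_string(
        os.environ["AZURE_STORAGE_CONNECTION_STRING"]
    )
    container = service.get_container_client("onprem-staging")

    # Upload a nightly extract produced from the on-premises SQL Server.
    with open("extracts/customers_2020-08-01.csv", "rb") as data:
        container.upload_blob(name="customers/2020-08-01.csv", data=data, overwrite=True)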
Ucrya, Lake Mary, FL
Software Developer, September 2016 - June 2019
FinApp: A finance and administration solution that streamlines complex finance and admin processes with automated workflows. It ensures full legal compliance, audit capabilities, and maximum efficiency of finance and admin processes by seamlessly integrating existing systems for financial reporting, contract management, and purchase orders into automated workflows.

Responsibilities:
Automated testing processes using PyTest and Jenkins, improving test coverage and reducing bugs in production.
Performed data analysis between multiple databases and the data warehouse to create mapping documents.
Developed tabular queries for efficient report analysis using PIVOT/UNPIVOT in T-SQL.
Created Python scripts for data analysis and automation tasks, improving team productivity.
Gathered requirements and coordinated with different end users to ensure satisfactory data content, quality, data integrity, and receipt of desired information.
Performed database creation (DDL, DML, DCL), database tuning, SQL tuning, and performance planning.
Participated in project planning sessions with project managers, business analysts, and team members to analyze business requirements and outline the proposed solution.
Manipulated, processed, and extracted value from large, disconnected datasets.
Created datasets and stored procedures in T-SQL for SSRS, and developed Excel Power View, Power Query, and PowerPivot dashboards for data analysis.
Created complex stored procedures to perform index maintenance, data profiling, metadata search, and critical staging loads for the data marts that feed various reports.
Gathered and utilized multiple data sources, defined metrics, and prepared tabular semantic data models for the reporting data model.
Used SQL Server to schedule ETL package executions and troubleshoot job failures.

PSG GROUP OF INSTITUTIONS (Delaware & India)
Research Engineer / Data Analyst, January 2013 - March 2016
Clinical System

Responsibilities:
Created comprehensive databases and schema objects, including tables, indexes, and constraints, ensuring efficient data storage and retrieval to support application requirements.
Worked extensively on data conversion and cleansing processes, transforming raw data into structured formats and ensuring high data quality and consistency for downstream applications.
Kept data in sync across multiple regions, ensuring consistent and up-to-date data for enhanced availability and reliability.
The project aimed at migrating the legacy system to SQL Server; the application loads prescriber-related data from files into the target tables.
Gathered functional requirements to build an enterprise-level data warehouse for Medical Agents.
Analyzed the data and prepared mapping documents according to SSIS development standards.
Reviewed the mapping documents.
Designed and developed SSIS packages according to the requirements.
Drafted design and technical documents for the project components.
Unit tested the developed SSIS packages.

Data Analyst, January 2011 - December 2012
Agent Based Recommender

Responsibilities:
Planned, designed, and documented comprehensive database architectures based on detailed business requirements, ensuring scalability, reliability, and efficient data storage.
Optimized SQL scripts, stored procedures, and triggers by employing advanced performance tuning techniques, resulting in significantly improved database response times and system efficiency (an index-maintenance sketch follows this list).
Monitored SQL Server performance meticulously using SQL Server Profiler and the Index Tuning Wizard, identifying bottlenecks and implementing enhancements to maintain optimal database performance.
Designed, developed, and deployed a variety of reports using SQL Server.
Skilled in Agile methodologies, with extensive hands-on experience in managing Scrum stories and sprints, facilitating seamless collaboration and efficient project delivery in Python-based development environments.
Implemented SQL database log shipping and transactional replication.
Worked extensively with a variety of file sources such as flat files, Excel files, and XML files, transforming and loading data into databases for further processing and analysis.
Performed advanced VB and C# scripting within SSIS Script Components to enhance ETL processes, adding custom functionality and error-handling mechanisms.
Involved in comprehensive performance tuning and optimization of SQL procedures and queries, significantly improving database efficiency and reducing query execution times.
Generated ad-hoc reports using SSRS and Crystal Reports, delivering tailored insights and data visualizations to meet specific business requirements and decision-making needs.
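As a hedged sketch of the index-maintenance and tuning work referenced above, the Python snippet below checks fragmentation through SQL Server's physical-stats DMV and rebuilds heavily fragmented indexes via pyodbc; in practice this logic typically lived in T-SQL stored procedures and scheduled jobs, and the connection string and 30% threshold here are assumptions.

    import pyodbc

    # Connection string and the 30% rebuild threshold are illustrative assumptions.
    conn = pyodbc.connect(
        "DRIVER={ODBC Driver 17 for SQL Server};"
        "SERVER=localhost;DATABASE=SalesDW;Trusted_Connection=yes;"
    )
    cursor = conn.cursor()

    # Find indexes whose fragmentation exceeds 30% using the physical-stats DMV.
    cursor.execute("""
        SELECT s.name, o.name, i.name, ps.avg_fragmentation_in_percent
        FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'LIMITED') ps
        JOIN sys.indexes i ON ps.object_id = i.object_id AND ps.index_id = i.index_id
        JOIN sys.objects o ON o.object_id = i.object_id
        JOIN sys.schemas s ON s.schema_id = o.schema_id
        WHERE ps.avg_fragmentation_in_percent > 30 AND i.name IS NOT NULL
    """)

    for schema_name, table_name, index_name, frag in cursor.fetchall():
        print(f"Rebuilding {index_name} on {schema_name}.{table_name} ({frag:.1f}% fragmented)")
        cursor.execute(
            f"ALTER INDEX [{index_name}] ON [{schema_name}].[{table_name}] REBUILD;"
        )
        conn.commit()

    conn.close()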