Diwesh Ojha
Data Analytics Engineer, Business Intelligence
Buffalo, New York, USA | Relocation: Open | Visa: H-1B
Contact: diweshojha877@gmail.com | 980-485-4544*104 | https://www.linkedin.com/in/ojhadiwesh/

Professional Summary
Senior Analytics Engineer with over 12 years of experience driving data-powered initiatives for global organizations.
- Developed and optimized data pipelines using Apache Airflow, Fivetran, and dbt, ensuring efficient ETL processes and high data integrity across multiple business functions (a minimal orchestration sketch follows this list).
- Built scalable data architectures leveraging Snowflake, Databricks, and BigQuery to support real-time analytics and enterprise-wide reporting.
- Designed and implemented machine learning models using Python (Scikit-learn, TensorFlow, XGBoost) and SQL, enabling predictive analytics for business insights.
- Created interactive dashboards and reports in Tableau, Power BI, and Looker, automating business intelligence workflows and executive reporting.
- Engineered cloud-based data solutions using Google Cloud Platform (BigQuery, Cloud Functions, Cloud Storage, Pub/Sub) to handle large-scale data processing.
- Optimized SQL queries and data transformations for high-performance analytics on Snowflake, BigQuery, and PostgreSQL, improving query efficiency and reducing costs.
- Developed data APIs and integrations using Python (Flask, FastAPI) and RESTful services to enable seamless data exchange between applications.
- Implemented A/B testing frameworks and experiment tracking with MLflow, SciPy, and Statsmodels, driving data-driven decision-making for product improvements.
- Automated data workflows and infrastructure deployment using Terraform, Docker, and Kubernetes, ensuring reliability and scalability of cloud environments.
- Integrated CI/CD pipelines for data engineering using GitHub Actions, dbt Cloud, and Jenkins, streamlining development and deployment of analytics solutions.
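Illustrative sketch: the Airflow, Fivetran, and dbt pipeline work summarized above usually takes the shape of a small orchestration DAG; the DAG id, schedule, and dbt project path below are hypothetical placeholders, not values from any of the roles that follow.

    from datetime import datetime

    from airflow import DAG
    from airflow.operators.bash import BashOperator

    # Hypothetical daily ELT DAG: raw data is assumed to be landed by Fivetran,
    # so the DAG only runs dbt transformations and then dbt tests.
    with DAG(
        dag_id="daily_elt",                  # placeholder name
        start_date=datetime(2024, 1, 1),
        schedule_interval="@daily",
        catchup=False,
    ) as dag:
        dbt_run = BashOperator(
            task_id="dbt_run",
            bash_command="dbt run --project-dir /opt/dbt/analytics",   # placeholder path
        )
        dbt_test = BashOperator(
            task_id="dbt_test",
            bash_command="dbt test --project-dir /opt/dbt/analytics",  # placeholder path
        )
        dbt_run >> dbt_test   # run transformations before data tests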
Technical Skills
- Programming & Scripting: Python (Pandas, NumPy, Scikit-learn, TensorFlow, PySpark), SQL (PostgreSQL, MySQL, Snowflake SQL, BigQuery SQL), R (dplyr, ggplot2, tidymodels), Shell Scripting (Bash)
- Data Engineering & ETL: Apache Airflow, dbt (Data Build Tool), Fivetran, Apache Spark (PySpark, Databricks), Google Cloud Dataflow, Kafka (streaming data)
- Databases & Data Warehousing: Snowflake, Google BigQuery, PostgreSQL, MySQL, MongoDB
- Business Intelligence & Visualization: Tableau, Power BI, Looker, Metabase, Google Data Studio
- Cloud & DevOps: Google Cloud Platform (BigQuery, Cloud Functions, Pub/Sub, Cloud Run), AWS (S3, Redshift, Lambda, Glue), Terraform, Docker & Kubernetes, CI/CD (GitHub Actions, Jenkins, dbt Cloud)
- Machine Learning & Analytics: Scikit-learn, XGBoost, TensorFlow, MLflow (experiment tracking), A/B testing (SciPy, Statsmodels), predictive analytics & time series forecasting
- Software Development & APIs: FastAPI, Flask, RESTful APIs & GraphQL, Git (GitHub, GitLab), Agile & Scrum methodologies

Professional Experience

Senior Analytics Engineer | ACV Auctions (Buffalo, NY) | Dec 2020 - Present
Partnered with cross-functional teams, including engineering, sales, and marketing, to design and implement advanced analytics solutions that drive business growth. Leveraged cloud platforms and modern data tools to solve complex business challenges and deliver actionable insights to key stakeholders.
Sales and Marketing Analytics:
  o Designed predictive analytics models in Snowflake and BigQuery, identifying at-risk dealers based on multivariate data patterns, including historical sales, engagement metrics, and customer feedback.
  o Orchestrated multi-team efforts to implement customer retention campaigns, resulting in the reactivation of 500+ dealers and a 20% increase in engagement rates.
  o Built dynamic Tableau dashboards integrating real-time data from Databricks Delta Lake, providing sales teams with actionable insights for targeting underperforming regions.
  o Devised machine learning algorithms for customer segmentation, enabling personalized marketing strategies for inactive dealers.
Operational Efficiency for Auto Transport:
  o Analyzed delivery data in BigQuery ML, identifying inefficiencies in transport workflows and proposing optimization strategies that reduced delivery cycle times by ~30%.
  o Designed a Databricks-based geospatial analysis model to optimize vehicle routing, cutting operational costs by 15%.
  o Automated end-to-end data pipelines using Apache Airflow on GCP, ensuring seamless integration and data reliability across Snowflake, BigQuery, and Tableau systems.
Comprehensive Product Launch Support:
  o Collaborated with product managers to launch two flagship products, managing requirements gathering, feasibility studies, and analytics for user adoption rates.
  o Conducted A/B testing for product feature performance using Databricks notebooks, leading to a 10% improvement in conversion rates (a minimal testing sketch follows this role).
  o Established a robust feedback loop with the marketing team, enabling iterative improvements based on customer insights visualized in Tableau.
Stakeholder Management and Consulting:
  o Facilitated weekly strategy meetings with leadership teams, presenting actionable insights through interactive Tableau dashboards and Google Data Studio reports.
  o Developed a self-serve analytics portal using Snowflake and BigQuery, empowering stakeholders to independently access KPIs and metrics critical for decision-making.
Scalable Data Infrastructure Development:
  o Engineered Python-based ETL pipelines for processing high-volume customer data, reducing manual intervention by 90%.
  o Implemented Databricks Delta Lake to manage large-scale time-series data for operational analytics.
  o Designed and monitored scalable pipelines in Apache Airflow and GCP Cloud Composer, ensuring timely delivery of high-quality data for business-critical use cases.
Advanced Customer Analytics:
  o Applied time series forecasting models in BigQuery ML to predict regional sales trends, enabling proactive inventory and workforce planning.
  o Leveraged Databricks' Spark MLlib for advanced analytics on customer churn, uncovering actionable insights that reduced churn rates by 15%.
Key Achievements:
  - Revitalized dealer engagement strategies, contributing to a 15% increase in regional sales and improved customer satisfaction metrics.
  - Reduced delivery times and operational costs through data-driven process optimizations.
  - Designed analytics solutions that increased productivity for sales and marketing teams by 25%, enabling them to focus on high-value opportunities.
Technical Stack: Snowflake, BigQuery, Databricks, Tableau, Apache Airflow, GCP (Cloud Storage, Cloud Composer), Python, SQL, Google Data Studio, Spark MLlib.
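Illustrative sketch: the A/B testing noted under Comprehensive Product Launch Support compares conversion rates between a control and a variant; a minimal significance check of that kind could use a chi-square test, as below (the counts are made-up placeholders, not actual experiment results).

    from scipy.stats import chi2_contingency

    # Placeholder 2x2 table (not real results): rows are control / variant,
    # columns are converted / not converted.
    observed = [
        [120, 880],   # control: 120 conversions out of 1,000 exposures
        [150, 850],   # variant: 150 conversions out of 1,000 exposures
    ]

    chi2, p_value, dof, expected = chi2_contingency(observed)
    print(f"chi2={chi2:.2f}, p-value={p_value:.4f}")
    # A small p-value (e.g. below 0.05) suggests the difference in conversion
    # rates is unlikely to be explained by chance alone.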
Business Intelligence Engineer | Sonwil Logistics (Buffalo, NY) | Jun 2018 - Dec 2020
Developed and implemented data-driven solutions to streamline logistics, enhance customer insights, and improve operational efficiency. Acted as a strategic consultant across business units, translating complex data into actionable business strategies and driving measurable impact.
Market Analysis and Opportunity Identification:
  o Conducted comprehensive market research using BigQuery and Databricks to analyze trends and uncover opportunities for expansion into untapped regions, increasing revenue by 15%.
  o Built interactive dashboards in Google Data Studio that visualized customer behavior trends, enabling the sales team to prioritize high-potential leads.
  o Partnered with business development teams to design predictive models for lead scoring, boosting conversion rates by 20% (a minimal lead-scoring sketch follows this role).
Supply Chain and Inventory Optimization:
  o Developed supply chain forecasting models using Snowflake and BigQuery ML, incorporating data from logistics, inventory, and production systems to reduce inventory costs by 20%.
  o Implemented geospatial analytics in Databricks to optimize delivery routes, reducing transportation costs by 25%.
  o Automated logistics reports with Apache Airflow on GCP, ensuring real-time data availability and reducing manual processing by 90%.
Advanced Data Engineering for Operational Efficiency:
  o Designed scalable data pipelines using Databricks Delta Lake and GCP Cloud Storage, ensuring seamless integration of disparate data sources.
  o Improved data pipeline performance by 40% through the implementation of advanced ETL workflows using Apache Airflow.
  o Built predictive maintenance models using Python and SQL to proactively identify potential failures, reducing downtime by 30%.
Cross-Functional Collaboration and Business Process Mapping:
  o Partnered with senior management to create detailed process maps and workflow diagrams in Tableau, identifying inefficiencies in logistics and operations.
  o Acted as a bridge between business units and technical teams, translating business needs into technical requirements for scalable data solutions.
  o Led workshops with cross-functional teams to align on key performance indicators and establish data governance frameworks.
Ad Hoc Analyses and Reporting:
  o Conducted in-depth customer segmentation analysis using Snowflake and BigQuery, identifying high-value customer segments that contributed to a 25% increase in lead generation.
  o Designed automated reporting templates in Google Data Studio and Tableau, ensuring consistency in data presentation for executive stakeholders.
Leadership and Mentorship:
  o Trained and mentored junior analysts in data engineering best practices, including cloud-based pipeline development and advanced analytics techniques.
  o Guided the adoption of modern tools like Databricks and GCP BigQuery, fostering a culture of innovation within the team.
Key Achievements:
  - Reduced operational costs by over $500K annually through optimized logistics and inventory management strategies.
  - Increased lead conversion rates by 20% with data-driven market segmentation and predictive analytics.
  - Streamlined reporting workflows, saving over 200 hours per quarter for the analytics team.
Technical Stack: Snowflake, BigQuery, Databricks, GCP (Cloud Composer, Cloud Storage), Apache Airflow, Python, SQL, Tableau, Google Data Studio.
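Illustrative sketch: the lead-scoring models mentioned under Market Analysis and Opportunity Identification are typically a supervised classifier trained on engagement features; the CSV path, feature names, and target column below are hypothetical placeholders, not Sonwil data.

    import pandas as pd
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import roc_auc_score
    from sklearn.model_selection import train_test_split

    # Hypothetical lead table exported from the warehouse; all columns are placeholders.
    leads = pd.read_csv("leads.csv")
    features = ["visits_last_30d", "quote_requests", "account_age_days"]

    X_train, X_test, y_train, y_test = train_test_split(
        leads[features], leads["converted"], test_size=0.2, random_state=42
    )

    model = LogisticRegression(max_iter=1000)
    model.fit(X_train, y_train)

    # Score held-out leads by predicted conversion probability so sales can
    # prioritize the highest-scoring prospects.
    scores = model.predict_proba(X_test)[:, 1]
    print("holdout ROC AUC:", roc_auc_score(y_test, scores))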
Data Analyst | Eazyfi (India) | Jul 2016 - Mar 2017
Analyzed large datasets to deliver actionable insights for improving customer acquisition, financial planning, and operational efficiency. Leveraged data-driven approaches and foundational analytics tools to support strategic decision-making across teams.
Customer Acquisition Optimization:
  o Developed predictive models using Python and R (e.g., logistic regression, decision trees) to identify high-potential customers, increasing acquisition rates by 15%.
  o Conducted detailed segmentation analysis using Excel and SQL, enabling targeted marketing campaigns.
  o Visualized customer acquisition trends using Tableau, providing key insights to marketing and leadership teams.
Vendor Management and Financial Analytics:
  o Created and maintained financial models in Excel, integrating data from SQL databases to support vendor negotiations.
  o Conducted ad hoc analysis on vendor performance, leading to a 10% cost reduction through renegotiated contracts.
  o Built automated reports using SQL queries and Excel macros, saving 10+ hours per week in manual work for the finance team.
Customer Behavior Analysis:
  o Applied machine learning models using Python (e.g., Random Forest, k-means clustering) to analyze customer behavior and predict churn (a minimal segmentation sketch appears at the end of this resume).
  o Provided actionable insights to marketing teams, contributing to a 20% improvement in campaign ROI.
  o Partnered with product teams to identify features influencing customer engagement, driving retention strategies.
Data Visualization and Reporting:
  o Designed executive dashboards in Tableau, enabling real-time visibility into marketing campaign performance and financial KPIs.
  o Prepared monthly performance reports for stakeholders, integrating data from SQL databases and manual sources into comprehensive visual summaries.
Collaboration and Process Improvement:
  o Worked with cross-functional teams to streamline reporting processes, improving data consistency and accuracy across the organization.
  o Introduced a scalable data documentation process for SQL queries, ensuring transparency and knowledge-sharing among team members.
Key Achievements:
  - Increased marketing efficiency by 20% through predictive modeling and targeted customer segmentation.
  - Reduced manual reporting effort by 50% via SQL-based automation and improved workflows.
  - Improved data accuracy and reliability for financial reporting, reducing errors by 15%.
Technical Stack: Python, R, SQL, Tableau, Excel, Decision Trees, Random Forest, Logistic Regression, k-Means Clustering.

Co-Founder | PhysicsWorm (India) | Aug 2014 - Oct 2016
Designed a digital learning platform using Python and SQL to provide physics education to underserved students, impacting 1,000+ learners.

Education
Master of Science, Management Information Systems
State University of New York, University at Buffalo (Aug 2017 - Sep 2018)
Bachelor of Technology, Production Engineering
Birsa Institute of Technology, Sindri (Aug 2008 - May 2012)
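Illustrative sketch: the Python customer segmentation referenced in the Eazyfi role follows the standard k-means pattern; the file name, feature columns, and cluster count below are placeholders chosen for illustration, not actual Eazyfi data.

    import pandas as pd
    from sklearn.cluster import KMeans
    from sklearn.preprocessing import StandardScaler

    # Hypothetical customer table; the file and column names are placeholders.
    customers = pd.read_csv("customers.csv")
    features = customers[["monthly_spend", "tenure_months", "support_tickets"]]

    # Standardize features so no single column dominates the distance metric.
    scaled = StandardScaler().fit_transform(features)

    # Four segments is an illustrative choice; in practice k would be tuned,
    # for example with an elbow plot or silhouette scores.
    kmeans = KMeans(n_clusters=4, n_init=10, random_state=42)
    customers["segment"] = kmeans.fit_predict(scaled)

    print(customers.groupby("segment")[["monthly_spend", "tenure_months"]].mean())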