Aakash - Data Scientist
[email protected]
Location: Piscataway, New Jersey, USA
Relocation:
Visa: OPT EAD
Aakash Ojha | Data Scientist
Contact: 732-769-8916 | Email: [email protected]
LinkedIn: https://www.linkedin.com/in/aakash-ojha-157582310

Summary
Experienced Data Scientist with 7 years in the field, specializing in Artificial Intelligence/Machine Learning, Statistical Modeling, Deep Learning, Data Analytics, and Natural Language Processing (NLP).
Expertise in acquiring accurate datasets, data scrubbing, and utilizing statistical techniques for data engineering.
Proficient in Exploratory Data Analysis with a curious mindset and skilled in building diverse Machine Learning Algorithms for predictive modeling.
Capable of designing clear, effective visualizations that support business profitability, and well-versed in managing the entire data science project life cycle.
Deep understanding of statistical modeling, multivariate analysis, model testing, and problem analysis.
Proficient in transforming business requirements into analytical models and algorithms.
Experienced in building various Machine Learning models, including Logistic Regression, Linear Regression, Neural Networks, Random Forest, and more.
Familiarity with Python packages like pandas, NumPy, Seaborn, SciPy, Matplotlib, and scikit-learn.
Skilled in Regression and Classification models using scikit-learn on Python and adept at identifying trends and extracting insights from high-dimensional datasets.
Experience in Deep Learning using libraries such as TensorFlow and Keras.
Expertise in Machine Learning models like Linear, Logistic, Decision Trees, Random Forest, SVM, K-Nearest Neighbors, clustering, and Bayesian.
Proficient in data manipulation, wrangling, model building, and visualization with large datasets.
Hands-on experience in the complete Data Science project life cycle.
Knowledgeable in Deep Learning and Artificial Neural Networks, including Convolutional Neural Networks, Recursive Neural Networks, and Recurrent Neural Networks.
Skilled in extracting data using PostgreSQL, Microsoft SQL Server, and MySQL.

Education
Master's Degree: Data Science and Strategic Analytics
Bachelor's Degree: Business and Personal/Financial Services Marketing Operations, Mercer County Community College
Associate degree: Business Administration and Management, General

Technical Skills
Machine Learning: Ridge, Lasso, Linear and Logistic Regression, Decision Trees, Random Forest, Ensemble algorithms, SVM, KNN, XGBoost, Unsupervised machine learning algorithms like k-Means Clustering, Hierarchical Clustering, PCA, Model Performance metrics (F-Score, AUC/ROC, Confusion Matrix, MAE, RMSE)
Programming Languages: Python, SQL (MySQL, PostgreSQL), R, MongoDB, Unix, PySpark (MLlib)
Python Frameworks: Flask, SageMaker, TensorFlow 1.x and 2.0, PyTorch, spaCy, pytest, Big Data Analytics with PySpark, Jupyter Notebook, Anaconda
Python Libraries: NumPy, Pandas, Scikit-learn, Keras, Matplotlib, Seaborn
Deep Learning: ANN, CNN, RNN, LSTM, BiLSTMs, Feature Engineering, Feature Selection
NLP: AWS Comprehend, Text Classification, Sentiment Analysis, Topic Modelling, Document Classification, Custom named entity models; Attention-based models, Encoders, Decoders, Transformer models, NLP tasks using CBOW, TF-IDF, Word2Vec, lemmatization, Tokenization, stop words, n-grams, NLTK, spaCy, Gensim, Word Embeddings
Cloud Platforms: Amazon AWS, Google Cloud Platform (GCP), Azure
Data Science Portfolio: Productionizing Models, Recommendation Engines, Customer Segmentation, Time Series Forecasting, Collaborative Filtering Techniques
Reporting & Visualization Tools: Tableau, Power BI, Informatica, Microsoft Excel
Big Data Tools: Sqoop (HDFS and HBase tables), Hadoop, Hive, ETL, Redshift, Apache Spark

Professional Experience
Stockton University
Data Scientist (August 2022 - Present)
Used Scikit-Learn and statsmodels to develop regression models, including Linear Regression, Decision Tree, and Random Forest, to support decision making.
Designed and implemented data pipelines to build a company-wide data lake using Python and multiple AWS services.
Worked with Data Engineers and Data Analysts in a cross-functional team to deploy models and deliver projects.
Led a team in building a natural language understanding system for analyzing customer feedback using a Large Language Model (LLM), enabling us to extract valuable insights and make data-driven decisions to enhance product development strategies.
Performed data collection, data cleaning, feature scaling, feature engineering, validation, visualization, data resampling, and deep learning, reported findings, and developed strategic uses of data using Python libraries such as NumPy, Pandas, SciPy, Scikit-Learn, and TensorFlow.
Performed data visualization in RStudio, using ggplot2, lattice, and highcharter to build plots.
Developed and implemented a Vortex AI project, simulating complex natural phenomena for improved predictive modeling.
Spearheaded the creation of a personalized content recommendation system using Generative AI, enhancing user engagement and driving significant improvements in customer satisfaction metrics.
Implemented various statistical techniques to prepare the data, such as missing data imputation and Principal Component Analysis (PCA) for dimensionality reduction.
Utilized advanced algorithms to replicate the behavior of vortices, contributing to cutting-edge research in fluid dynamics.
Applied Vortex AI techniques to enhance understanding and prediction capabilities in weather and aerodynamics.
Developed a chatbot using a Large Language Model (LLM) capable of understanding and responding to complex user queries, improving customer support processes and reducing response times significantly.
Collaborated with interdisciplinary teams to refine models and optimize simulations for accuracy and efficiency.
Contributed to the advancement of AI-driven solutions in weather prediction and fluid dynamics through Vortex AI expertise.
Used Spark, Scala, Hadoop, Spark Streaming, MLlib, and R with a broad variety of machine learning methods, including classification, regression, and dimensionality reduction.
Built various machine learning models including linear regression, Random Forest, SVM and decision trees using Python libraries.
Performed data cleaning and feature selection using the MLlib package in PySpark.
Created various types of data visualizations using Tableau, Power BI, and other libraries like Matplotlib, Seaborn, ggplot2.
Led a project leveraging Generative AI to automate the generation of product descriptions, resulting in increased efficiency and consistency across marketing materials.
Fit Linear Regression models on customer segments derived from K-means clustering and K-NN classification. Employed statistical methodologies such as A/B testing, experiment design, and hypothesis testing.
Environment: Python (Scikit-Learn/ SciPy/ NumPy/ Pandas/ Matplotlib/ Seaborn), Machine Learning (Random Forest, Gradient Boosting trees, Neural Networks using Keras), ML, DL, PySpark, SQL Server, RStudio, Tableau, Big Data - Spark, Scala, NLP, Word2Vec, FastText, Artificial Intelligence.

Tractor Supply Co
Data Scientist/ML Engineer (June 2021 - July 2022)
Built Time series Models and stored Trend, Seasonality, and residual values in separate tables.
Created Model APIs for SARIMA, ARIMA, Prophet models, enhancing tracking and usability.
Designed and implemented a language model using ChatterBot and transformers for an intelligent chatbot.
Engineered a sentiment analysis model leveraging web scraping techniques.
Built regression models (Lasso, Ridge, SVR, XGBoost) to predict Customer Lifetime Value.
Used a cloud-based PySpark environment, Git, and SQL ingestion for data processing.
Collaborated with data engineers and operations team to implement ETL processes and optimize SQL queries.
Performed data analysis using Hive and SQL, data imputation using Scikit-learn, and customer-specific feature exploration using Spark SQL.
Developed models and algorithms using Python (NumPy, SciPy, pandas, scikit-learn, seaborn) and Spark 2.0 (PySpark, MLlib).
Utilized NLP techniques to optimize customer satisfaction.
Designed data visualizations with Tableau and Matplotlib.
Performed data collection, cleaning, feature engineering, validation, visualization, deep learning, and strategic data usage with Python libraries.
Developed predictive models to analyze customer behavior and improve marketing strategies.
Implemented advanced machine learning algorithms to enhance forecasting accuracy.
Integrated machine learning models into production systems for real-time decision-making.
Environment: ETL, Hadoop, Hive, SQL, Redshift, LLM, Spark SQL, Multivariate analysis, Scikit-learn, seaborn, pandas, SciPy, NumPy, PySpark, MLlib, Amazon SageMaker, Machine Learning algorithms (linear regression, multivariate regression, classification, clustering), Tableau, Matplotlib.

Novembal USA Inc.
Finance Intern and Data Analyst (Jan 2021 - May 2021)
Accurately entered financial data into accounting systems and maintained detailed financial records.
Managed accounts payable by processing vendor invoices, reviewing purchase orders, and preparing payments.
Handled accounts receivable tasks, including creating and sending invoices to clients and following up on late payments.
Conducted regular bank reconciliations to ensure that reported transactions aligned with bank statements.
Reviewed employee expenditure reports, verified supporting documentation, and processed reimbursement requests.
Assisted in SQL-based data tasks such as data retrieval, manipulation, and querying to support financial analysis and reporting processes.
Created financial models to forecast revenue and expenses.
Performed variance analysis to identify discrepancies between actual and budgeted financial performance.
Assisted in the preparation of financial statements and reports for management review.
Developed automated reports using SQL to streamline financial reporting processes.
Analyzed financial data to provide insights into business performance and support decision-making.
Conducted market research to inform business strategy and identify growth opportunities.
Collaborated with cross-functional teams to implement financial best practices.
Environment: SQL, Excel, Financial Modeling, Variance Analysis, Financial Reporting, Market Research, Cross-functional Collaboration.

BNSF Railway
Data Scientist (June 2018 - December 2020)
Analyzed datasets to uncover trends and insights.
Created and deployed machine learning models and algorithms.
Utilized machine learning models like Linear Regression, Decision Trees, and Random Forest to predict Customer Lifetime Value.
Prepared data for effective analysis and collaborated with stakeholders to derive actionable insights.
Developed time series models such as ARIMA & LSTM to analyze campaign performance.
Created and maintained dashboards for marketing and sales performance tracking.
Optimized data pipelines for efficient data processing and implemented ML/DL models on the AWS cloud environment.
Implemented machine learning algorithms for predictive maintenance of railway equipment.
Conducted root cause analysis of equipment failures to improve reliability.
Developed data-driven solutions to enhance operational efficiency and reduce costs.
Collaborated with engineering teams to integrate data science solutions into maintenance processes.
Created predictive models to optimize inventory management and reduce downtime.
Utilized data visualization tools to present insights and recommendations to stakeholders.
Conducted A/B testing to evaluate the impact of process changes on operational performance.
Environment: Python (Scikit-Learn/ SciPy/ NumPy/ Pandas/ Matplotlib/ Seaborn), R, Tableau, AWS (Lambda, S3, EC2), PyTorch, NLP, Predictive Maintenance, Root Cause Analysis, Inventory Management, Data Visualization.

Genesys Cloud Service, INC.
Data Analyst/Scientist (June 2017 - May 2018)
Analyzed datasets to uncover trends and insights.
Created and deployed machine learning models and algorithms.
Utilized machine learning models like Linear Regression, Decision Trees, and Random Forest to predict Customer Lifetime Value.
Prepared data for effective analysis and collaborated with stakeholders to derive actionable insights.
Developed predictive models to analyze customer behavior and improve marketing strategies.
Implemented advanced machine learning algorithms to enhance forecasting accuracy.
Integrated machine learning models into production systems for real-time decision-making.
Conducted market basket analysis to identify product affinities and optimize cross-selling strategies.
Developed customer segmentation models to personalize marketing efforts.
Analyzed customer feedback to identify areas for product improvement and innovation.
Utilized natural language processing techniques to extract insights from customer reviews.
Created interactive dashboards to monitor key performance indicators and track business performance.
Collaborated with marketing and sales teams to develop data-driven strategies.
Environment: Python (Scikit-Learn/ SciPy/ NumPy/ Pandas/ Matplotlib/ Seaborn), R, Tableau, AWS (Lambda, S3, EC2), PyTorch, NLP, Customer Segmentation, Market Basket Analysis, Customer Feedback Analysis, Data Visualization.