Ranjana Rammohan - Data Engineer
[email protected]
Location: Remote, USA
PROFILE SUMMARY
- Engineering professional with over 8 years of experience in software development.
- Versatile Python developer experienced with cloud computing platforms such as Google Cloud Platform (GCP) and Amazon Web Services (AWS).
- Proficient in designing and implementing ML solutions using Python libraries such as scikit-learn, OpenCV, and PIL.
- Skilled in developing scalable web applications with PHP, JavaScript, and SQL databases; adept at deploying ML models and analyzing large datasets.
- Strong background in integrating diverse data sources, performing data validation, and implementing data quality checks to ensure accurate insights.
- Familiar with ML algorithms, model training, and evaluation using libraries like scikit-learn and TensorFlow; able to develop predictive and classification models.
- Skilled in using Python libraries like pandas, NumPy, and Matplotlib for exploratory data analysis (EDA) and insightful data visualization.
- Developed a Python-based data pipeline to automate extract, transform, and load (ETL) processes from various sources.
- Sound knowledge of data analysis, data validation, data cleansing, data verification, and identifying data mismatches.
- Experience with Tableau for analysis and for creating dashboards and user stories.
- Strong experience using MS Excel and MS Access to load and analyze data based on business needs.
- Developed and optimized complex SQL queries for data retrieval, aggregation, and reporting, improving query performance by [X]%.
- Experienced in the design, development, testing, automation, and enhancement of applications; familiar with development best practices such as code reviews and unit testing.
- Led, guided, and supported team members through challenges faced during project development and problem solving.
- Implemented JIRA workflows and issue types to align with Agile methodologies, enhancing team collaboration and project tracking.
- Sound knowledge of and experience in the Systems Development Life Cycle (SDLC) and Agile (Scrum) software development.
- Used version control systems like Git to manage code repositories and streamline collaboration.
- Conducted code reviews and mentored junior developers to foster a strong development team.
- Able to learn new technologies and to work in a team environment as well as independently to get things done on time.

EDUCATION
B.Tech, Chennai, India | 2011-2015

CERTIFICATIONS
- PGP Machine Learning Certification Program, Great Lakes Institute of Management, India (2019-2020)
- AWS Certified Data Analytics - Specialty (2023-2026)

TECHNICAL SKILLS
Programming: Python, R, PHP, C#
Python Libraries: Pandas, NumPy, SciPy, Scikit-learn, PIL/Pillow, NLTK, Matplotlib, Boto3, BeautifulSoup, OpenCV
Web Development: Flask, HTML, CSS, JavaScript
BI/Analytics Tools: MS Excel, Tableau
Databases: MySQL, Microsoft SQL Server, PostgreSQL, DynamoDB
Data Warehousing: Snowflake
Cloud Services: Google Cloud Platform, Amazon Web Services
Machine Learning: Regression, Classification, NLP, Feature Engineering, Text Analysis
Visualization: Tableau
Version Control: Git, GitHub

PROFESSIONAL EXPERIENCE

Angi, Tampa, FL (Remote) | Jan 2022 - Till date
Data Engineer
Responsibilities:
Customer Experience Improvement for Retention
- Developed data pipelines to collect customer data (name, date of service, service name, reviews, and ratings) from various sources, including websites, mobile apps, and web scraping (see the sketch below).
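
Illustrative sketch of the scraping-based collection step above, assuming a requests + BeautifulSoup approach; the URL argument and CSS selectors are hypothetical placeholders, not actual site markup:

    import requests
    from bs4 import BeautifulSoup

    def scrape_reviews(url):
        """Collect customer name, service, rating, and review text from one page."""
        html = requests.get(url, timeout=30).text
        soup = BeautifulSoup(html, "html.parser")
        records = []
        for card in soup.select("div.review-card"):  # hypothetical review container
            records.append({
                "customer_name": card.select_one(".name").get_text(strip=True),
                "service_name": card.select_one(".service").get_text(strip=True),
                "rating": float(card.select_one(".rating").get_text(strip=True)),
                "review": card.select_one(".review-text").get_text(strip=True),
            })
        return records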
- Developed and enhanced the application SDK, providing APIs to access data and analyze it in a Jupyter Notebook environment for EDA, data visualization, and machine learning.
- Integrated data from various sources, including customer feedback, sales, and project completion rates.
- Used IAM to create roles, users, and groups, and implemented MFA with Google Authenticator to add security to the AWS account and its resources.
- Created S3 buckets in the AWS environment to store files, some of which serve static content for a web application.
- Configured S3 buckets with lifecycle policies to archive infrequently accessed data to appropriate storage classes.
- Set up an AWS Lambda function to trigger a Python script that publishes data to Amazon S3 and loads real-time data into DynamoDB.
- Created custom Python scripts using the AWS SDK (Boto3) to transfer data from S3 to DynamoDB (see the first sketch below).
- Integrated AWS Lambda functions to handle backend processes, improving user experience and service efficiency.
- Created visualizations using Matplotlib and Seaborn to illustrate customer satisfaction trends and product usage patterns.
- Developed and deployed AWS Lambda functions for real-time data processing, reducing data processing time.
- Wrote Python scripts to handle metadata throughout the pipeline and create a Glue Catalog using AWS Lambda.
- Used Glue for the data catalog and made it available for query through Redshift Spectrum and Athena.
- Created an AWS data pipeline to integrate data from different sources into the Snowflake data warehousing platform for analytics.
- Created and launched EC2 instances using Linux and Windows AMIs.
- Leveraged data analysis techniques to identify trends in customer preferences, guiding product recommendations and marketing strategies.
- Used the enterprise Snowflake data warehouse to populate data for the reporting team.
- Performed benchmark tests reading data from databases and object stores using pandas to compare results, identify potential improvement areas, and provide recommendations.
- Created a view in Snowflake to calculate customers retained in each cohort for subsequent months, and redesigned Snowflake views to improve performance.
- Designed and implemented secure data pipelines into a Snowflake data warehouse from on-premises and cloud data sources.
- Created custom maps with Python libraries such as Matplotlib by fetching and processing data from Snowflake.
- Built an analytics dashboard using Plotly, Matplotlib, and Seaborn to visualize service request data and technician performance metrics, with interactive visualizations that let users explore and analyze data trends.
- Generated JSON files from the JSON models created for sales and claims data using Snowflake.
- Developed line charts, bar charts, cohort analyses, etc. on Tableau Server by connecting to Snowflake.

Churn Prediction (Machine Learning Model)
- Developed predictive regression models using historical data and customer behavior (ratings, service requests, customer tenure) to identify customers at risk of churning.
- Created Python scripts for data cleaning and transformation, improving data quality.
- Performed exploratory data analysis (EDA) to handle missing values, outliers, normalization, etc., and prepared the dataset for modeling and evaluation (a model-training sketch follows below).
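
A minimal sketch of the Boto3 S3-to-DynamoDB transfer described in the bullets above, written as a Lambda handler; the table name and record layout are assumptions for illustration:

    import json
    from decimal import Decimal

    import boto3

    s3 = boto3.client("s3")
    table = boto3.resource("dynamodb").Table("customer_events")  # hypothetical table

    def handler(event, context):
        # Triggered by an S3 put event; load each JSON record into DynamoDB.
        for record in event["Records"]:
            bucket = record["s3"]["bucket"]["name"]
            key = record["s3"]["object"]["key"]
            body = s3.get_object(Bucket=bucket, Key=key)["Body"].read()
            # parse_float=Decimal because DynamoDB rejects Python floats
            for item in json.loads(body, parse_float=Decimal):
                table.put_item(Item=item)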
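For the churn-prediction bullets above (which continue below), one plausible shape of the training step, sketched here with logistic regression; the CSV path and column names are assumptions:

    import pandas as pd
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import classification_report
    from sklearn.model_selection import train_test_split

    df = pd.read_csv("customer_history.csv")            # hypothetical extract
    df = df.dropna(subset=["rating", "tenure_months"])  # basic EDA-stage cleanup
    X = df[["rating", "service_requests", "tenure_months"]]
    y = df["churned"]                                   # 1 = churned, 0 = retained

    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=42)
    model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
    print(classification_report(y_test, model.predict(X_test)))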
- Performed sentiment analysis using NLP techniques on customer reviews and feedback, providing insights into product satisfaction and areas for improvement.
- Monitored data alerts, reporting alerts, and regression test failure alerts, and acted on, escalated, and resolved issues.
- Analyzed feature importance to understand which factors contribute most to churn, and provided data insights to business stakeholders.
- Worked closely with application customers to resolve JIRA tickets related to API issues, data issues, consumption latencies, onboarding, and publishing data.
- Wrote Redshift queries to create data products and performed query tuning.
- Performed peer reviews, maintained code coverage, and automated the application using CI/CD.
- Set up an AWS Lambda function that runs every 15 minutes to check for repository changes and publish a notification to an Amazon SNS topic.
- Integrated AWS Lambda and API Gateway to handle booking requests and notifications.
- Used AWS DynamoDB for efficient storage and retrieval of customer preferences and past bookings.
- Integrated various data sources, including service records and technician availability, for comprehensive insights.
- Read and wrote Parquet and JSON files from S3 buckets using pandas DataFrames with various configurations.
- Developed code to assign a default lifecycle policy to buckets and objects and auto-purge objects based on the default policy in Mercury, an internal implementation of AWS S3.

Tools and Technologies Used: Python, PHP, JavaScript, HTML, CSS, Microsoft SQL Server, PostgreSQL, AWS services (EC2, S3, EBS, RDS, SNS, SQS, VPC, CloudFormation, CloudWatch, DynamoDB), Snowflake, Git, JIRA, Tableau, ML algorithms, Jupyter Notebook, Microsoft Visual Studio

Tata Consultancy Services - CoreLogic Solutions Inc., India | December 2016 - August 2021
Senior Software Engineer
Responsibilities:
Document Extraction by OCR on the GCP Platform
- Migrated existing on-premises databases to the Google Cloud environment for a better reporting experience.
- Developed server infrastructure on GCP using services such as VM instances, BigQuery, Google Cloud Storage, Source Repositories, Security, log monitoring, alert monitoring, and IAM role management.
- Extensive knowledge of Google Cloud Storage (GCS) buckets and of creating efficient buckets based on business requirements.
- Used the Google Cloud Vision API for optical character recognition (OCR) to convert raw images to text files, applying pre-trained machine learning models for text extraction, entity recognition, and document structure analysis.
- Developed unsupervised algorithms (K-Means clustering) to group similar documents and performed classification on the groups, improving accuracy and reducing manual effort by 20% (a sketch follows below).
- Analyzed large volumes of data and transformed them into structured patterns using strong analytical skills involving Python and machine learning concepts.
- Wrote Python/PowerShell scripts to automate and simplify internal IT processes.
- Involved in data analysis, database creation, and applying database constraints to make data ready for loading.
- Extracted and processed data from various sources and visualized insights using libraries like Pandas, NumPy, OpenCV, PIL, and Matplotlib.
- Created a PHP-based web application to classify document types using machine learning (classification algorithms), deployed on Google Cloud Platform.
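
A hedged sketch of the K-Means document grouping mentioned above, pairing TF-IDF features with scikit-learn's KMeans; the OCR output directory and cluster count are illustrative assumptions:

    from pathlib import Path

    from sklearn.cluster import KMeans
    from sklearn.feature_extraction.text import TfidfVectorizer

    # Each .txt file is assumed to be OCR output for one document.
    docs = [p.read_text(errors="ignore") for p in Path("ocr_output").glob("*.txt")]
    tfidf = TfidfVectorizer(stop_words="english", max_features=5000)
    X = tfidf.fit_transform(docs)                 # sparse TF-IDF document matrix
    kmeans = KMeans(n_clusters=8, random_state=42, n_init=10).fit(X)
    for doc, label in zip(docs[:5], kmeans.labels_[:5]):
        print(label, doc[:60].replace("\n", " "))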
- Integrated the application with Google Cloud Storage to handle large volumes of customer data securely.
- Developed front-end web applications in PHP, HTML, CSS, and JavaScript to implement requirements of a financial system, handling end-to-end software development.
- Used Python's Pandas and NumPy libraries to perform data processing and analysis tasks on large datasets.
- Extracted relevant features and derived meaningful insights from unstructured textual data using Python text processing techniques such as regex patterns (see the first sketch after this list).
- Implemented a rubber-banding concept for images (.TIFF, .JPG, .PNG) in Python using the OpenCV library to extract accurate data directly, reducing manual effort (see the second sketch after this list).
- Developed custom data analysis scripts in Python to streamline repetitive tasks and improve analysis efficiency.
- Presented EDA findings to the team and stakeholders, facilitating data-driven decision-making in the banking domain.
- Used Python's re module to develop and implement regular expressions for data parsing and text analysis tasks.
- Collaborated with domain experts to define complex regex patterns for extracting specific information from unstructured text data.
- Designed and implemented custom validation rules using regular expressions to ensure the integrity of user input data in web applications.
- Used Python data visualization libraries, including Matplotlib and Seaborn, to create informative charts and graphs for reporting and presentations.
- Developed interactive and dynamic visualizations using Plotly, enabling users to explore data trends and patterns in real time.
- Contributed to the design and development of PHP-based web applications, focusing on performance optimization and code maintainability.
- Used SQL and PostgreSQL databases to manage data and implemented complex queries to support application functionality.
- Used GCP Cloud Logging to create logs across the end-to-end application for analysis.
- Used Kibana dashboards and visualizations to monitor system performance, troubleshoot issues, and identify anomalies.
- Collaborated with a team of developers to deliver projects on time and within scope.
- Conducted code reviews and provided constructive feedback to ensure code quality and adherence to best practices.
- Automated an ETL tool into a Python engine, reducing manual effort by more than 60%.
- Performed regression and unit testing for every modification made to the application and for new builds.
- Provided timely and efficient production support, responding to end users' inquiries and resolving issues reported from test cases and the production environment.
- Performed user acceptance testing (UAT) and system testing to validate new features and enhancements before release to production.
- Managed version control and code collaboration using Git, enabling seamless development and code deployment.
- Contributed to the creation and refinement of user stories in JIRA, providing detailed information to guide the development process.
- Tracked and managed software defects using JIRA, ensuring timely resolution and effective communication with the development team.
- Participated in regular meetings and Agile Scrum ceremonies to coordinate testing efforts and provide status updates.
- Developed and automated data reports and methodologies; generated and maintained analytical metrics and dashboards.
- Maintained model documents, templates, document specifications, and other reusable knowledge assets.
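
Illustrative sketch of the regex-based extraction described in the bullets above; the patterns and sample text are invented stand-ins, not the production rules:

    import re

    # Hypothetical field patterns for a mortgage-style document.
    LOAN_AMOUNT = re.compile(r"(?:loan amount|principal)\D{0,20}([\d,]+\.?\d*)", re.I)
    DATE = re.compile(r"\b(\d{1,2}/\d{1,2}/\d{2,4})\b")

    def extract_fields(text):
        amount = LOAN_AMOUNT.search(text)
        date = DATE.search(text)
        return {
            "amount": amount.group(1).replace(",", "") if amount else None,
            "date": date.group(1) if date else None,
        }

    print(extract_fields("Loan Amount: $250,000.00 dated 03/15/2019"))
    # -> {'amount': '250000.00', 'date': '03/15/2019'}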
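One reading of the rubber-banding bullet above, sketched as cropping a fixed region of interest from a scanned page before OCR; the file name and coordinates are placeholders (in practice the rectangle could come from an interactive selection such as cv2.selectROI):

    import cv2

    image = cv2.imread("scanned_page.tiff")       # .TIFF/.JPG/.PNG input assumed present
    x, y, w, h = 120, 340, 600, 80                # rubber-band rectangle (placeholder)
    roi = image[y:y + h, x:x + w]                 # NumPy slicing performs the crop
    gray = cv2.cvtColor(roi, cv2.COLOR_BGR2GRAY)  # normalize the crop for OCR
    cv2.imwrite("field_crop.png", gray)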
Tools and Technologies Used: Python, Machine Learning (regression and classification algorithms), PHP, JavaScript, HTML, CSS, Microsoft SQL Server, PostgreSQL, Kibana dashboards, Google Cloud Platform services, Git, JIRA, Jupyter Notebook, Microsoft Visual Studio

Tata Consultancy Services - Citi Banamex (Mexico), India | May 2015 - December 2016
Software Engineer
Responsibilities:
- Provided production support for .NET and Python applications used in the investment banking and asset and wealth management areas of the BFS domain.
- Provided 2nd- and 3rd-level production support for Python and .NET applications, resolving technical issues and ensuring minimal downtime.
- Assisted end users with technical inquiries and issues related to Python and .NET applications, providing timely and effective resolutions.
- Collaborated with cross-functional teams, including developers and business analysts, to investigate and troubleshoot application incidents and outages.
- Provided end-user communication for critical issues such as service delays/disruptions, with ETAs.
- Performed daily system and application health checks to ensure smooth running of the applications.
- Handled user queries related to reports, daily job schedules, and database management, providing solutions within the specified SLAs.
- Investigated and troubleshot complex issues related to incidents, problems, and changes, collaborating with cross-functional teams to identify root causes and implement effective solutions.
- Created and maintained documentation on known issues, workarounds, and best practices for the ServiceNow knowledge base.
- Participated in the planning and execution of ServiceNow platform upgrades and patching, ensuring minimal impact on production services.
- Conducted regular health checks on the CMDB and Asset Management modules to maintain accurate and up-to-date configuration data.
- Designed and implemented automated workflows to streamline service request fulfillment and incident resolution processes.
- Coordinated with third-party vendors for support escalations and coordinated activities during major incidents.

Tools and Technologies Used: Python, ASP.NET, ServiceNow, Microsoft Server Studio, HTML, C#, Linux, Microsoft Visual Studio