Vysnava Nikhil - Senior Data Engineer
[email protected]
Location: Houston, Texas, USA
Relocation: Yes
Visa: H-1B
Nikhil Raghupatruni | +1 636 312 8820 | [email protected]

Professional Summary

- Experienced consultant with a demonstrated history of around 10 years in the information technology and services industry.
- Experience in Python programming and Tkinter.
- Experience with AWS Glue, AWS Lambda, DynamoDB, SNS, API Gateway, and AWS S3.
- Experience as a Big Data Engineer designing data-intensive applications using the Hadoop ecosystem and cloud data engineering.
- Experience with PySpark; improved the performance and optimization of existing algorithms using Spark.
- Proficient in Azure Data Factory: orchestrated data pipelines integrating data from diverse sources into Azure data platforms, ensuring data accuracy and enabling advanced analytics workflows.
- Proficient in Azure Cognitive Services: integrated AI capabilities such as natural language processing, computer vision, and sentiment analysis into data science workflows to strengthen data-driven decision-making.
- Proficient in PySpark's Spark Streaming: designed real-time data processing pipelines that turn streaming sources into actionable insights for time-sensitive applications.
- Utilized a diverse range of Python libraries, including PySpark, Pytest, Pymongo, cx_Oracle, PyExcel, Boto3, Psycopg, embedPy, NumPy, Beautiful Soup, Pandas, Matplotlib, PyTorch, BERT, and NLP tooling, to tackle complex data analysis and processing tasks.
- Proficient in PySpark's DataFrame and SQL APIs: designed and implemented complex data transformations and manipulations for seamless data integration and preparation for downstream analytics (a brief illustrative sketch follows this summary).
- Leveraged PySpark's MLlib and GraphX libraries to develop and deploy machine learning models and graph algorithms, contributing to data-driven insights and business intelligence.
- Experience writing and deploying fraud rules for the FALCON application.
- Experience creating cloud resources with Terraform.
- Experience working with Python, AWS serverless services, and Terraform.
- Experience with Python data science libraries such as NumPy, Pandas, and Matplotlib.
- Experience in fast-paced Agile/Scrum environments: worked with project stakeholders to groom backlog tasks and plan one-week sprint cycles.
- Collaborated with product, business, marketing, and engineering teams to develop and maintain product backlogs.
- Demonstrated expertise in PySpark's integration with cloud platforms such as Microsoft Azure and Amazon Web Services (AWS), enabling scalable and cost-effective big data processing solutions.
- Created and managed portfolio plans in JIRA to enhance transparency and visibility for business and technology.
- Experience with business intelligence tools such as Tableau.
- Experience in data analytics using Excel.
- Experience with Azure AD for integrating client API requests.
- Experience gathering requirements for data warehouses and similar data solutions.
- Collaborated with business stakeholders to develop and document policies and procedures for Data Governance.
- Proven experience as a BI developer, with a focus on Tableau and Power BI.
- Experience designing Informatica processes to extract data.
- Demonstrated expertise in configuring and customizing SonarQube rulesets to align with industry standards and project-specific requirements, improving code maintainability and security.
- Led code quality improvement initiatives by conducting regular SonarQube scans, identifying code smells, and collaborating with development teams to implement best practices and coding standards.
- Contributed to PySpark-based data engineering projects with cross-functional teams, mentoring junior developers and ensuring best practices for efficient, maintainable code.
- Experience working in the insurance domain.
- Experience with open-source technologies including HTML, CSS, JavaScript, and React JS.
- Strong interpersonal and communication skills; able to work in a team as well as independently with minimal supervision.
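To illustrate the DataFrame and SQL API work referenced in the summary above, here is a minimal PySpark sketch. The application, table, and column names are invented for illustration and are not taken from any of the projects below.

```python
# Minimal sketch, assuming hypothetical policy data; not actual project code.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("transform-sketch").getOrCreate()

# Hypothetical raw policy records landed from an upstream source.
raw = spark.createDataFrame(
    [("P-1001", "AUTO", "2021-07-23", 1250.0),
     ("P-1002", "FIRE", "2021-08-02", 890.5)],
    ["policy_id", "line_of_business", "effective_date", "premium"],
)

# DataFrame API: cast a column, derive a new field, and filter rows.
curated = (
    raw.withColumn("effective_date", F.to_date("effective_date"))
       .withColumn("premium_band",
                   F.when(F.col("premium") >= 1000, "HIGH").otherwise("STANDARD"))
       .filter(F.col("line_of_business").isin("AUTO", "FIRE"))
)

# SQL API: the same data exposed through a temp view for downstream analytics.
curated.createOrReplaceTempView("curated_policies")
spark.sql("""
    SELECT line_of_business, COUNT(*) AS policy_count, SUM(premium) AS total_premium
    FROM curated_policies
    GROUP BY line_of_business
""").show()
```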
Skills Profile

Technical:
Operating System: Windows, Linux, Mac OS, Ubuntu
Environment: Client/Server, Unix/Linux
Database: SQL Server, ADO.NET, MySQL
Internet Tools: HTML5, CSS3, Bootstrap, JSON, JavaScript, ReactJS
Languages: C#, Python
Web Frameworks: Flask
Version Control: Git, TFS, GitLab
Python Libraries: PySpark, Pytest, Pymongo, cx_Oracle, PyExcel, Boto3, Psycopg, embedPy, NumPy, Beautiful Soup, Pandas, Matplotlib, PyTorch, BERT, NLP
IaC / DevOps / Visualization Tools: Terraform, CI/CD pipelines, Tableau
Data Governance Tools: Collibra
Other Technologies/Skills: PySpark, AWS Lambda, AWS API Gateway, AWS Step Functions, AWS Athena, AWS Glue, SNS, FALCON rule writing, Redshift, EC2, Route 53 (an illustrative Lambda sketch follows this profile)

Management and Leadership Skills: Ability to produce realistic software development schedules; motivated problem solver with strong analytical skills.
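As a companion to the serverless skills listed above, here is a minimal sketch of an API Gateway-backed Lambda view in Python. It assumes a hypothetical DynamoDB table named "policies" and a path parameter named "policy_id"; none of these names come from the projects below.

```python
# Hedged sketch of a Lambda "view": reads one item from a hypothetical
# DynamoDB table behind an API Gateway GET route like /policies/{policy_id}.
import json
import os

import boto3

dynamodb = boto3.resource("dynamodb")
TABLE_NAME = os.environ.get("TABLE_NAME", "policies")  # hypothetical table name


def lambda_handler(event, context):
    """Entry point invoked by API Gateway with a proxy-integration event."""
    policy_id = (event.get("pathParameters") or {}).get("policy_id")
    if not policy_id:
        return {"statusCode": 400, "body": json.dumps({"error": "policy_id required"})}

    table = dynamodb.Table(TABLE_NAME)
    resp = table.get_item(Key={"policy_id": policy_id})
    item = resp.get("Item")
    if item is None:
        return {"statusCode": 404, "body": json.dumps({"error": "not found"})}
    # default=str handles DynamoDB Decimal values during serialization.
    return {"statusCode": 200, "body": json.dumps(item, default=str)}
```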
Professional Experience (In Reverse Chronological Sequence)

Organization Name: Capgemini America Inc
Client Name: State Farm Insurance Companies
Project Name: DCS
Project Duration: July 23, 2021 - Till Date
Role/Title: Senior Data Engineer
Work Location: Remote

Project Description: E2E Integrated Workflow Automation is designed to perform end-to-end validation, including count and content, for a given policy or agreement ID, and also serves as a debugging tool for production issues. Personal Lines Auto and Fire integration test and BAT runs cover a large volume of scenarios for which this validation must occur downstream, spanning multiple processes and validation rules as data is transformed, ingested, optimized, and provisioned for consumption. (A brief sketch of this count-and-content validation idea follows this role.)

Responsibilities:
Project 1: DCS Data Consumption (July 23 - Till Date)
Project 2: DCS EIWA (Oct 11 - Till Date)
Project 3: DCS Dashboard (Nov 15, 2022 - Till Date)
Project 4: DCS PCM Rating (June 10 - Till Date)

- Led the team in creating an Enterprise Data Lake on the Azure Cloud from various AWS data sources.
- Led a team of data engineers and data scientists developing ML models that serve real-time predictions via API.
- Created API Gateway, Step Functions, Lambda, and DynamoDB resources using Terraform.
- With hands-on experience in PySpark, Pytest, and Pymongo, handled large-scale data processing, implemented unit testing for robust code, and integrated MongoDB with Python applications.
- Integrated the CI/CD pipeline using GitLab.
- Used Pandas for data manipulation and analysis: processed, cleaned, and transformed complex datasets into actionable insights, handling large-scale data sets efficiently and integrating with other data analysis libraries.
- Wrote Lambda function code for various views.
- Wrote unit-testing code for various components.
- Contributed to API development, designing and implementing robust RESTful APIs for seamless data exchange and integration between applications; built scalable and secure APIs adhering to industry best practices for performance and data protection; architected API solutions enabling cross-platform compatibility and efficient communication between diverse systems.
- Responsible for code review.
- Leveraged SonarQube's reporting capabilities to generate detailed code quality metrics and dashboards, giving stakeholders actionable insights to prioritize technical debt.
- Leveraged PySpark's machine learning library (MLlib) to develop and deploy scalable machine learning models for predictive analytics and recommendation systems.
- Participated in weekly scrum meetings and daily stand-up meetings.
- Created Glue jobs using Terraform and built the job code using PySpark.
- Designed single-page applications using HTML, CSS, and JavaScript.
- Created various POCs.
- Managed the offshore team.
- Collaborated with business stakeholders to develop and document Data Governance policies and procedures; developed best practices, standards, and methodologies to assist in the implementation and execution of Data Governance.
- Managed workflow processes in Collibra via Activiti.

Operating Systems: Windows
Software / Special Tools: PyCharm, Azure Data Factory, Azure Synapse Analytics, Azure Stream Analytics, Azure Event Hubs, Azure Data Lake Storage Gen2, Azure Databricks, SQL Server on Azure, NoSQL databases, data pipelines, ETL, GitLab, machine learning, AWS, Python, Docker, Postman, Agile Scrum, Jupyter Notebook, behave module, Git, AWS API Gateway, AWS Lambda, AWS Athena and other AWS services, CI/CD pipelines, Terraform, Glue, Collibra, Informatica, PySpark
Languages: Python 3.x
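Below is a small PySpark sketch of the count-and-content validation idea described in the project overview above: compare row counts and row contents between a source and its downstream copy. The data and names are hypothetical; this illustrates the concept, not the project's actual implementation.

```python
# Illustrative count-and-content check between two pipeline stages.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("e2e-validation-sketch").getOrCreate()


def validate(source_df, target_df, key_cols):
    """Return (counts_match, rows present upstream but missing downstream)."""
    counts_match = source_df.count() == target_df.count()
    missing = source_df.select(key_cols).exceptAll(target_df.select(key_cols))
    return counts_match, missing


# Hypothetical agreement records at two stages of the pipeline.
source = spark.createDataFrame([("AGR-1", 2), ("AGR-2", 5)], ["agreement_id", "n"])
target = spark.createDataFrame([("AGR-1", 2)], ["agreement_id", "n"])

ok, diff = validate(source, target, ["agreement_id", "n"])
print("counts match:", ok)  # False in this toy example
diff.show()                 # the row that failed to arrive downstream
```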
Organization Name: Capgemini America Inc
Client Name: Discover Financial Services
Project Name: FALCON Fraud
Project Duration: April 2018 - June 1, 2021
Role/Title: Data Engineer
Work Location: Houston, Texas

Project Description: Global Payments Fraud Platform Migration - Pulse. Applications currently running on ACI PRISM (the ACI fraud product) are being upgraded to the in-house Falcon 6.4.1.4 UNIX version. A new interface has been built to manage the routing of transactions from Switch to Falcon 6.1.1.4. This platform is used for debit acquiring transactions and also acts as a decision maker for financial institutions enrolled in the blocking service.

Responsibilities:
Project 1: Project FALCON (July 23 - Till Date)
Project 2: Score Engine (Oct 11 - Till Date)

- Gathered requirements from analysts regarding the rules; handled development and testing, and took part in deployment and validation.
- Developed new features in the Rule Builder application for creating Falcon fraud rules based on fraud activity, and automated fraud-rule peer review, using Python and Tkinter (a brief Tkinter sketch follows this role).
- Created business logic in Python for planning and tracking functions.
- Performed data extraction and manipulation over large local datasets using SQL in Teradata.
- Version control using Git.
- Used Psycopg, embedPy, NumPy, Beautiful Soup, and Matplotlib to work with PostgreSQL databases, integrate Python with C/C++ code, perform advanced numerical computations and web scraping, and create compelling data visualizations.
- Created unit test files to enable testing of Python code.
- Leveraged Azure's scalability and cost-effectiveness to deploy and manage big data solutions using Azure HDInsight and Azure Synapse Analytics, enabling efficient processing and analysis of large datasets.
- Wrote daily rules, based on requirements from the banks, on FALCON Fraud Production Expert and Bench Expert, readying them for noon and evening deployments.
- Incorporated Pandas into data science pipelines, collaborating with machine learning teams to preprocess data for model training and evaluation, resulting in effective predictive models.
- Peer-reviewed rules with FRAUD-OPS analysts in the FALCON Rule Audit application and uploaded the scheduler to start deployment after confirmation.
- Participated in API testing and documentation, validating functionality and providing comprehensive API documentation for enhanced usability and developer support.
- Stayed current with advancements in API technologies and industry trends, incorporating cutting-edge practices into API development to drive innovation and efficiency.
- Shared SonarQube knowledge and conducted training sessions, fostering a strong code-quality-oriented mindset across the organization.
- Validated all rules after deployment in the Rule Audit application for both Production Expert and Bench Expert.
- Interfaced with analysts to ensure production deadlines were met.
- Involved in data cleaning and extraction using Pandas and NumPy.

Operating Systems: Windows
Software / Special Tools: Visual Studio Enterprise, Teradata, SQL, GitHub, IDLE Python, UltraEdit, FALCON Fraud Manager Expert, FALCON Rule Audit tool, FALCON Case Manager, Agile Scrum, Robotic Process Automation, Java, Tkinter, Jupyter Notebook, Power BI, PySpark, AWS SageMaker
Languages: Python 3.x
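A toy Tkinter sketch in the spirit of the Rule Builder work described above: a small form that assembles a rule expression from user input. The fields, operators, and rule grammar shown are invented for illustration and are not actual FALCON syntax.

```python
# Minimal Tkinter form that builds a simple condition string from its inputs.
import tkinter as tk
from tkinter import ttk


def build_rule():
    # Assemble an illustrative condition string from the form inputs.
    output_var.set(f"IF {field_var.get()} {op_var.get()} "
                   f"{value_entry.get()} THEN FLAG_TRANSACTION")


root = tk.Tk()
root.title("Rule Builder (sketch)")

field_var = tk.StringVar(value="txn_amount")
op_var = tk.StringVar(value=">=")
output_var = tk.StringVar()

ttk.Combobox(root, textvariable=field_var,
             values=["txn_amount", "merchant_category", "country_code"]).pack()
ttk.Combobox(root, textvariable=op_var, values=[">=", "<=", "=="]).pack()
value_entry = ttk.Entry(root)
value_entry.pack()
ttk.Button(root, text="Build rule", command=build_rule).pack()
ttk.Label(root, textvariable=output_var).pack()

root.mainloop()
```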
Organization Name: SmartWorks, LLC
Client Name: Computer Aid Inc.
Project Name: 1. United States Animal Laboratory Management Software (USALIMS); 2. United States Animal Health Emergency & Reporting Diagnostics (USAHEARD)
Project Duration: November 2015 - April 2018
Role/Title: Software Engineer, ML
Work Location: Harrisburg, PA

Project Description: United States Animal Laboratory Management Software is a smart-client application designed to provide comprehensive support for all administrative workflows in animal laboratories. It manages all data related to specimens, accessions, test results, etc., and provides automated client report generation and distribution along with advanced search options. United States Animal Health Emergency Reporting Diagnostics is an animal health data system: it provides the data elements states need to manage and monitor animal health within their jurisdiction, report test results to USDA's Veterinary Services, and prevent, control, eradicate, and recover from disease and all-hazard incidents at the local level.

Responsibilities:
- Consulted and interacted with business users and business analysts to ensure proper requirements were defined and details identified for application needs.
- Designed new features and bug fixes in the existing front end of the application using HTML templates and CSS.
- Developed new features and maintained existing applications in a multi-tier web application on the Microsoft .NET framework using ASP.NET MVC and VB.NET.
- Utilized Pandas to generate informative summary statistics and data visualizations, empowering stakeholders to make data-driven decisions and identify patterns and trends (a brief Pandas sketch appears after the final role below).
- Collaborated with data scientists and analysts, translating their requirements into PySpark code for seamless integration of data science models and analytics into production workflows.
- Maintained source code using Microsoft Team Foundation Server (TFS).
- Collaborated with cross-functional teams, including front-end developers and system architects, to define API requirements, ensuring smooth integration and streamlined data flow in Azure.
- Prepared test case documentation.
- Worked closely with the Quality Assurance team to fix issues during development.
- Provided training and support for users to ensure they can successfully use our tools and have the knowledge to self-serve.

Operating Systems: Windows
Software / Special Tools: Agile Scrum, .NET Framework 4.5, ASP.NET, ADO.NET, Bootstrap 3.0, CSS 3.0, HTML 5.0, JavaScript, Visual Studio 2013/2015, TFS, SQL Server 2012, IIS
Languages: VB.NET, C#

Organization Name: Megatech
Client Name: Synchrony Financial
Project Name: High Yield Savings Account
Project Duration: June 2013 - April 2015
Role/Title: Software Engineer
Work Location: Hyderabad, India

Responsibilities:
- Gathered client requirements, converted them into technical specifications, and developed web forms using C#.NET.
- Developed application logic using C#, and used HTML and SVG to generate layouts.
- Implemented the MVC framework to reduce complexity and gain full control over the behavior of the application.
- Rewrote a legacy corporate banking module from classic ASP and XML/XSLT to ASP.NET, HTML, CSS, JavaScript, ORM (object-relational mapping), WCF, and AngularJS, as part of a major update.
- Used the Microsoft Entity Framework Code First approach for the data access layer of the ASP.NET MVC 3 application.
- Designed the GUI using HTML, JavaScript, CSS, JSON, and custom controls.
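A brief Pandas sketch of the summary-statistics and visualization work noted in the SmartWorks role above. The dataset and column names are invented for illustration only.

```python
# Minimal sketch: summary statistics and a simple chart over hypothetical lab data.
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical lab-results extract.
df = pd.DataFrame({
    "specimen_type": ["blood", "tissue", "blood", "swab"],
    "turnaround_days": [2, 5, 3, 1],
    "result": ["negative", "positive", "negative", "negative"],
})

# Summary statistics for stakeholders.
print(df["turnaround_days"].describe())
print(df.groupby("specimen_type")["turnaround_days"].mean())

# Simple visualization of result counts.
df["result"].value_counts().plot(kind="bar", title="Results by outcome")
plt.tight_layout()
plt.show()
```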