Srividya - Sr Cloud Devops Engineer |
reddysrividya392@gmail.com |
Location: San Antonio, Texas, USA |
Relocation: Yes |
Visa: GC |
Sri Vidya
Sr. Cloud DevOps Engineer
Email: reddysrividya392@gmail.com
Phone: (507) 301-6863
LinkedIn: https://www.linkedin.com/in/vidya-reddy-b0820b348/
PROFESSIONAL SUMMARY
10+ years of IT industry experience in DevOps and Build and Release Engineering, with expertise in AWS and a thorough understanding of software development. 6+ years of experience as a DevOps Engineer and 4 years as a Build and Release Engineer. Hands-on experience with Amazon Web Services, including CloudFormation, Elastic Load Balancer, Elastic Beanstalk, CloudWatch, IAM, Server Migration, Route 53, SQS, VPC, S3, DynamoDB, SNS, Glacier, RDS, EC2 Container Service, and Lambda. Hands-on experience with Azure compute services, Azure Web Apps, Azure Storage, Azure Networking, and Azure Identity & Access Management. Experience with cloud service models, including Infrastructure as a Service (IaaS), Platform as a Service (PaaS), and Software as a Service (SaaS). Experience creating S3 buckets in AWS and writing custom access management policies for clients using AWS IAM (Identity and Access Management). Experienced in setting up databases in AWS using RDS, configuring storage with S3 buckets, and backing up instances to S3 by creating snapshots. Strong knowledge of Continuous Integration (CI) and Continuous Deployment (CD) methodologies with Jenkins. Worked extensively on building Jenkins jobs for continuous integration and end-to-end automation of all builds and deployments. Experience writing Chef recipes, adding them to Chef cookbooks and templates, and creating run-lists to automate configuration management with the Knife tool. Hands-on experience with build tools such as Maven and Ant. Experienced in authoring pom.xml files, performing releases with the Maven release plug-in, and managing Maven repositories.
Administered and supported Azure Kubernetes infrastructure, ensuring it is secure, resilient, and performant; responsible for complete DevOps activities and coordination with the development team. Implemented monitoring with Grafana visualization for infrastructure in an Azure Kubernetes cluster. Automated application deployment in the cloud with Docker, using the Elastic Container Service scheduler. Hands-on experience building deployment/build scripts and automating solutions using scripting languages such as Shell (KSH, Bash), Python, and Ruby. Hands-on experience working with DynamoDB, MySQL, MongoDB, and Cosmos DB. Installed, configured, and managed monitoring tools such as Nagios and AWS CloudWatch for resource monitoring, network monitoring, and log trace monitoring. Configuration management using Ansible (writing ad-hoc commands and playbooks). Worked with Ansible playbooks for virtual and physical instance provisioning, configuration management, patching, and software deployment. Implemented Terraform modules for deploying various applications across multiple cloud providers and for managing infrastructure. Wrote Terraform scripts to launch AWS instances and used Ansible to manage web applications, configuration files, mount points, and packages. Experience with container-based deployments using Docker, working with Docker images, Docker Hub, and Docker registries, and with installing, configuring, and clustering Kubernetes. Deployed and managed bare metal Kubernetes clusters for enhanced control and performance, optimizing infrastructure use without relying on cloud providers. Configured custom load balancing and networking policies on bare metal Kubernetes clusters to efficiently handle traffic and ensure high availability. Designed and implemented IAM user policies to control and secure AWS services and resources. Knowledge of network protocols and file-transfer tools such as FTP, SFTP, SSH, HTTP, HTTPS, and Connect:Direct.
Experience working with version control tools such as GitHub, Bitbucket, Azure Repos, and GitLab. Hands-on experience with bug tracking tools such as JIRA. Experience with the release and deployment of large-scale Java/J2EE web applications. Hands-on experience in Linux system administration with RHEL, CentOS, and Amazon Linux. Involved in the functional usage and deployment of applications on Apache Tomcat and WebLogic Server. Experience in Splunk development, creating apps, dashboards, and data models. Configured and deployed application packages onto the Apache Tomcat server. Coordinated with software development and QA teams. Worked with engineers, QA, business, and other teams to ensure automated test efforts are tightly integrated with the build system, and actively addressed deployment and build issues.
TECHNICAL SKILLS
Cloud Environment: AWS (EC2, VPC, EBS, AMI, SNS, RDS, ELB, CloudWatch, CloudFormation, AWS Config, S3, CloudTrail, IAM, Elastic Beanstalk, Route 53, ECR, EKS, CodeCommit, CodeDeploy, Direct Connect, Lambda); Azure (VM, App Services, Key Vault, Function Apps, Blob Storage, Azure Active Directory (Azure AD), Service Bus, Azure Container Registry (ACR), Azure Kubernetes Service (AKS), Azure SQL, Azure Cosmos DB)
Version Control Tools: Git, SVN (Subversion), Perforce
Repositories: Nexus, JFrog
Virtualization: VMware, Vagrant, ESX, Microservices
Operating Systems: Windows 98/XP/NT/2000/2003, UNIX, Linux, RHEL 7, Solaris, Mac OS X
Databases: Oracle, MySQL, SQL Server, MongoDB, PL/SQL
Languages: Python, Ruby, Shell, JavaScript, Bash, Perl
Web/Application Servers: IBM WebSphere, Apache Web Server, Apache Tomcat, Sun ONE Web Server, IIS Web Server, WebLogic
Network Protocols: TCP/IP, FTP, SMTP, SOAP, HTTP/HTTPS, NDS, DHCP, NFS, Cisco Routers, LAN
Build/CI Tools: Ant, Maven, Jenkins, Azure Pipelines, GitHub Actions, TeamCity, GitLab CI/CD
Monitoring Tools: Nagios, Splunk, Prometheus, Grafana, Dynatrace, ELK
Containerization/Orchestration: Docker, Kubernetes, OpenShift
Configuration Tools: Chef, Ansible, Puppet
Tracking Tools: JIRA, Remedy
CERTIFICATIONS
Certified Microsoft Azure Administrator Associate. Certified Kubernetes Administrator. Certified AWS Developer.
EDUCATION
Bachelor's in ECE from Megha Engineering, JNTUH, May 2014
PROFESSIONAL EXPERIENCE
Client: GEICO, San Diego, CA
Apr 2023 to Present
Role: Sr DevOps Engineer
Roles and Responsibilities:
Handled code repositories in Azure Repos and supported application teams with day-to-day source code management activities, including branch maintenance, code merging, planning branch strategy, setting up branching policies, and controlling user access. Designed and configured CI/CD pipelines using Azure DevOps by writing pipeline YAML files to automate build, test, and deployment processes. Involved in setting up self-hosted agent pools and configuring build environments. Worked extensively on Azure services including Azure Web Apps, App Services, Azure Storage, Azure SQL Database, Virtual Machines, Azure Active Directory, and Azure Event Hub. Installed and configured MongoDB in testing and production environments, setting up replica sets and enabling SSL/TLS encryption for secure communication. Configured a Bastion server as a secure gateway for accessing resources within a private network from a public network in Azure. Worked with Terraform to automate VNET, NSG, AKS, ACR, VMs, and Storage Accounts, replacing manual infrastructure provisioning. Developed Terraform modules for Compute, Network, and Managed Clusters, enabling easy reuse across multiple environments.
Automated deployments using Ansible, including restarting servers, installing new packages, and stack monitoring. Integrated Ansible with Azure DevOps pipelines, automating configuration management for seamless infrastructure provisioning. Designed and configured Azure Virtual Networks (VNETs), Subnets, Azure Network Security Groups, DNS settings, Security Policies, and Routing. Utilized Azure Kubernetes Service (AKS) to deploy and manage containerized applications, creating AKS clusters using Terraform custom modules for Dev, Stage, and Prod environments. Worked with Kubernetes Istio Service Mesh for traffic management, security, and service monitoring across clusters. Created and deployed Helm charts for Kubernetes applications based on microservices architecture. Managed Kubernetes Persistent Volumes (PVs) to ensure efficient storage utilization and data persistence. Configured and installed Grafana and Prometheus using Helm charts for real-time monitoring and visualization of Kubernetes workloads. Implemented Azure Chaos Studio for resilience testing and controlled fault injection experiments in AKS, Virtual Machines, and App Services. Designed and executed chaos experiments including CPU stress, memory pressure, network latency, and service outages, identifying potential system weaknesses before production incidents. Automated chaos engineering tests using Azure DevOps Pipelines, integrating failure scenarios into the deployment lifecycle. Monitored and analyzed chaos experiment results using Azure Monitor, Application Insights, and Log Analytics, improving incident response strategies. Implemented self-healing mechanisms using Azure Functions and Logic Apps, automating rollback and remediation processes when failures are detected. Worked on serverless services, created and configured HTTP Triggers in Azure Functions, monitored applications using Application Insights, and performed load testing via Visual Studio Team Services (VSTS). 
Designed and implemented Python and Prometheus-based monitoring scripts, improving system health tracking. Automated backup and restore processes using shell scripting, ensuring data integrity and minimizing downtime in case of system failures. Configured Azure Front Door as a content delivery network (CDN) and load balancer, optimizing application performance across multiple regions. Designed and built a disaster recovery solution with Azure Recovery Services. Responsible for migrating legacy applications to AWS and Azure clouds, as well as migration to SaaS solutions. Designed IaaS and PaaS solutions for new clients migrating from on-site infrastructure to the cloud. Implemented a production-ready, load-balanced, highly available, fault-tolerant Kubernetes infrastructure. Managed local deployments in Kubernetes, creating a local cluster and deploying application containers. Developed a program that syncs Bitbucket issues and to-do list tasks so that each updates the other. Implemented DevOps practices for microservices using Kubernetes as the orchestrator.
Client: SiteKick Technologies, Minneapolis, MN
Feb 2021 to Mar 2023
Role: Site Reliability Engineer
Roles and Responsibilities:
Site Reliability Engineer with expertise in SRE, system administration, and operations control & batch scheduling. Proficient in BMC Control-M, scheduling and executing complex batch jobs across multi-platform environments, ensuring seamless data processing and system automation. Extensive experience with NDM, FTP, SFTP, PGP/FTP, and NFS, ensuring secure and efficient data flow across distributed systems. In-depth knowledge of Linux (Red Hat, CentOS, Ubuntu) and Windows, performing routine maintenance, troubleshooting, and performance tuning for high-availability systems. Led batch restoration activities, minimizing system downtime and service interruptions. Provided round-the-clock support, ensuring prompt issue resolution and minimizing service disruptions.
Expertise in Nagios, Splunk, and other monitoring tools to proactively detect anomalies, optimize infrastructure health, and lead troubleshooting initiatives. Hands-on experience with SQL and NoSQL databases, including performance tuning, query optimization, backups, and high availability solutions. Experienced in Pivotal Cloud Foundry (PCF) for deploying, managing, and scaling cloud-native applications. Managed small teams and coordinated activities across multiple departments for seamless system integrations and batch processing. Designed and implemented automated processes for system monitoring, patching, and job scheduling, reducing manual intervention by 40%. Led continuous improvement initiatives, creating documentation for best practices in system operations, batch processing, and incident management. Collaborated with internal teams, clients, and third-party vendors to resolve escalated issues and ensure timely service delivery. Implemented DR and business continuity plans, ensuring system availability and resilience through regular testing and risk mitigation. Conducted proactive performance analysis using Splunk and Nagios, reducing system latency by 30%. Forecasted system resource requirements, ensuring scalable and cost-effective cloud deployments. Led and mentored junior engineers, fostering a collaborative environment for knowledge sharing on automation, troubleshooting, and system reliability.
Client: Capital One, McLean, VA
Oct 2018 to Dec 2020
Role: DevOps Engineer
Roles and Responsibilities:
Designed and configured Azure Virtual Networks (VNets), subnets, Azure network settings, DHCP address blocks, DNS settings, and security policies, and configured BGP routes to enable ExpressRoute and site-to-site VPN connections between on-premises data centers and the Azure cloud. Led implementation of Azure Active Directory for single sign-on and authentication for web applications.
Also configured Azure Role-Based Access Control (RBAC) to segregate duties within the team and grant users only the access they need to perform their jobs, based on defined roles. Created and managed Azure Storage Accounts, creating blob, file, and table storage solutions to store logs and database backups. Configured the three blob types (block, page, and append blobs) in Azure to store large amounts of unstructured object data, such as text or binary data, accessible via HTTP or HTTPS, and enabled data redundancy, lifecycle rules, and events. Managed the private cloud environment using Ansible and enhanced the automation to support repeatable and consistent configuration management using Ansible-based YAML scripts. Set up custom domains and configured network security group (NSG) rules to specify ingress/egress traffic restrictions. Used Terraform to create services such as AKS, ACR, VNET, and VMs as infrastructure as code in various environments, as per project needs. Created inventories in Ansible for automating CD and developed Ansible playbooks and roles using YAML. Used the ELK stack to monitor logs for detailed analysis, built dashboards using Elasticsearch, Logstash, and Kibana (ELK), and set up real-time logging and analytics for CD pipelines and applications. Created Docker images for microservices and deployed them on Azure Kubernetes Service. Worked on Kubernetes cluster creation and the creation of Deployments, Services, RBAC, and Ingress. Used a Git branching strategy that included develop branches, feature branches, staging branches, and master; performed pull requests and code reviews. Created and configured Azure Logic Apps to streamline workflow automation and integration between different services. Configured Azure Private DNS Zones to resolve private domain names to private IP addresses within the Azure Virtual Network.
Configured ServiceNow to receive instant notifications of configuration changes in the cloud environment, orchestrated through Logic Apps. Migrated on-premises data (Oracle, SQL Server, MongoDB) to Azure SQL and Cosmos DB using Azure Data Factory. Experience in Azure infrastructure management (Azure Web Roles, Worker Roles, SQL Azure, Azure Storage, Azure AD Licenses) using Terraform; managed Azure infrastructure through Blueprints and Landing Zone. Coordinated with developers to establish and apply appropriate branching and labeling/naming conventions using Git source control, and analyzed and resolved source code merge conflicts in Git. Deployed multiple microservices into Azure Kubernetes by Dockerizing them and using Jenkins and Azure DevOps. Migrated the Build Forge projects to Azure DevOps with all the work items, source code, and build and release pipelines by using a custom PowerShell tool. Implemented backup and disaster recovery strategies using Azure Backup and Azure Site Recovery, ensuring data resilience and high availability of critical systems. Integrated Docker with Kubernetes as the container orchestration framework by creating pods and deployments. Automated infrastructure deployment using ARM templates and Terraform for provisioning cloud resources in Azure. Managed private cloud infrastructure with OpenStack for optimized VM provisioning and scaling. Provided on-call support for production environments, troubleshooting critical incidents through ServiceNow.
Client: Cigna, Austin, TX
Oct 2015 to Sep 2018
Role: Linux System Administrator
Roles and Responsibilities:
Created and maintained user accounts in Red Hat Enterprise Linux (RHEL) and other operating systems. Troubleshot and maintained TCP/IP, Apache HTTP/HTTPS, SMTP, and DNS applications. Configured NIS, DNS, NFS, Sendmail, LDAP, TCP/IP, FTP, remote access, and Apache services in Linux and UNIX environments.
Used both Git and Bitbucket source control systems to manage code. Coordinated with and assisted developers in establishing and applying appropriate branching and labeling/naming conventions using Git source control. Used JIRA to track issues and change management. Migrated different projects from Perforce to SVN. Performed NIC bonding on Linux systems for redundancy. Diagnosed and resolved problems associated with DNS, DHCP, VPN, NFS, and Apache. Created Bash/shell scripts to monitor system resources and perform system maintenance. Created and updated documentation for the current patching process. Coordinated with lines of business to schedule patching. Created change requests for patching in the production environment. Booted systems into different run levels for troubleshooting and system maintenance. Performed network troubleshooting using traceroute, netstat, ifconfig, etc. Configured Prometheus and Grafana for system monitoring and integrated logs with Graylog to enhance centralized logging capabilities. Managed deployments and networking configurations in OpenStack environments, optimizing performance and security.
Client: Goldman Sachs, Hyderabad, India
Dec 2014 to Oct 2015
Role: Software Engineer
Roles and Responsibilities:
Developed, tested, and deployed scalable applications, ensuring high performance and reliability. Designed and implemented RESTful APIs, improving system integration and communication. Automated deployment processes using CI/CD pipelines, reducing manual intervention and deployment time. Optimized application performance through code refactoring and database indexing, enhancing system efficiency. Collaborated with cross-functional teams to gather requirements and deliver software solutions aligned with business objectives. Wrote clean, maintainable, and efficient code following best practices and coding standards. Conducted code reviews and provided constructive feedback to enhance team productivity.
Implemented monitoring and logging solutions to improve system observability and troubleshoot issues efficiently. Designed and executed unit, integration, and end-to-end tests to ensure software quality and reliability. Followed Agile and DevOps methodologies to accelerate software delivery and improve collaboration. Conducted root cause analysis for system failures and implemented proactive solutions to prevent future incidents.