Databricks Data Engineer with PySpark (Azure or AWS) | Atlanta, GA - Day 1 Onsite - Medtronic
Email: [email protected]
From: Gangadar Reddy, Centraprise ([email protected])
Reply to: [email protected]
Role: Data Engineer with PySpark
Location: Atlanta, GA
Hire type: Contract

Job description:

Team, please follow the points below before submitting:
- Submit resumes with 11+ years of experience.
- Ex-CTS, ex-Medtronic, or any consulting firm on recent projects.
- Candidates who are local.
- Experience with medical device companies is very highly desired.

Databricks Data Engineer with strong experience in PySpark (Azure or AWS). As a Senior Data Engineer, you will be responsible for designing, developing, and maintaining data pipelines using PySpark. You will work closely with data scientists, analysts, and other stakeholders to ensure efficient processing and analysis of large datasets, while handling complex transformations and aggregations.

**Required Skills and Experience:**
- Bachelor's degree in Computer Science, Engineering, or a related field.
- 10 years of experience in data engineering, with a focus on PySpark and Neo4j, Neptune, or another graph database.
- Strong understanding of data modeling, database architecture, and schema design.
- Proficiency in Python and Spark, with strong coding and debugging skills.
- Experience with big data technologies such as Hadoop, Hive, and Kafka.
- Hands-on experience with cloud platforms such as AWS, Azure, or Google Cloud Platform (GCP).
- Strong knowledge of SQL and experience with relational databases (e.g., PostgreSQL, MySQL, SQL Server).
- Experience with data warehousing solutions such as Redshift, Snowflake, Databricks, or Google BigQuery.
- Familiarity with data lake architectures and data storage solutions.
- Knowledge of CI/CD pipelines and version control systems (e.g., Git).
- Excellent problem-solving skills and the ability to troubleshoot complex issues.
- Strong communication and collaboration skills, with the ability to work effectively in a team environment.

**Preferred Skills:**
- Knowledge of machine learning workflows and experience working with data scientists.
- Understanding of data security and governance best practices.
- Experience with containerization technologies such as Docker and Kubernetes.
- Experience with orchestration tools like Apache Airflow or AWS Step Functions.
- Familiarity with streaming data platforms and real-time data processing.

**Key Responsibilities:**
- Design, develop, and maintain scalable and efficient ETL pipelines using PySpark.
- Collaborate across functional areas to translate business processes and problems into optimal data modeling and analytical solutions that drive business value.
- Design data models in collaboration with multiple business teams.
- Manage the data collection process, providing interpretation and recommendations to management.
- Build and optimize graph database solutions to support data-driven decision making and advanced analytics, and integrate them into data pipelines.
- Optimize and tune PySpark applications for performance and scalability.
- Collaborate with data scientists and analysts to understand data requirements, review business requirement documents, and deliver high-quality datasets.
- Implement data quality checks and ensure data integrity.
- Monitor and troubleshoot data pipeline issues and ensure timely resolution.
- Stay up to date with the latest trends and technologies in big data and distributed computing.

Regards,
Praveen Samji | Delivery Manager - TA
Centraprise Corp
33 Wood Avenue South, Suite 600, Iselin, NJ 08830
Mobile: 571-267-5947 | Direct: 469-289-5767
Email: [email protected]
Fri Nov 08 18:59:00 UTC 2024