AWS Data Engineer (Remote, USA)
Email: [email protected]
From: Ashi, Key Infotek [email protected]
Reply to: [email protected]
Location: Remote

About the Role:
We are seeking a skilled AWS Data Engineer to join our growing team. The ideal candidate will have hands-on experience with AWS cloud technologies, Python, PySpark, and Scala, and a strong background in data engineering and data pipeline development. You will be responsible for building scalable data solutions, writing automated tests, and collaborating with cross-functional teams to deliver high-quality data products.

Key Responsibilities:
- Design, develop, and optimize ETL/ELT pipelines on AWS using PySpark, Python, and Scala to process large-scale datasets.
- Work with AWS services such as S3, Lambda, Glue, EMR, and Athena to architect and deploy scalable data solutions.
- Implement data transformations, aggregations, and integrations between multiple data sources and systems.
- Write unit tests, integration tests, and end-to-end tests using the Pytest and Behave frameworks to ensure data pipeline reliability.
- Collaborate with data scientists and analysts to ensure the availability of high-quality, well-structured data for analysis and machine learning.
- Perform data modeling, performance optimization, and tuning of existing pipelines and queries.
- Monitor, troubleshoot, and optimize data pipeline performance using AWS-native monitoring tools.
- Develop and maintain documentation of data engineering processes, systems, and best practices.

Required Skills & Qualifications:
- 8+ years of experience as a Data Engineer or in a related role.
- Strong experience with AWS services such as S3, Lambda, Glue, EMR, Athena, and IAM.
- Proficiency in Python and PySpark for data engineering tasks.
- Experience with Scala for data processing in distributed environments.
- Hands-on experience writing automated tests using Pytest and Behave (or similar frameworks).
- Solid understanding of data management, ETL/ELT processes, and data pipeline orchestration.
- Strong SQL skills and experience with database design, querying, and performance tuning.
- Experience with version control systems (e.g., Git) and CI/CD pipelines.
- Excellent problem-solving and communication skills, with the ability to collaborate effectively in a team environment.

Nice to Have:
- Experience with containerization (Docker) and orchestration tools (Kubernetes).
- Experience with data warehousing solutions such as Redshift, Snowflake, or BigQuery.
- Knowledge of machine learning frameworks and MLOps practices.

Education:
Bachelor's degree in Computer Science, Data Science, Information Technology, or a related field (or equivalent experience).
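For candidates unfamiliar with the testing expectations above, a minimal sketch of the kind of Pytest-style unit test the role calls for is shown below. It is illustrative only and not part of the posting; the function name `total_bytes_by_user` and the sample records are hypothetical, and a real pipeline test would typically target PySpark transformations rather than plain Python.

```python
# Hypothetical example: a small aggregation step and a Pytest-style unit test
# for it. The names and data are illustrative, not from the posting.
from collections import defaultdict


def total_bytes_by_user(records):
    """Aggregate a list of {'user': ..., 'bytes': ...} dicts into per-user totals."""
    totals = defaultdict(int)
    for rec in records:
        totals[rec["user"]] += rec["bytes"]
    return dict(totals)


# Pytest collects functions named test_* and runs their plain assert statements.
def test_total_bytes_by_user():
    records = [
        {"user": "a", "bytes": 100},
        {"user": "b", "bytes": 50},
        {"user": "a", "bytes": 25},
    ]
    assert total_bytes_by_user(records) == {"a": 125, "b": 50}
```

Running `pytest` in a directory containing this file would discover and execute the test automatically.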
Wed Oct 09 19:44:00 UTC 2024