Lead Data Engineer at Portland, OR, USA
Email: [email protected] |
From: Ashwani, Gtech [email protected]
Reply to: [email protected]

Job Title: Lead Data Engineer
Project Location: Portland, OR
Duration: 12 Month+ Contract

Job Description:

Key skills: Strong SQL / data warehousing concepts / PySpark / AWS / Snowflake

Key Responsibilities:
- Design, implement, and support an analytical data infrastructure.
- Manage AWS resources including EC2, EMR, S3, Glue, Redshift, etc.
- Interface with other technology teams to extract, transform, and load data from a wide variety of data sources using SQL and AWS big data technologies.
- Explore and learn the latest AWS technologies to provide new capabilities and increase efficiency.
- Collaborate with Data Scientists and Business Intelligence Engineers (BIEs) to recognize and help adopt best practices in reporting and analysis.
- Help continually improve ongoing reporting and analysis processes, automating or simplifying self-service support for customers.
- Maintain internal reporting platforms/tools, including troubleshooting and development.
- Interact with internal users to establish and clarify requirements in order to develop report specifications.
- Work with Engineering partners to help shape and implement the development of BI infrastructure, including data warehousing, reporting, and analytics platforms.
- Contribute to the development of BI tools, skills, culture, and impact.
- Write advanced SQL queries and Python code to develop solutions.

Skill Sets:
- This position requires a Bachelor's Degree in Computer Science or a related technical field, and relevant employment experience.
- Work experience with ETL, data modeling, and data architecture.
- Expert-level skills in writing and optimizing SQL.
- Experience with big data technologies such as Hadoop/Hive/Spark.
- Solid Linux skills.
- Experience operating very large data warehouses or data lakes.
- Expertise in ETL optimization; designing, coding, and tuning big data processes using Apache Spark or similar technologies.
- Experience building data pipelines and applications to stream and process datasets at low latency.
- Efficiency in handling data: tracking data lineage, ensuring data quality, and improving discoverability of data.
- Sound knowledge of distributed systems and data architecture (e.g., lambda architecture): able to design and implement batch and stream data processing pipelines, and to optimize the distribution, partitioning, and MPP of high-level data structures.
- Knowledge of engineering and operational excellence using standard methodologies.

Skills (Critical / Mandatory / Desirable):
- Experience in cloud computing, e.g., AWS: Mandatory
- Experience in distributed technology tools, viz. SQL, Spark, Python, PySpark: Critical
- Experience writing code optimized for performance: Critical
- Airflow workflow scheduling tool for creating data pipelines: Mandatory
- GitHub source control tool and experience creating/configuring Jenkins pipelines: Mandatory
- Experience in EMR/EC2, Databricks, etc.: Mandatory
- DWH tools incl. SQL database, Presto, and Snowflake: Mandatory
Posted: Thu, Mar 14, 2024