GCP Data Engineer (Remote, USA)
Email: [email protected]
From: Rakesh, Blue Ocean Ventures [email protected]
Reply to: [email protected]

Hi,

Job Title: GCP Data Engineer
Location: Remote

Duties & Responsibilities:
- At least 4 years of data engineering experience
- Export data from the Hadoop ecosystem to ORC or Parquet files
- Build scripts to move data from on-prem to GCP
- Build Python/PySpark pipelines (a minimal sketch follows the skills lists below)
- Transform the data per the data model
- Improve pipeline performance and efficiency

Mandatory skills:
- Hands-on experience building Python/PySpark pipelines
- Strong Hadoop/Hive background
- Strong SQL skills
- Data warehousing and data lake experience
- Understanding of data models
- Google Cloud experience: Cloud Storage, Cloud Composer, Dataproc, and BigQuery
- Understanding of data file formats such as ORC, Parquet, and Avro

Preferred skills:
- Understanding of GCP services
- Experience with cloud warehouses such as BigQuery (preferred), Amazon Redshift, Snowflake, etc.
- Working knowledge of distributed file systems such as GCS, S3, and HDFS
- Understanding of Airflow / Cloud Composer
- CI/CD and DevOps
- ETL tools, e.g., Informatica (IICS), Ab Initio, Infoworks, Pentaho, SSIS
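Since the core duty here is exporting Hive data to Parquet on Cloud Storage with PySpark, the following is a minimal sketch of that task, assuming a Dataproc cluster with access to the Hive metastore; the database, table, column, and bucket names are hypothetical placeholders, not anything specified by this posting.

    # Minimal PySpark sketch: export a Hive table to Parquet on GCS.
    # All database, table, column, and bucket names are hypothetical.
    from pyspark.sql import SparkSession

    # On Dataproc, enableHiveSupport() picks up the cluster's Hive metastore.
    spark = (
        SparkSession.builder
        .appName("hive-to-gcs-parquet")
        .enableHiveSupport()
        .getOrCreate()
    )

    # Read from the Hadoop/Hive side.
    df = spark.table("warehouse_db.orders")

    # Light transform step, e.g. conforming column names to the target data model.
    df = df.withColumnRenamed("ord_ts", "order_timestamp")

    # Write Parquet to Cloud Storage; Dataproc ships the GCS connector,
    # so gs:// paths behave like HDFS paths. Partitioning by a date column
    # keeps downstream BigQuery loads and scans cheap.
    (
        df.write
        .mode("overwrite")
        .partitionBy("order_date")
        .parquet("gs://example-bucket/exports/orders/")
    )

In practice a job like this would be submitted through a Cloud Composer (Airflow) DAG and followed by a BigQuery load of the gs:// output, which is where the Composer and BigQuery skills listed above come in.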
Posted: Thu Dec 15 22:50:00 UTC 2022