GCP Data Engineer || Onsite - Irving, TX at Irving, Texas, USA
Email: [email protected]
From: Naveen, Smart IT Frame ([email protected])
Reply to: [email protected]

Hi,

Greetings from Smart IT Frame, hope you are doing well!

Smart IT Frame specializes in enabling you with your most critical line of resources. Whether it's permanent staffing, contract staffing, contract-to-hire, or executive search, we understand the importance of delivering the most suitable talent, on time and within budget. With our core focus on emerging technologies, we have provided global technology workforce solutions in North America, Canada & India. We take pride in delivering specialized talent, superior performance, and seamless execution to meet the challenging business needs of customers worldwide.

Role: GCP Data Engineer
Location: Onsite - Irving, TX
Type: Contract
Shift: Day, 9 AM to 6 PM EST

1. GCP, BigQuery, Cloud Dataproc, Airflow, Composer
2. PySpark, Python, Java, Hive, SQL
3. Hadoop architecture, HDFS experience

Required Skills: BigQuery, Cloud Dataproc, Airflow
Nice to have skills: PySpark

Roles & Responsibilities:
- Analyzes complex data structures from disparate data sources and designs large-scale data engineering pipelines
- Develops large-scale data structures and pipelines to organize, collect, and standardize data that helps generate insights and addresses reporting needs
- Collaborates with product, business, and data science teams to collect user stories and translate them into technical specifications
- Uses knowledge of cloud and Hadoop architecture, HDFS commands, and experience designing and optimizing queries to build data pipelines
- Uses strong programming skills in PySpark, Python, Java, or any of the major languages to build robust data pipelines and dynamic systems (see the sketch after this posting)
- Builds highly scalable and extensible data marts and data models to support Data Science and other internal customers on Cloud
- Integrates data from a variety of sources, ensuring it adheres to data quality and accessibility standards
- Analyzes current information technology environments to identify and assess critical capabilities and recommend solutions
- Experiments with available tools and advises on new tools to determine the optimal solution given the requirements dictated by the model/use case

Required Qualifications:
- 3+ years of progressively complex related experience in cloud data engineering and data analysis
- Knowledge of programming languages such as PySpark, Java, Python, Hive, SQL
- Knowledge of cloud technology, Hadoop architecture, HDFS commands, and experience designing and optimizing queries against data in the HDFS environment
- Strong knowledge of large-scale search applications and building high-volume data pipelines, preferably using Dataproc and Composer services on GCP or other cloud platforms
- 2+ years of development experience in GCP and native tools and services such as Dataproc, Composer, BigQuery, Airflow
- 2 to 3+ years of PySpark, Scala, or Python programming; proficient in SQL
- 3+ years of working with data and data warehouse experience

Keywords: information technology Texas
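A minimal sketch of the kind of Dataproc PySpark job this role describes: reading a BigQuery table, standardizing it, and writing a small data mart back to BigQuery. The project, dataset, table names, and staging bucket are hypothetical placeholders, not details from the posting.

```python
# Illustrative PySpark job for Dataproc: read from BigQuery, transform, write back.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = (
    SparkSession.builder
    .appName("example-bigquery-pipeline")
    .getOrCreate()
)

# Read a source table via the spark-bigquery connector (bundled on Dataproc).
events = (
    spark.read.format("bigquery")
    .option("table", "my_project.raw_dataset.events")  # hypothetical table
    .load()
)

# Standardize and aggregate: daily event counts per user.
daily_counts = (
    events
    .withColumn("event_date", F.to_date("event_timestamp"))
    .groupBy("user_id", "event_date")
    .agg(F.count("*").alias("event_count"))
)

# Write the resulting data mart back to BigQuery; the connector stages via GCS.
(
    daily_counts.write.format("bigquery")
    .option("table", "my_project.analytics.daily_event_counts")  # hypothetical table
    .option("temporaryGcsBucket", "my-staging-bucket")           # hypothetical bucket
    .mode("overwrite")
    .save()
)
```

In practice a job like this would typically be submitted to a Dataproc cluster and scheduled through Composer/Airflow, matching the GCP stack listed above.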
Posted: Tue Sep 05 19:02:00 UTC 2023