Principal Data Engineer (GCP) - Remote, USA
Email: [email protected]
From: Ashwani Raghuvanshi, Tek Inspirations ([email protected])
Reply to: [email protected]

Job Description
Client: Nerdery
Title: Principal Data Engineer (GCP)
Location: Remote
Contract-to-Hire Position

Important Note: For the Principal Data Engineer role we are seeking a candidate with at least 15 years of overall technical experience and at least 2-3 years of experience leading large GCP projects. I know "Principal" means different things at different companies, and I hope this context is helpful to you and your team as you search for candidates. We would love to find a contractor for this opening who is open to converting to full-time. We will move these candidates to the first interview immediately, with no recruiter screen. The interview process for both roles is a 3-step process, and we will move mountains to get the interviews scheduled and completed this week if you have candidates.

Responsibilities
- Oversee the entire technical lifecycle of a cloud data platform, including, but not limited to, framework decisions, breaking down features into technical stories, writing technical requirements, and production readiness.
- Design and implement a robust, secure data platform in GCP using industry best practices, native security tools, and integrated data governance controls.
- Translate a defined data governance strategy into technical requirements, implementing controls, documenting processes, and fostering a data-driven culture within the organization.
- Apply advanced SQL knowledge and experience to relational databases and BigQuery, including query authoring and working familiarity with a variety of databases.
- Build analytics tools that utilize the data pipeline to provide actionable insights into customer acquisition, operational efficiency, and other critical business performance metrics.
- Design and implement scalable and reliable data pipelines on GCP.
- Implement Change Data Capture (CDC) techniques and manage Delta Live Tables for real-time data integration and analytics, ensuring data consistency and enabling incremental data updates in cloud-based data platforms.
- Design, configure, and manage data lakes in GCP, utilizing services such as Google Cloud Storage, BigQuery, and Dataproc, to support diverse data types and formats for scalable storage, processing, and analytics.
- Design API architectures, including RESTful services and microservices, and integrate machine learning models into production systems to enhance data-driven applications and services.
- Build the infrastructure required for extraction, transformation, and loading (ETL) of data from a wide variety of sources, using IaC, SQL, and GCP.
- Migrate and create data pipelines and infrastructure from AWS or Azure to GCP.
- Write and maintain robust, efficient, scalable Python scripts for data processing and automation.
- Apply a strong understanding of data pipeline design patterns and determine the best fit for each use case.
- Work with unstructured datasets.
- Build processes supporting data transformation, data structures, metadata, dependency, and workload management.
- Manipulate, process, and extract value from large, disconnected datasets.
- Work with stakeholders, including the Executive, Product, Data, and Design teams, to assist with data-related technical issues and support their data infrastructure needs.
- Assume responsibility for the stability and security of data in transit and at rest.
- Collaborate directly with the client to identify and implement data security and compliance requirements.
- Keep client data secure using best practices.
- Build internal processes, frameworks, and best practices for our data engineering domain.
- Foster cross-functional collaboration as a technical liaison between engineering and other project disciplines (Design, Quality, Project Management, Strategy, Product, etc.).
- Support the growth of other data engineers through mentorship.
- Own the technical review process for team members and provide ongoing technical feedback and recommendations.
- Participate in the internal leadership of the domain: provide input to its strategic direction, assist with domain initiatives, and maintain best practices within the domain.
- Assess the technical skills of prospective candidates and provide recommendations to hiring managers.
- Assist with sales requests as needed by providing technical recommendations and estimates to prospective clients.

Skills & Qualifications
- Bachelor's degree in Computer Science or a related field, or equivalent experience, required.
- 8+ years of relevant experience.
- In-depth knowledge of Google Cloud Platform (GCP) data services such as BigQuery, Dataflow, Dataproc, and Pub/Sub, with proven experience designing and implementing data pipelines, data storage, and analytics solutions in GCP.
- Experience designing and implementing data governance and compliance policies at scale.
- Ability to take technical requirements and produce functional code.
- Experience with Git and the specified technologies.
- Proficiency in Python and SQL.
- Experience migrating data pipelines and infrastructure to GCP from multiple infrastructure stacks.
- Deep understanding of data modeling, ETL processes, and data warehousing principles.
- Familiarity with data pipeline orchestration tools and practices, such as Pub/Sub, streaming, and Cloud Functions.
- Excellent problem-solving and analytical skills.
- Ability to communicate with technical and non-technical client stakeholders.
- Proactive collaborator who works with colleagues to improve their technical aptitude.
- A successful history of manipulating, processing, and extracting value from large, disconnected datasets.
- Experience building and optimizing big data pipelines, architectures, and data sets.
- Experience using and/or creating APIs.
- Experience with any of the following additional database management systems: MS SQL Server, MongoDB, PostgreSQL, NoSQL (e.g., Cassandra), Sybase, IBM Db2, or Oracle Database.
- Experience with big data tools (Hadoop, Spark, Kafka, etc.) is a plus.
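For context on the kind of BigQuery, SQL, and Python work described in the Responsibilities above, here is a minimal illustrative sketch of a simple GCP load-and-query step. It is not part of the client's codebase; the project, dataset, bucket, table, and column names are hypothetical placeholders, and it assumes the google-cloud-bigquery client library with default application credentials.

```python
# Minimal illustrative sketch only; all names below are hypothetical placeholders.
from google.cloud import bigquery  # pip install google-cloud-bigquery

PROJECT_ID = "example-project"                      # hypothetical project
TABLE = f"{PROJECT_ID}.analytics.orders_raw"        # hypothetical dataset/table
SOURCE_URI = "gs://example-bucket/orders/*.csv"     # hypothetical GCS path


def load_and_summarize() -> None:
    client = bigquery.Client(project=PROJECT_ID)

    # Extract/load: ingest CSV files from Cloud Storage into a BigQuery table.
    load_config = bigquery.LoadJobConfig(
        source_format=bigquery.SourceFormat.CSV,
        skip_leading_rows=1,
        autodetect=True,
        write_disposition=bigquery.WriteDisposition.WRITE_TRUNCATE,
    )
    client.load_table_from_uri(SOURCE_URI, TABLE, job_config=load_config).result()

    # Transform/analyze: a simple SQL aggregation over the loaded table
    # (customer_id and amount are assumed, illustrative column names).
    query = f"""
        SELECT customer_id, COUNT(*) AS order_count, SUM(amount) AS total_spend
        FROM `{TABLE}`
        GROUP BY customer_id
        ORDER BY total_spend DESC
        LIMIT 10
    """
    for row in client.query(query).result():
        print(row.customer_id, row.order_count, row.total_spend)


if __name__ == "__main__":
    load_and_summarize()
```

In practice, the role would wrap steps like this in orchestrated, IaC-provisioned pipelines (e.g., Dataflow or Cloud Functions triggered by Pub/Sub) rather than standalone scripts; the sketch only illustrates the basic load-then-query pattern.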
Posted: Fri Jul 12 03:04:00 UTC 2024