Databricks Engineer at Austin, Texas, USA
Email: [email protected]
From: ankit, kkassociates [email protected]
Reply to: [email protected]

Position: Databricks Engineer
Location: Austin, TX (Onsite)
Duration: Long term

Databricks Platform Expertise: Develop, maintain, and optimize ETL pipelines using Databricks and Apache Spark to process large volumes of structured and unstructured data.
Data Pipelines & Workflow Automation: Design and implement automated data workflows and ETL pipelines, ensuring reliable, scalable, and efficient data processing across multiple environments (development, staging, production).
Data Modeling & Transformation: Work with data scientists and analysts to design and implement data models and transformations that structure data for business insights and machine learning applications.
Cloud Integration: Use cloud platforms (AWS, Azure, or GCP) to build and optimize data pipelines, integrate Databricks with other cloud-native tools (e.g., S3, Redshift, Snowflake, ADLS, BigQuery), and ensure seamless data processing in the cloud.
Machine Learning Integration: Collaborate with data scientists to integrate Databricks with machine learning workflows, ensuring that data pipelines support model training, validation, and inference.
Performance Optimization: Monitor and optimize the performance of Databricks notebooks and Spark clusters, ensuring low-latency and cost-effective data processing.
Data Quality & Governance: Implement data validation, quality checks, and governance practices to ensure the accuracy, consistency, and integrity of data across the organization.
Collaboration: Work closely with cross-functional teams, including data engineers, analysts, and product teams, to understand business requirements and design data solutions that meet them.
Documentation & Best Practices: Document all processes, pipelines, data models, and architectures to ensure reproducibility, compliance, and knowledge sharing within the team.
Troubleshooting & Support: Provide support for data issues, troubleshoot pipeline failures, and optimize jobs for reliability and performance.
Mentorship: Provide guidance to junior engineers, helping them adopt best practices for working with Databricks, Apache Spark, and cloud data solutions.

Required Skills & Qualifications:
Education: Bachelor's degree in Computer Science, Engineering, Data Science, or a related field (Master's degree is a plus).
Experience: Minimum 3-5 years of experience in data engineering, with a focus on building and managing big data systems and pipelines. Proven experience with Databricks (especially Apache Spark) and its integration with cloud-based storage, processing, and analytics tools. Strong experience with cloud platforms (AWS, Azure, GCP) and their respective services (e.g., S3, Redshift, BigQuery, Azure Data Lake). Expertise in Python, Scala, or SQL, with a focus on writing efficient data transformation scripts. Solid understanding of distributed computing, cluster management, and performance optimization in big data environments.
Technical Skills: Proficiency in Apache Spark and hands-on experience using Spark in Databricks. Strong SQL skills for querying and managing large datasets in relational and NoSQL databases. Experience with data orchestration tools (e.g., Apache Airflow, Databricks Jobs, or similar). Familiarity with data lakes and data warehouses (e.g., AWS S3, Delta Lake, Snowflake, or Redshift). Experience with version control tools like Git and collaboration platforms (e.g., GitHub, GitLab). Knowledge of machine learning frameworks and integration with Databricks (e.g., MLflow).
Problem-Solving: Strong problem-solving skills with the ability to analyze and troubleshoot complex data pipeline issues. Ability to design and implement scalable data architectures that align with organizational goals.
Collaboration & Communication: Strong communication skills, both verbal and written, with the ability to explain technical concepts to non-technical stakeholders. Experience collaborating with data scientists, analysts, and business teams to translate business needs into technical solutions.

Regards,
Ankit | Technical Recruiter | KK Associates LLC
Contact: (469) 215-2333
Email Id: [email protected]
URL: www.kksoftwareassociates.com
https://www.linkedin.com/in/ankit-shukla-19152a231
Thu Nov 14 06:03:00 UTC 2024