Certified Databricks Data Engineer || USC AND GC ONLY || Remote at Remote, Remote, USA
Email: [email protected]
From:

GUNJAN,

PINAKA STAFFING

[email protected]

Reply to:   [email protected]

Client: Adobe

Location:  Remote (Pacific Time)

Duration: Four-month contract

C2C

Requirements: Background Check (Candidates must have a LinkedIn profile with a picture or be able to share a photo ID with their submittal)

Title:  Certified Databricks Data Engineer

Key Skills: Databricks Certification, Spark, performance tuning, data ingestion, large scale data warehouse (20TB or above).  

Job Description:

We are seeking an exceptional Senior Data Engineer specializing in Databricks and Apache Spark with a strong focus on performance optimization and data architecture. The ideal candidate will possess deep technical expertise in tuning and optimizing large-scale data processing systems, with a robust understanding of data architecture principles that enhance performance. This role is critical to our data infrastructure, ensuring our pipelines are efficient, scalable, and reliable.

Key Responsibilities:

               Performance Optimization and Tuning:

               Lead efforts in optimizing Spark job performance, including tuning memory management, resource allocation, and execution plans to achieve the highest efficiency and throughput.

               Analyze performance metrics and logs to identify areas for improvement and implement best practices for Spark optimization.

               Develop and implement advanced partitioning, caching strategies, and data layout optimizations to minimize processing time and cost.

               Advanced Pipeline Development and Data Architecture:

               Architect, develop, and optimize complex data pipelines and ETL workflows using Databricks and Spark, ensuring they are highly performant, scalable, and reliable.

               Design and implement data architectures that support high performance, including choosing appropriate data storage formats, indexing strategies, and data partitioning schemes.

               Collaborate on the design and implementation of data models that support efficient querying and data retrieval.

               Technical Leadership in Performance Engineering:

               Serve as the go-to expert on Spark performance optimization, guiding the team in adopting advanced techniques and tools for performance improvement.

               Provide mentorship and training on performance tuning, data architecture, and best practices in Spark and Databricks to the engineering team.

               Performance Monitoring and Troubleshooting:

               Set up and maintain comprehensive monitoring and alerting systems for Spark applications to proactively detect performance issues.

               Diagnose and resolve complex performance problems, implementing preventive measures to avoid recurrence.

               Collaboration and Stakeholder Engagement:

               Collaborate closely with data scientists, data analysts, and other engineering teams to understand performance requirements and ensure optimal data processing capabilities.

               Communicate complex performance findings and recommendations effectively to technical and non-technical stakeholders.

Required Skills and Experience:

Expert-Level Knowledge in Apache Spark Performance:

               Extensive experience with Spark internals, including in-depth understanding of the Catalyst optimizer, Tungsten execution engine, and Spark SQL query execution.

               Proven track record of optimizing Spark job performance in production environments, particularly in reducing latency and improving resource efficiency.

Databricks Expertise:

               Deep familiarity with Databricks architecture, cluster management, and advanced configuration settings that affect performance.

               Experience optimizing Databricks notebooks and workflows, leveraging the platform's features for performance gains.

Data Architecture Skills:

               Strong expertise in data architecture, including designing data models, data flows, and data storage solutions that optimize performance.

               Knowledge of best practices in data management, including data partitioning, indexing, and the use of data lakes or data warehouses.

Strong Programming and Optimization Skills:

               Proficient in Python and/or Scala for Spark development, with a strong emphasis on writing efficient, high-performance code.

               Expertise in SQL query optimization and tuning.

Comprehensive Knowledge of Big Data Ecosystem:

               Familiarity with data storage formats (e.g., Parquet, ORC), distributed computing principles, and cloud infrastructure services (AWS, Azure, GCP).

Innovative Problem-Solving:

               Ability to creatively address and solve performance issues, leveraging deep technical knowledge and innovative approaches.

Preferred Qualifications:

Certifications and Professional Recognition:

               Databricks Certified Professional Data Engineer or equivalent certifications focused on performance and optimization.

               Contributions to the community through blogs, talks, or publications on Spark performance and data architecture topics.

Experience with Performance Benchmarks and Load Testing:

               Experience designing and conducting performance benchmarks and load tests to validate system performance under various conditions.

Keywords: golang Idaho
Fri Oct 04 04:03:00 UTC 2024
