Hiring for Certified Databricks Engineer :: Certification Is a Must (Remote, USA)
Email: [email protected]
Title: Certified Databricks Data Engineer

Location:  Remote (Pacific Time)

Duration: Four month contract

Requirements: Background check (candidates must have a LinkedIn profile with a picture or be able to share a photo ID with the submittal)

Databricks certification is a must

Key Skills: Databricks certification, Spark, performance tuning, data ingestion, large-scale data warehouse (20 TB or above).

Job Description:

We are seeking an exceptional Senior Data Engineer specializing in Databricks and Apache Spark with a strong focus on performance optimization and data architecture.
The ideal candidate will possess deep technical expertise in tuning and optimizing large-scale data processing systems, with a robust understanding of data architecture principles that enhance performance. This role is critical to our data infrastructure, ensuring our pipelines are efficient, scalable, and reliable.

Key Responsibilities:

Performance Optimization and Tuning:

               Lead efforts in optimizing Spark job performance, including tuning memory management, resource allocation, and execution plans to achieve the highest efficiency and throughput.

               Analyze performance metrics and logs to identify areas for improvement and implement best practices for Spark optimization.

               Develop and implement advanced partitioning, caching strategies, and data layout optimizations to minimize processing time and cost.

Advanced Pipeline Development and Data Architecture:

               Architect, develop, and optimize complex data pipelines and ETL workflows using Databricks and Spark, ensuring they are highly performant, scalable, and reliable.

               Design and implement data architectures that support high performance, including choosing appropriate data storage formats, indexing strategies, and data partitioning schemes.

               Collaborate on the design and implementation of data models that support efficient querying and data retrieval.

Technical Leadership in Performance Engineering:

               Serve as the go-to expert on Spark performance optimization, guiding the team in adopting advanced techniques and tools for performance improvement.

               Provide mentorship and training on performance tuning, data architecture, and best practices in Spark and Databricks to the engineering team.

Performance Monitoring and Troubleshooting:

               Set up and maintain comprehensive monitoring and alerting systems for Spark applications to proactively detect performance issues.

               Diagnose and resolve complex performance problems, implementing preventive measures to avoid recurrence.

Collaboration and Stakeholder Engagement:

               Collaborate closely with data scientists, data analysts, and other engineering teams to understand performance requirements and ensure optimal data processing capabilities.

               Communicate complex performance findings and recommendations effectively to technical and non-technical stakeholders.

Required Skills and Experience:

Expert-Level Knowledge in Apache Spark Performance:

               Extensive experience with Spark internals, including in-depth understanding of the Catalyst optimizer, Tungsten execution engine, and Spark SQL query execution.

               Proven track record of optimizing Spark job performance in production environments, particularly in reducing latency and improving resource efficiency.

Databricks Expertise:

               Deep familiarity with Databricks architecture, cluster management, and advanced configuration settings that affect performance.

               Experience optimizing Databricks notebooks and workflows, leveraging the platform's features for performance gains.

Data Architecture Skills:

               Strong expertise in data architecture, including designing data models, data flows, and data storage solutions that optimize performance.

               Knowledge of best practices in data management, including data partitioning, indexing, and the use of data lakes or data warehouses.

Strong Programming and Optimization Skills:

               Proficient in Python and/or Scala for Spark development, with a strong emphasis on writing efficient, high-performance code.

               Expertise in SQL query optimization and tuning.

Comprehensive Knowledge of Big Data Ecosystem:

               Familiarity with data storage formats (e.g., Parquet, ORC), distributed computing principles, and cloud infrastructure services (AWS, Azure, GCP).

Innovative Problem-Solving:

               Ability to creatively address and solve performance issues, leveraging deep technical knowledge and innovative approaches.

Preferred Qualifications:

Certifications and Professional Recognition:

               Databricks Certified Professional Data Engineer or equivalent certifications focused on performance and optimization.

               Contributions to the community through blogs, talks, or publications on Spark performance and data architecture topics.

Experience with Performance Benchmarks and Load Testing:

               Experience in designing and conducting performance benchmarks and load tests to validate system performance under various conditions.

Thanks & Regards

Manish Kumar

Sr. Technical Recruiter

Email: [email protected]

LinkedIn: Manish Chaudhary Jaat

ANVETA, Inc

Address: 1333 Corporate Drive, Suite #108 Irving, TX 75038, USA

--

Keywords: information technology golang Idaho Texas
Fri Oct 04 19:27:00 UTC 2024