Posted Apr 25, 2026

Data Engineer - Databricks Specialist

We are seeking an experienced Data Engineer with deep expertise in Databricks to design, build, and maintain scalable data pipelines and analytics solutions. This role requires 5+ years of hands-on experience in data engineering with a strong focus on the Databricks platform.

Key Responsibilities

Data Pipeline Development & Management:
- Design and implement robust, scalable ETL/ELT pipelines using Databricks and Apache Spark
- Process large volumes of structured and unstructured data
- Develop and maintain data workflows using Databricks Workflows, Apache Airflow, or similar orchestration tools
- Optimize data processing jobs for performance, cost efficiency, and reliability
- Implement incremental data processing patterns and change data capture (CDC) mechanisms

Databricks Platform Engineering:
- Build and maintain Delta Lake tables and implement the medallion architecture (bronze, silver, and gold layers)
- Develop streaming data pipelines using Structured Streaming and Delta Live Tables
- Manage and optimize Databricks clusters for various workloads
- Implement Unity Catalog for data governance, security, and metadata management
- Configure and maintain Databricks workspace environments across development, staging, and production

Data Architecture & Modeling:
- Design and implement data models optimized for analytical workloads
- Create and maintain data warehouses and data lakes on cloud platforms (Azure, AWS, or GCP)
- Implement data partitioning, indexing, and caching strategies for optimal query performance
- Collaborate with data architects to establish best practices for data storage and retrieval patterns

Performance Optimization & Monitoring:
- Monitor and troubleshoot data pipeline performance issues
- Optimize Spark jobs through proper partitioning, caching, and broadcast strategies
- Implement data quality checks and automated testing frameworks
- Manage cost optimization through efficient resource utilization and cluster management
- Establish monitoring and alerting systems for data pipeline health and performance

Collaboration & Best Practices:
- Work closely with data scientists, analysts, and business stakeholders to understand data requirements
- Implement version control using Git and follow CI/CD best practices for code deployment
- Document data pipelines, data flows, and technical specifications
- Mentor junior engineers on Databricks and data engineering best practices
- Participate in code reviews and contribute to establishing team standards

Required Qualifications

Experience & Skills:
- 5+ years of experience in data engineering with hands-on Databricks experience
- Strong proficiency in Python and/or Scala for Spark application development
- Expert-level knowledge of Apache Spark, including Spark SQL, DataFrames, and RDDs
- Deep understanding of Delta Lake and Lakehouse architecture concepts
- Experience with SQL and database optimization techniques
- Solid understanding of distributed computing concepts and data processing frameworks
- Proficiency with cloud platforms (Azure, AWS, or GCP) and their data services
- Experience with data orchestration tools (Databricks Workflows, Apache Airflow, Azure Data Factory)
- Knowledge of data modeling concepts for both OLTP and OLAP systems
- Familiarity with data governance principles and tools such as Unity Catalog
- Understanding of streaming data processing and real-time analytics
- Experience with version control systems (Git) and CI/CD pipelines

Preferred Qualifications
- Databricks Certified Data Engineer certification (Associate or Professional)
- Experience with machine learning pipelines and MLOps on Databricks
- Knowledge of data visualization tools (Power BI, Tableau, Looker)
- Experience with infrastructure as code (Terraform, CloudFormation)
- Familiarity with containerization technologies (Docker, Kubernetes)
Interested in this role? Apply on iHire.