Senior Databricks Data Engineer

BTI

Quantico · On-site · Full-time · Senior · 1w ago

About the role

Overview

We are seeking a Senior Databricks Data Engineer to design, build, and operate a Data & AI platform with a strong foundation in the Medallion Architecture (raw/bronze, curated/silver, and mart/gold layers). This platform will orchestrate complex data workflows and scalable ELT pipelines to integrate data from enterprise systems such as PeopleSoft, D2L, and Salesforce, delivering high-quality, governed data for machine learning, AI/BI, and analytics at scale.

You will play a senior technical role in guiding engineering standards, reusable patterns, platform reliability, and production readiness across the Data & AI platform.

You will also engineer the infrastructure and workflows that move data seamlessly across the enterprise, sustain operational excellence, and provide the backbone for strategic decision-making, predictive modeling, and innovation.

Responsibilities

Data & AI Platform Engineering (Databricks-Centric)

  • Provide senior-level technical leadership in the design, optimization, and standardization of Databricks engineering patterns across the Data & AI platform.
  • Design, implement, and optimize end-to-end data pipelines on Databricks, following the Medallion Architecture principles.
  • Build robust and scalable ETL/ELT pipelines using Apache Spark and Delta Lake to transform raw (bronze) data into trusted curated (silver) and analytics-ready (gold) data layers.
  • Operationalize Databricks Workflows for orchestration, dependency management, and pipeline automation.
  • Apply schema evolution and data versioning to support agile data development.

Platform Integration & Data Ingestion

  • Lead the design of reusable ingestion frameworks and integration patterns that support scalable, reliable, and governed data onboarding across enterprise systems.
  • Connect and ingest data from enterprise systems such as PeopleSoft, D2L, and Salesforce using APIs, JDBC, or other integration frameworks.
  • Implement connectors and ingestion frameworks that accommodate structured, semi-structured, and unstructured data.
  • Design standardized data ingestion processes with automated error handling, retries, and alerting.

Data Quality, Monitoring, and Governance

  • Establish data quality, observability, and governance practices that improve trust, reliability, lineage, and operational transparency across the platform.
  • Develop data quality checks, validation rules, and anomaly detection mechanisms to ensure data integrity across all layers.
  • Integrate monitoring and observability tools (e.g., Databricks metrics, Grafana) to track ETL performance, latency, and failures.
  • Implement Unity Catalog or equivalent tools for centralized metadata management, data lineage, and governance policy enforcement.

Security, Privacy, and Compliance

  • Provide senior technical guidance on secure data engineering practices, access-control patterns, and compliance implementation across Databricks and related cloud environments.
  • Enforce data security best practices including row-level security, encryption at rest/in transit, and fine-grained access control via Unity Catalog.
  • Design and implement data masking, tokenization, and anonymization for compliance with privacy regulations (e.g., GDPR, FERPA).
  • Work with security teams to audit and certify compliance controls.

AI/ML-Ready Data Foundation

  • Partner with data science and AI/ML teams to shape reusable, production-ready data engineering patterns that support scalable model development, deployment, and monitoring.
  • Enable data scientists by delivering high-quality, feature-rich data sets for model training and inference.
  • Support AIOps/MLOps lifecycle workflows using MLflow for experiment tracking, model registry, and deployment within Databricks.
  • Collaborate with AI/ML teams to create reusable feature stores and training pipelines.

Cloud Data Architecture and Storage

  • Contribute to senior-level cloud data architecture decisions related to data lake design, storage optimization, compute efficiency, security, and cost management.
  • Architect and manage data lakes on Azure Data Lake Storage (ADLS) or Amazon S3, and design ingestion pipelines to feed the bronze layer.
  • Build data marts and warehousing solutions using platforms like Databricks.
  • Optimize data storage and access patterns for performance and cost-efficiency.

Documentation & Enablement

  • Mentor engineers and promote senior-level engineering standards through documentation, code reviews, reusable frameworks, and knowledge-sharing practices.
  • Maintain technical documentation, architecture diagrams, data dictionaries, and runbooks for all pipelines and components.
  • Provide training and enablement sessions to internal stakeholders on the Databricks platform, Medallion Architecture, and data governance practices.
  • Conduct code reviews and promote reusable patterns and frameworks across teams.

Reporting and Accountability

  • Take ownership of complex data engineering deliverables, production issues, technical risks, and cross-team dependencies requiring senior-level judgment and coordination.
  • Submit a weekly schedule of hours worked and progress reports outlining completed tasks, upcoming plans, and blockers.
  • Track deliverables against roadmap milestones and communicate risks or dependencies.

Required Qualifications

  • Senior-level hands-on experience designing, building, optimizing, and supporting production-grade Databricks data engineering solutions in enterprise environments.
  • 5+ years of hands-on experience with Databricks, Delta Lake, and Apache Spark for large-scale data engineering.
  • Deep understanding of ELT pipeline development, orchestration, and monitoring in cloud-native environments.
  • Experience implementing Medallion Architecture (Bronze/Silver/Gold) and working with data versioning and schema enforcement in enterprise-grade environments.
  • Strong proficiency in SQL, Python, or Scala for data transformations and workflow logic.
  • Proven experience integrating enterprise platforms (e.g., PeopleSoft, Salesforce, D2L) into centralized data platforms.
  • Familiarity with data governance, lineage tracking, and metadata management tools.

Preferred Qualifications

  • Experience serving as a senior engineer, technical lead, or mentor on enterprise data platform or Databricks implementation initiatives.
  • Prior experience with UMGC or USM.
  • Experience with Databricks Unity Catalog for metadata management and access control.
  • Experience deploying ML models at scale using MLflow or similar MLOps tools.
  • Familiarity with cloud platforms like Azure or AWS, including storage, security, and networking aspects.
  • Knowledge of data warehouse design and star/snowflake schema modeling.

Skills

Apache Spark, Azure Data Lake Storage, Databricks, Delta Lake, Grafana, JDBC, MLflow, Python, Scala, SQL