Senior / Staff Data Engineer

Index Exchange

Toronto · Hybrid · Full-time · Senior

About Index Exchange

At Index Exchange, we’re reinventing how digital advertising works—at scale. As a global advertising supply-side platform, we empower the world’s leading media owners and marketers to thrive in a programmatic, privacy-first ecosystem.

We’re a proud industry pioneer with over 20 years of experience accelerating the ad technology evolution. Our proprietary tech is trusted by some of the world’s largest brands and media owners and plays a crucial role in keeping the internet open, accessible, and largely free.

We process more than 550 billion real-time auctions every day (in comparison, Google processes 8.5 billion searches per day) with ultra-low latency. Our platform is vertically integrated from servers to networks and runs primarily on our own metal and cloud infrastructure. This end-to-end infrastructure is designed to provide both stability and agility, enabling us to adapt quickly as the market evolves.

At the core of it all is our engineering-first culture. Our engineers tackle internet-scale problems across tight-knit, global teams. From moving petabytes of data and optimizing with AI to making real-time infrastructure decisions, Indexers have the agency and influence to shape the future of advertising. We move fast, build thoughtfully, and stay grounded in our core values.

About the role:

We are hiring a Senior / Staff Data Engineer to build and evolve the data processing and pipeline layer that powers reporting, billing systems, and real-time data products at Index Exchange.

This role focuses on designing and operating large-scale batch and streaming data pipelines, enabling reliable, scalable, and efficient data transformation across the platform.

You will work on systems that transform raw, high-volume event data into clean, queryable, and production-grade datasets, supporting both API-driven data products and analytical workflows.

You will work on high-scale data systems that:

Process billions of events per day across distributed pipelines
Power core business datasets (reporting, billing, marketplace metrics)
Operate across batch (Spark) and streaming (Kafka / Flink) architectures
Require careful balancing of:
- data correctness
- processing efficiency
- latency vs cost trade-offs

You will solve problems such as:

Designing pipelines that scale without exploding compute costs
Managing data correctness at scale (deduplication, late data, joins)
Building systems that support both:
- historical backfills
- near real-time updates
Evolving pipelines from centralized processing (Hadoop) toward more distributed and efficient patterns
Building streaming pipelines and streaming data warehouses (DWs)

What we’re looking for:

Strong experience in data engineering at scale
Deep expertise in:
- Spark (required)
- SQL and data modeling
Experience with:
- Airflow or workflow orchestration
- Kafka or streaming systems
Strong understanding of:
- distributed data processing
- data modeling (large-scale datasets)
- performance optimization
Ability to:
- own pipelines end-to-end
- debug complex data issues
- work in high-scale, evolving environments

Staff-Level Expectations (if applicable)

Define data processing standards and patterns across teams
Lead large-scale pipeline and platform initiatives
Influence data architecture and modeling decisions
Drive improvements across:
- reliability
- cost efficiency
- scalability

Here’s what you’ll be doing:

Data Pipelines (Batch + Streaming)

Design and operate pipelines using:
- Spark (primary)
- Kafka / Flink (streaming)
Transform raw event data into:
- cleaned datasets (silver layer)
- business-ready datasets (gold / reporting tables)

Core Data Models & Datasets

Build and maintain canonical datasets (aggregated datasets, reporting tables)
Define data contracts and ensure consistency across pipelines
Support evolving use cases:
- reporting
- billing
- ML / experimentation

Workflow Orchestration

Build and maintain Airflow DAGs for:
- pipeline scheduling
- dependency management
- backfills
Improve reliability and observability of workflows

Data Processing Optimization

Optimize pipelines for:
- performance (runtime, throughput)
- cost (compute efficiency)
- scalability (data growth)
Improve:
- partitioning strategies
- data layout
- job execution patterns

Streaming & Near Real-Time Pipelines

Build pipelines that support:
- incremental updates
- streaming transformations
- aggregation at scale
Contribute to evolving patterns such as:
- edge aggregation
- streaming / batch convergence
- real-time data availability

Platform & System Design Responsibilities

Define and evolve data processing patterns:
- batch vs streaming
- aggregation strategies
- incremental vs full recompute
Work across:
- Spark (core processing)
- Kafka (transport)
- Flink (streaming compute)
- storage systems (Hadoop / Ceph)
Contribute to:
- data platform architecture decisions
- pipeline standardization
- reusable data processing frameworks
Influence trade-offs:
- latency vs cost
- correctness vs performance
- compute vs storage

You will work closely with:

APIs & data products
Data Systems / Platform teams
ML and experimentation teams
Application Engineering

Why you’ll love working here:

Comprehensive health, dental, and vision plans for you and your dependents
Paid time off, health days, and personal obligation days plus flexible work schedules
Competitive retirement matching plans
Equity packages
Generous parental leave available to birthing, non-birthing, and adoptive parents
Annual well-being allowance plus fitness discounts and group wellness activities
Commuter benefits and discounts, where available
Employee assistance program
Mental health first aid program that provides an in-the-moment point of contact and reassurance
One day of volunteer time off per year and a donation-matching program
Bi-weekly town halls and regular community-led team events
Multiple resources and programming to support continuous learning

A workplace that supports a diverse, equitable, and inclusive environment

Equal employment opportunity

At Index Exchange, we believe that successful products are built by teams just as diverse as the audience who uses them. As such, we are committed to equal employment opportunities. We celebrate diversity of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity or expression, and veteran status. Additionally, we realize that diversity is deeper than any status or classification; diversity is the human experience. For those who show grit, passion, and humility, Index will welcome you.

Accessibility for applicants with disabilities

Index Exchange welcomes and encourages individuals with disabilities to apply to work with us.

If you require an accommodation, please share the details of your request and any information on how we can assist you with the hiring recruiter when they contact you. Index Exchange will make reasonable efforts to ensure accommodation requests are met throughout the recruitment process.

Index everywhere, Index anywhere

Our corporate headquarters are in Toronto, with major offices in New York, Montreal, Kitchener, London, San Francisco, and many other global cities. As a major global advertising exchange, we are committed to operating as a tightly knit global team and embracing and empowering talent wherever our colleagues may be.

Skills

Airflow, Ceph, Flink, Hadoop, Kafka, ML, Spark, SQL
