AI Compiler Engineer
Blumind
About Blumind
Blumind is a deep-tech startup at the forefront of the AI revolution. We are building a new class of semiconductor: ultra-low-power analog AI processors designed for edge inferencing. Our technology enables complex AI workloads, from "always-on" keyword spotting to advanced vision and language models, to run at a fraction of the power of traditional digital chips. We are on a mission to make high-performance, on-device AI ubiquitous, from wearables and mobile to smart home devices, automotive, and industrial IoT.
Role Overview
We are looking for an AI Compiler Engineer to help bridge modern machine learning frameworks with our novel analog hardware platform.
As an AI Compiler Engineer at Blumind, you will design and build the compiler stack that maps neural network models onto our analog compute architecture. You will work at the intersection of machine learning, compilers, and hardware, translating high-level models into efficient, hardware-aware execution.
Key Responsibilities
- Design and develop compiler infrastructure for Blumind’s analog AI chips
- Build lowering pipelines from ML frameworks (e.g., PyTorch, TensorFlow) to hardware-specific representations
- Leverage and extend MLIR to represent, optimize, and schedule computations
- Develop custom dialects, passes, and transformations for analog compute primitives
- Implement quantization, approximation, and noise-aware optimizations tailored to analog hardware
- Collaborate closely with hardware, firmware, and ML teams to co-design efficient execution strategies
- Optimize performance, power efficiency, and accuracy trade-offs across the stack
- Contribute to runtime integration and execution tooling
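To give a flavor of the quantization and accuracy trade-offs mentioned above, here is a minimal, self-contained sketch of symmetric per-tensor int8 quantization in Python. This is an illustrative toy example only, not Blumind's toolchain; analog hardware introduces additional noise models beyond simple rounding error.

```python
import numpy as np

def quantize_int8(w):
    """Symmetric per-tensor int8 quantization: the scale is chosen so the
    largest-magnitude weight maps to +/-127."""
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Map int8 codes back to float32 for error analysis."""
    return q.astype(np.float32) * scale

weights = np.random.default_rng(0).normal(size=1000).astype(np.float32)
q, scale = quantize_int8(weights)
error = np.abs(weights - dequantize(q, scale)).max()
# With this scheme the round-trip error is bounded by half a quantization step.
assert error <= scale / 2 + 1e-6
```

A compiler for analog hardware has to reason about exactly this kind of bound, plus device noise, when deciding precision per layer.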
What We’re Looking For
Required
- 8+ years of industry software development experience with a strong background in compiler design and implementation
- Experience with LLVM and/or MLIR (MLIR strongly preferred)
- Proficiency in C++ (and/or Rust)
- Solid understanding of machine learning fundamentals and neural network architectures
- Experience building or working with deep learning frameworks (e.g., PyTorch, TensorFlow, ONNX)
- Familiarity with graph-level and tensor-level optimizations
Preferred
- Experience developing MLIR dialects and passes
- Background in hardware-aware compilation (e.g., GPUs, TPUs, NPUs, or custom accelerators)
- Understanding of quantization, mixed precision, and model compression techniques
- Exposure to analog or in-memory compute paradigms
- Experience with scheduling, tiling, and memory optimization strategies
- Familiarity with Python for ML tooling and prototyping
- Knowledge of compute-in-memory (CIM), analog/mixed-signal architectures, or neuromorphic compiler stacks
Education
- Degree in Computer Science, Electrical Engineering, or a related technical or scientific field with a focus on Machine Learning preferred.
What Makes This Role Unique
- Work on first-of-its-kind analog AI compute systems
- Define the compiler layer for a completely new hardware paradigm
- High-impact role shaping performance, efficiency, and developer experience
- Tight collaboration across silicon, systems, and AI teams
- Opportunity to contribute to open-source ecosystems (e.g., MLIR/LLVM)
Location
- Preferred locations are Toronto/Ottawa or Silicon Valley. Candidates willing to relocate to Canada will also be considered.
We thank all applicants for their interest. Only candidates being considered for the role will be contacted.