Senior/Staff Build and CI Engineer
San Francisco Compute Company
About SFCompute
The San Francisco Compute Company runs large-scale GPU clusters (H100s, H200s, B300s) on contracts you can exit. Need 256 H100s for three days? Buy them at market price, cancel what you don't use. We operate the stack from UEFI up, so you're never paying a reseller markup or waiting on a support ticket. Customers include NVIDIA, MIT, Liquid AI, and Roboflow. We're a small team that has managed over $1B of hardware and is building what we think will be the defining infrastructure marketplace for the AI era.
The Role
We need someone who has run a serious build system at a previous job, ideally a large Bazel monorepo, and wants to do it again here. Our codebase is a TypeScript monorepo, a Rust workspace, a protobuf layer that wires them together, and a growing pile of services and container images. CI works. It isn't hermetic, it isn't deterministic, and the cache hit rates are nowhere near where they should be. That's the work.
You'll own the build and CI experience top to bottom. We're not religious about Bazel. If Buck2 fits better, or a simpler setup gets us 80% of the value, that's fine. The goal is local and CI builds that produce the same artifact, fast incremental feedback for every engineer, and a credible roadmap for what this looks like at 10x our current size.
What You'll Do
- Audit the current build and test pipeline (Bun for TypeScript, Cargo for Rust, buf for protobuf, plus Docker and Helm) and write down where it fails on reproducibility, hermeticity, and speed
- Pick a build system and migrate us onto it without breaking shipping
- Stand up remote execution and remote caching that measurably cut CI and local build times
- Pin toolchains, seal dependencies, and stop the host environment from leaking into builds
- Run the long-term roadmap for build, test, and CI as the team and codebase grow
- Work alongside application and infrastructure engineers throughout, since the migration touches all of them
What We're Looking For
- Senior or staff-level experience running Bazel, Buck2, Pants, or a comparable system somewhere the build system genuinely mattered
- Experience operating remote execution and remote caching in production
- Comfortable across language ecosystems. We run TypeScript and Rust today, with Python showing up.
- Strong opinions on determinism and reproducibility, with the judgment to know when full hermeticity is worth the cost and when it isn't
- CI ops chops: queue health, flake budgets, real test signal, build time budgets you can defend
- Able to scope your own work. There's no spec for what our build system should look like.
- Nice to have: experience moving a codebase onto Bazel (or off of it), polyglot or protobuf-heavy monorepos, prior work on developer infrastructure at an autonomy, robotics, or systems company
Why This Role
Build systems are one of the few pieces of infrastructure where every hour you save shows up for every engineer in the company. Doing this well before we're 10x the size is one of the most leveraged things we can do right now. You pick the tools, you set the standards, and you own the outcome.
Benefits
- Generous equity grant
- Competitive salary
- Visa sponsorship
- 401(k) matching up to 4%
- Medical, dental, and vision insurance (100% premium coverage for employees and dependents)
- Unlimited paid time off
- 10+ observed holidays
- Paid parental leave (biological, adoptive, and foster parents)
- Daily lunch coverage
- Unlimited office book budget