Senior Software Engineer / Research Engineer – Large‑Scale Statistical Systems (USCB Program)

CPMC LLC

Vienna · On-site · Full-time · Senior · 3mo ago

About the role

Overview

CPMC is seeking a Senior Software Engineer who thrives on technically demanding problems involving large-scale computation, complex algorithms, and high‑performance data processing. You will engineer the U.S. Census Bureau’s Disclosure Avoidance System (DAS), a system that executes advanced statistical and differential privacy algorithms across massive datasets.

This role is ideal for an engineer who enjoys:

Understanding how algorithms behave at real-world scale
Turning research prototypes into robust, high‑performance systems
Diagnosing subtle numerical, performance, or correctness issues
Building distributed systems that must be reproducible, efficient, and scientifically trustworthy

You’ll be building the computational machinery that makes cutting‑edge statistical methods run reliably at national scale.

Key Responsibilities

Engineer production-grade implementations of complex statistical and differential privacy algorithms, ensuring correctness, stability, and performance
Translate research code (Python/R) into optimized, maintainable systems, often requiring algorithmic insight and careful handling of numerical edge cases
Design and optimize large-scale data processing pipelines for ingestion, transformation, validation, and output generation
Profile, benchmark, and optimize distributed workloads (Spark, EMR, containerized compute) to reduce runtime and cost
Diagnose algorithmic performance issues—from data skew to solver behavior to memory pressure
Collaborate deeply with statisticians to understand algorithmic assumptions, constraints, and expected behavior under scale
Develop reproducible experiment frameworks, including parameter tracking, environment isolation, and deterministic execution
Build automation and tooling that enable researchers to run large experiments safely and efficiently
Tune compute and solver configurations (Spark, Gurobi, storage layouts, partitioning strategies) for large-scale statistical workloads
Support distributed execution environments and contribute to DevOps/automation where needed to keep the system reliable

Required Qualifications

10+ years of professional software engineering experience
Strong programming skills in Python (primary) and familiarity with R
Experience with distributed computing (Spark, EMR, or equivalent)
Strong background in performance engineering, profiling, and debugging complex systems
Experience building and maintaining large-scale data pipelines
Hands-on experience with AWS (EMR, S3)
Experience with CI/CD, automated testing, and environment management
Familiarity with basic probability and statistics concepts (e.g., hypothesis testing, probability distributions, least squares)
Ability to read, reason about, and improve scientific or research-oriented code

Preferred Qualifications

Experience collaborating with statisticians or working in scientific computing environments
Familiarity with numerical methods, statistical computing, or algorithmic evaluation
Experience with optimization solvers (e.g., Gurobi) or large-scale simulations
Knowledge of differential privacy or privacy-preserving computation
Experience with containerization (Docker, Kubernetes)
Experience with HPC or large distributed systems

Why This Role Is Technically Unique

You work on algorithmically complex systems where correctness and performance both matter
You operate at national-scale data volumes with strict reproducibility requirements
You collaborate with researchers pushing the boundaries of statistical privacy
You solve problems where the bottleneck might be a numerical instability, a distributed shuffle, a solver configuration, a data partitioning strategy, or an algorithmic assumption that breaks at scale
You directly influence the performance and reliability of a system that protects the confidentiality of Census data

Soft Skills

Organizational Skills: Can plan and prioritize work. Follows tasks to their logical conclusion and makes sure that everything has been done to the right standard. Good attention to detail.
Teamwork: Able to generate and sustain enthusiasm for a project. Comfortable working both individually and as part of a team. Prepared to challenge ideas within a group in a constructive way.
Communications: Ability to communicate clearly and efficiently to team members and clients, verbally and in writing. Able to present ideas in a variety of ways depending upon audience and context. Excellent active listening skills.
Problem Solving: Natural inclination for planning strategy and tactics. Ability to analyze problems, determine root causes, generate and evaluate alternatives, and implement solutions.
Results-oriented: Able to drive things forward regardless of personal interest in the task.

Skills

AWS, AWS EMR, AWS S3, CI/CD, Docker, EMR, Gurobi, Kubernetes, Python, R, Spark, SQL
