Senior Software Engineer / Research Engineer – Large‑Scale Statistical Systems (USCB Program)
CPMC LLC
Vienna · On-site · Full-time · Senior · 1mo ago
About the role
Overview
CPMC is seeking a Senior Software Engineer who thrives on technically demanding problems involving large-scale computation, complex algorithms, and high‑performance data processing. You will engineer the U.S. Census Bureau’s Disclosure Avoidance System (DAS), a system that executes advanced statistical and differential privacy algorithms across massive datasets.
This role is ideal for an engineer who enjoys:
- Understanding how algorithms behave at real-world scale
- Turning research prototypes into robust, high‑performance systems
- Diagnosing subtle numerical, performance, or correctness issues
- Building distributed systems that must be reproducible, efficient, and scientifically trustworthy
You’ll be building the computational machinery that makes cutting‑edge statistical methods run reliably at national scale.
Key Responsibilities
- Engineer production-grade implementations of complex statistical and differential privacy algorithms, ensuring correctness, stability, and performance
- Translate research code (Python/R) into optimized, maintainable systems, often requiring algorithmic insight and careful handling of numerical edge cases
- Design and optimize large-scale data processing pipelines for ingestion, transformation, validation, and output generation
- Profile, benchmark, and optimize distributed workloads (Spark, EMR, containerized compute) to reduce runtime and cost
- Diagnose algorithmic performance issues, ranging from data skew to solver behavior to memory pressure
- Collaborate deeply with statisticians to understand algorithmic assumptions, constraints, and expected behavior under scale
- Develop reproducible experiment frameworks, including parameter tracking, environment isolation, and deterministic execution
- Build automation and tooling that enable researchers to run large experiments safely and efficiently
- Tune compute and solver configurations (Spark, Gurobi, storage layouts, partitioning strategies) for large-scale statistical workloads
- Support distributed execution environments and contribute to DevOps/automation where needed to keep the system reliable
Required Qualifications
- 10+ years of professional software engineering experience
- Strong programming skills in Python (primary) and familiarity with R
- Experience with distributed computing (Spark, EMR, or equivalent)
- Strong background in performance engineering, profiling, and debugging complex systems
- Experience building and maintaining large-scale data pipelines
- Hands-on experience with AWS (EMR, S3)
- Experience with CI/CD, automated testing, and environment management
- Familiarity with basic probability and statistics concepts (e.g., hypothesis testing, probability distributions, least squares)
- Ability to read, reason about, and improve scientific or research-oriented code
Preferred Qualifications
- Experience collaborating with statisticians or working in scientific computing environments
- Familiarity with numerical methods, statistical computing, or algorithmic evaluation
- Experience with optimization solvers (e.g., Gurobi) or large-scale simulations
- Knowledge of differential privacy or privacy-preserving computation
- Experience with containerization (Docker, Kubernetes)
- Experience with HPC or large distributed systems
Why This Role Is Technically Unique
- You work on algorithmically complex systems where correctness and performance both matter
- You operate at national-scale data volumes with strict reproducibility requirements
- You collaborate with researchers pushing the boundaries of statistical privacy
- You solve problems where the bottleneck might be a numerical instability, a distributed shuffle, a solver configuration, a data partitioning strategy, or an algorithmic assumption that breaks at scale
- You directly influence the performance and reliability of a system that protects the confidentiality of Census data
Soft Skills
- Organizational Skills: Can plan and prioritize work. Follows tasks to their logical conclusion and makes sure that everything has been done to the right standard. Good attention to detail.
- Teamwork: Able to generate and sustain interest in a project. Comfortable working both individually and as part of a team. Prepared to challenge ideas within a group in a constructive way.
- Communications: Ability to communicate clearly and efficiently with team members and clients, verbally and in writing. Able to present ideas in a variety of ways depending on audience and context. Excellent active listening skills.
- Problem Solving: Natural inclination for planning strategy and tactics. Ability to analyze problems, determine root causes, generate and evaluate alternatives, and implement solutions.
- Results-oriented: Able to drive things forward regardless of personal interest in the task.
Skills
AWS, AWS EMR, AWS S3, CI/CD, Docker, EMR, Gurobi, Kubernetes, Python, R, Spark, SQL