Senior Software Engineer / Research Engineer – Large‑Scale Statistical Systems (USCB Program)

CPMC LLC

Vienna · On-site · Full-time · Senior · 1mo ago

About the role

Overview

CPMC is seeking a Senior Software Engineer who thrives on technically demanding problems involving large-scale computation, complex algorithms, and high‑performance data processing. You will engineer the U.S. Census Bureau’s Disclosure Avoidance System (DAS), a system that executes advanced statistical and differential privacy algorithms across massive datasets.

This role is ideal for an engineer who enjoys:

  • Understanding how algorithms behave under real-world scale
  • Turning research prototypes into robust, high‑performance systems
  • Diagnosing subtle numerical, performance, or correctness issues
  • Building distributed systems that must be reproducible, efficient, and scientifically trustworthy

You’ll be building the computational machinery that makes cutting‑edge statistical methods run reliably at national scale.

Key Responsibilities

  • Engineer production-grade implementations of complex statistical and differential privacy algorithms, ensuring correctness, stability, and performance
  • Translate research code (Python/R) into optimized, maintainable systems, often requiring algorithmic insight and careful handling of numerical edge cases
  • Design and optimize large-scale data processing pipelines for ingestion, transformation, validation, and output generation
  • Profile, benchmark, and optimize distributed workloads (Spark, EMR, containerized compute) to reduce runtime and cost
  • Diagnose algorithmic performance issues, from data skew to solver behavior to memory pressure
  • Collaborate deeply with statisticians to understand algorithmic assumptions, constraints, and expected behavior under scale
  • Develop reproducible experiment frameworks, including parameter tracking, environment isolation, and deterministic execution
  • Build automation and tooling that enable researchers to run large experiments safely and efficiently
  • Tune compute and solver configurations (Spark, Gurobi, storage layouts, partitioning strategies) for large-scale statistical workloads
  • Support distributed execution environments and contribute to DevOps/automation where needed to keep the system reliable

Required Qualifications

  • 10+ years of professional software engineering experience
  • Strong programming skills in Python (primary) and familiarity with R
  • Experience with distributed computing (Spark, EMR, or equivalent)
  • Strong background in performance engineering, profiling, and debugging complex systems
  • Experience building and maintaining large-scale data pipelines
  • Hands-on experience with AWS (EMR, S3)
  • Experience with CI/CD, automated testing, and environment management
  • Familiarity with basic probability and statistics concepts (e.g., hypothesis testing, probability distributions, least squares)
  • Ability to read, reason about, and improve scientific or research-oriented code

Preferred Qualifications

  • Experience collaborating with statisticians or working in scientific computing environments
  • Familiarity with numerical methods, statistical computing, or algorithmic evaluation
  • Experience with optimization solvers (e.g., Gurobi) or large-scale simulations
  • Knowledge of differential privacy or privacy-preserving computation
  • Experience with containerization (Docker, Kubernetes)
  • Experience with HPC or large distributed systems

Why This Role Is Technically Unique

  • You work on algorithmically complex systems where correctness and performance both matter
  • You operate at national-scale data volumes with strict reproducibility requirements
  • You collaborate with researchers pushing the boundaries of statistical privacy
  • You solve problems where the bottleneck might be a numerical instability, a distributed shuffle, a solver configuration, a data partitioning strategy, or an algorithmic assumption that breaks at scale
  • You directly influence the performance and reliability of a system that protects the confidentiality of Census data

Soft Skills

  • Organizational Skills: Can plan and prioritize work. Follows tasks to their logical conclusion and makes sure that everything has been done to the right standard. Good attention to detail.
  • Teamwork: Able to generate and sustain enthusiasm for a project. Comfortable working both individually and as part of a team. Prepared to challenge ideas within a group in a constructive way.
  • Communications: Ability to communicate clearly and efficiently to team members and clients, verbally and in writing. Able to present ideas in a variety of ways depending upon audience and context. Excellent active listening skills.
  • Problem Solving: Natural inclination for planning strategy and tactics. Ability to analyze problems, determine root causes, generate and evaluate alternatives, and implement solutions.
  • Results Oriented: Able to drive things forward regardless of personal interest in the task.

Skills

AWS · AWS EMR · AWS S3 · CI/CD · Docker · EMR · Gurobi · Kubernetes · Python · R · Spark · SQL
