Decision Intelligence Engineer

Humana

Providence · Flexible · Full-time · Senior · $129k – $178k/yr · 1mo ago

About the role

About Us

Join our compassionate community and prioritize health

We are seeking a talented Decision Intelligence Engineer to design, implement, and enhance the reinforcement learning policy that powers Humana's Next Best Action platform. You will take charge of the entire RL development lifecycle, including feature engineering, reward design, training, evaluation, and deploying models that impact over 8 million members. Your expertise will translate member journey data into clear, explainable, and auditable decision-making intelligence in the healthcare space.

This role is both hands-on and research-focused: you will implement and assess RL algorithms, develop training pipelines, work closely with data and platform engineers, and ensure robust model operation within the bounds of clinical eligibility and program-specific rewards.

Key Responsibilities

Reinforcement Learning Model Development

Design and assess RL algorithms for healthcare decision-making, including policy gradient methods (such as PPO, A3C), value-based methods (like DQN, Q-learning), and offline RL techniques (CQL, Decision Transformer).
Establish and refine the member state representation and action space as new programs and data sources are integrated.
Utilize reward shaping and the Bellman equation to incorporate clinical eligibility and specific program objectives into learning objectives.
Balance exploration and exploitation in a production healthcare setting where the impact on members is significant.

Model Evaluation and Production Safety

Develop simulation and backtesting environments using historical member data to ensure policy quality before deployment.
Identify and resolve common RL issues such as policy collapse and credit assignment errors throughout member journeys.
Set reward threshold criteria and automate nightly evaluations in the Databricks training workflow, preventing the promotion of subpar policies to production.
Document training runs, tracking hyperparameters, reward trends, and feature importance for comprehensive model oversight.

Training Pipeline Engineering

Oversee the nightly Databricks training workflow, ensuring accurate feature engineering and distributed RL training across all eligible members.
Work with the Data Engineering team to ensure that training data and reward signals are precise and reliably calculated.
Create high-quality PySpark feature engineering jobs while maintaining data lineage within Databricks Unity Catalog.
Manage model artifacts and lifecycle with MLflow Model Registry, ensuring a rollback capability is always available.

Multi-Agent and Constraint-Aware Decisioning

Implement multi-agent RL strategies where cooperation among member households is required.
Embed hard business rules and eligibility constraints directly into the RL learning objectives instead of relying on post-decision filters.
Coordinate with the Rules Engine team to ensure alignment between eligibility criteria and RL policy priorities.

Collaboration and Governance

Engage with Decisioning Team 1 to ensure seamless integration of model outputs into real-time decision-making processes.
Partner with platform architects to establish contracts for feedback loops, ensuring disposition outcomes are effectively utilized in training cycles.
Document model behavior and limitations for clinical stakeholders, ensuring compliance and explainability requirements are met.
Use AI-assisted engineering tools for efficient development while ensuring all core model logic undergoes rigorous peer review.

Make an impact with your skills

Required Qualifications

At least 8 years of software engineering experience with large-scale production systems, preferably focusing on data-intensive platforms that serve millions of users.
A minimum of 3 years implementing reinforcement learning or deep learning systems, with experience in various algorithms such as policy gradients and value-based methods.
Deep understanding of the Bellman equation, reward shaping, and constraint mapping in real-world RL applications.
Skilled in diagnosing RL-specific issues, including credit assignment problems and distributional shifts in populations.
Proficient in Python 3.x and experienced in using PyTorch or TensorFlow for policy network development.
Experienced with Ray RLlib for scalable RL training solutions.
Familiar with Databricks, PySpark, and Delta Lake for handling ML pipelines with large data sets.
A history of delivering reliable ML systems in production environments.

Preferred Qualifications

Experience with multi-agent RL frameworks.
Knowledge of probabilistic modeling, Markov Decision Processes, and linear programming for constrained decision-making.
Experience in regulated sectors, such as healthcare or finance, emphasizing safety and compliance.
Familiarity with simulation environment development tools and feedback loop architectures using Kafka.
Exposure to OpenTelemetry for observability in ML operations.

Visa Sponsorship

This role does not offer work visa sponsorship.

Additional Information

Humana offers competitive benefits that support holistic well-being, including medical, dental, and vision insurance, 401(k) retirement savings, paid time off, short-term and long-term disability, and more.

Travel: This is primarily a remote position, but occasional travel to Humana's offices may be required for training or meetings.

Scheduled Weekly Hours: 40

Pay Range: $129,300 - $177,800 per year, plus eligibility for a bonus incentive plan based on performance.

Skills

A3C, AWS Lambda, Bellman equation, CQL, Databricks, Decision Transformer, Deep learning, DQN, MLflow, Markov Decision Processes, OpenTelemetry, PPO, PyTorch, PySpark, Q-learning, Ray RLlib, Reinforcement Learning, TensorFlow, Unity Catalog
