Lead Data Scientist

Datavant

Bismarck · On-site · Full-time · Lead · $184k – $200k/yr · 2w ago

About the role

Join Datavant, a leading data collaboration platform in healthcare, where our mission is to make the world's health data secure, accessible, and actionable. We are at the forefront of transforming how data is connected and used to enhance health, catering to a diverse range of organizations including providers, health plans, researchers, and life sciences companies. By becoming a part of Datavant, you will join a passionate and collaborative team dedicated to creating transformative change in healthcare.

Role Overview

We seek an enthusiastic Lead Data Scientist with a focus on AI innovations in the healthcare sector. This pivotal position requires a candidate capable of tackling an array of challenges using vast healthcare data sets. You will work on understanding complex medical documents, aiming to extract meaningful insights, intent, and structure from unstructured medical and administrative records. Our goal is to develop intelligent systems that can efficiently interpret and manage high-stakes healthcare documentation at scale. This position combines applied machine learning, natural language processing (NLP), and strategic product development. You will closely collaborate with cross-functional teams to:

  • Design and implement models focused on entity extraction, intent detection, and document structure comprehension.
  • Address challenges such as long-context reasoning, layout-aware NLP, and managing ambiguous inputs.
  • Evaluate model performance, especially when ground truth is partial, uncertain, or evolving.
  • Develop the roadmap and success metrics for modernizing legacy document processing systems with smarter, scalable solutions.

We thrive in a high-trust, high-ownership culture, emphasizing rapid experimentation and delivering value quickly. If you are excited about creating systems that enhance the usability, accuracy, and safety of healthcare data, we encourage you to apply.

Your Responsibilities

  • Play a vital role in our product success by developing models for document understanding tasks.
  • Conduct error analysis and data cleaning tasks to improve model performance.
  • Collaborate with the team to shape the development roadmap for various capabilities.
  • Work alongside data scientists and engineers to optimize machine learning models and integrate them into end-to-end pipelines.
  • Define key performance metrics for models based on product use-cases and business requirements.
  • Establish systems for the ongoing enhancement of models and data quality, such as active and continuous learning frameworks.

Qualifications for Success

  • Over 6 years of experience in data science and machine learning, specifically in building NLP models.
  • Proficiency in Python.
  • Familiarity with advanced language models (transformers, LLMs, etc.).
  • Expertise in data analysis tools including SQL, NumPy, and pandas.
  • Experience with deep learning frameworks such as PyTorch (preferred) or TensorFlow.
  • Proven experience in leading ML/AI projects from concept to deployment.
  • Ability to influence key performance indicators with AI initiatives.
  • Skill in navigating ambiguous situations successfully.

Preferred Qualifications

  • Experience with document layout analysis using vision or multimodal techniques.
  • Familiarity with Spark/PySpark.
  • Knowledge of Databricks.
  • Experience in the healthcare sector.

Your Progression

After 3 Months, You Will:

  • Have an in-depth understanding of the technologies that underpin our platform.
  • Be fully integrated into current model development initiatives within your team.

After 1 Year, You Will:

  • Independently conduct literature reviews and research to create models for new and existing products.
  • Own the performance of models, collaborating effectively with product managers, customer success managers, and engineers.
  • Be recognized as a subject matter expert on Datavant's models, offering insights and guidance to other teams.

Datavant is an equal opportunity employer committed to fostering a diverse workforce. We encourage all qualified applicants to apply without regard to race, color, sex, sexual orientation, gender identity, religion, national origin, disability, veteran status, or any other legally protected status.

The estimated total cash compensation for this role is between $184,000 and $200,000 USD.

Please note that many clients may require post-offer health screenings and vaccinations. Exemptions will be reviewed on a case-by-case basis.

This job is not eligible for employment sponsorship.

To learn more about our commitment to diversity and inclusion, or for more information on your rights, please review our statements related to EEO and employee privacy.

Skills

LLMs, NumPy, Pandas, PyTorch, PySpark, Python, Spark, SQL, TensorFlow, Transformers
