Data Scientist V

VTG

Herndon · On-site · Full-time · 3d ago

About the role

Overview

Byte Systems, a subsidiary of VTG, is seeking an Ontologically Aware Data Scientist to design, build, and operationalize knowledge-centric AI/ML systems that integrate ontology engineering, semantic data modeling, and advanced analytics. This role blends deep experience in RDF/OWL/SHACL and graph data models with practical machine learning, natural language processing, and context engineering. You will develop semantic foundations and AI-driven tooling that enable multiple organizations to communicate using shared ontologies and that support natural-language-to-SPARQL interfaces with reduced hallucination.

What will you do?

Responsibilities

  • Develop and maintain logical, semantically rich, extensible ontologies (RDF/OWL/SKOS) aligned to foundational/top-level ontologies (BFO, DOLCE, CCO, OBO Foundry)
  • Build and evolve canonical vocabularies that unify heterogeneous datasets across organizational boundaries
  • Design, implement, and maintain knowledge graphs and semantic data models using AnzoGraph, including modeling tradeoffs, query optimization, and production operations
  • Create SHACL validation layers; run regular quality checks to ensure consistency, backward compatibility, and schema integrity
  • Write performant SPARQL queries; develop mapping views, CONSTRUCT transformations, and federated semantic integrations
  • Develop hybrid ontology-GenAI capabilities, including natural-language-to-SPARQL pipelines with context engineering frameworks (e.g., DSPy) to reduce hallucination and improve schema grounding
  • Apply machine learning techniques (classification, embedding-based retrieval, entity resolution, and model fine-tuning) to support ontology evolution, reification, and inference
  • Partner with cross-functional teams (Data Engineering, Developers, content owners) to understand information needs and deliver semantic/analytical solutions
  • Produce clear, accessible technical documentation; deliver onboarding and training for both technical and non-technical users
  • Apply strong software engineering discipline to semantic assets: version control, code review, automated SHACL/dbt tests, CI/CD pipeline alignment
  • Support both new development and ongoing operations/maintenance across unclassified and classified environments

Do you have what it takes?

Required Qualifications

  • A Bachelor’s degree with 14 years of relevant experience, OR a Master’s degree with 12 years of relevant experience, OR a PhD with 9 years of relevant experience
  • An advanced degree in Philosophy, Computer Science, Information Science, or a quantitative field (or equivalent experience) is preferred
  • Deep proficiency in ontology languages (RDF, OWL, SHACL) and vocabularies (SKOS)
  • Strong SPARQL skills; experience with semantic ETL, query optimization, and graph reasoning
  • Proficiency in Python (plus R/Java as needed) for data science, graph processing, and ML model development
  • Strong engineering practices: Git, PR workflow, testing/validation, CI/CD
  • Active TS adjudication with ability to obtain the SCI/CI Poly

Desired Qualifications

  • Hands-on experience with ontology/graph platforms such as Protégé, AnzoGraph, Stardog, PoolParty, or Semaphore
  • Experience integrating disparate datasets into a unified semantic model and writing effective production-grade queries
  • Familiarity with upper ontologies (BFO, DOLCE, Common Core Ontologies, OBO Foundry)
  • Knowledge of machine learning, including supervised/unsupervised methods, embeddings, and model fine-tuning

Benefits

  • Opportunity to work on cutting-edge AI/ML projects
  • Collaboration with cross-functional teams
  • Professional development and growth opportunities

Skills

Ontology engineering, Semantic data modeling, Advanced analytics, RDF/OWL/SHACL, Graph data models, Machine learning, Natural language processing, Context engineering, Python, R, Java, Git, CI/CD
