Principal Scientist, Data Science - R&D DSDH - Therapeutics Development & Supply (TDS)
Johnson and Johnson
About the role
About Johnson & Johnson Innovative Medicine
Johnson & Johnson Innovative Medicine develops treatments that improve the health of people worldwide. Research and development areas encompass oncology, immunology, neuroscience, cardiopulmonary and specialty ophthalmology. Our goal is to help people live longer, healthier lives. We have produced and marketed many first-in-class prescription medications and are poised to serve the broad needs of the healthcare market, from patients to practitioners and from clinics to hospitals. To learn more about Johnson & Johnson Innovative Medicine, visit https://innovativemedicine.jnj.com/
Position Summary
The R&D Data Science organization is seeking a Data Scientist - Data Engineer to design, build, and optimize data capture, processing, and storage solutions that enable advanced analytics, digital process transformation, and AI/ML applications across the development-to-supply continuum for Therapeutics Development & Supply (TDS).
You will be a hands-on technical contributor working across Process Development, Manufacturing, Supply Chain, Quality, and Digital/Data Science teams to deliver high-quality, AI-ready data pipelines and data products. This role involves creating robust, future-proof data systems, engineering workflows, and high-value data repositories that support scientific, technical, and operational decision-making.
Key Responsibilities
Data Engineering & Pipeline Development
- Design, build, and maintain scalable data pipelines for acquiring, integrating, and managing TDS data from diverse data generation sources and systems (e.g., lab systems, MES, clinical supply, quality systems, external partners).
- Create and optimize data flows for structured and unstructured data using Python, R, SQL, cloud services, and other modern engineering tools.
- Develop and maintain TDS-specific data repositories, implementing enterprise-level data models and creating new models as needed.
- Enable AI/ML readiness by ensuring data is well-structured, versioned, traceable, and semantically aligned with enterprise data standards.
Data Product & Architecture Partnership
- Partner with data scientists, TDS domain experts, and digital technology teams to translate business needs into high-quality data products and engineering requirements.
- Work closely with ontology/knowledge graph teams to implement semantic models and future-proof data architectures.
Quality, Compliance & Performance
- Implement data quality and performance standards; define KPIs to measure accuracy, completeness, and consistency across TDS data assets.
- Apply data versioning and lineage tracking for compliance, traceability, and audit readiness.
- Follow software development best practices including code versioning, DevOps integration, and documentation.
Cross-Functional Collaboration
- Engage with scientific, technical, and operations stakeholders to understand requirements, design data solutions, and drive adoption.
- Support multiple concurrent projects, managing priorities and delivering maximum business value across the TDS network.
Qualifications
Required
- Advanced degree in Engineering, Data Science, Life Sciences, Computer Science, or related field.
- 3+ years of experience in data engineering, including data modeling and database design, preferably in a scientific, manufacturing, or healthcare environment.
- Proficiency with Python, R, SQL, and cloud-based architectures (e.g., AWS services, Snowflake, Redshift).
- Experience with NoSQL and graph databases.
- Strong analytical, problem-solving, and stakeholder-management skills, with the ability to translate discussions into actionable requirements.
- Ability to drive multiple projects simultaneously, with strong organizational skills and adaptability.
Preferred
- Experience with regulated or standards-driven data environments, such as CDISC, HL7, FHIR, OMOP, DICOM, or manufacturing/quality data standards.
- Familiarity with high-dimensional data (e.g., imaging, sensor data).
- Experience with data workflows that connect to or feed MLOps and model deployment.
- Knowledge of manufacturing systems (MES), laboratory information systems, or industrial data systems.
- Exposure to knowledge graph or ontology-driven architectures.
Why This Role Is Unique
This is a rare opportunity to grow in one of the world's most ambitious and fast-growing R&D Data Science organizations, shaping how Therapeutics Development & Supply data powers next-generation therapies at the largest biomedical company on the planet. Your work will directly accelerate Johnson & Johnson's scientific discovery, fuel AI innovation, and impact patients globally.
Benefits
- Vacation - 120 hours per calendar year
- Sick time - 40 hours per calendar year; for employees who reside in the State of Colorado - 48 hours per calendar year; for employees who reside in the State of Washington - 56 hours per calendar year
- Holiday pay, including Floating Holidays - 13 days per calendar year
- Work, Personal and Family Time - up to 40 hours per calendar year
- Parental Leave - 480 hours within one year of the birth/adoption/foster care of a child
- Bereavement Leave - 240 hours for an immediate family member; 40 hours for an extended family member per calendar year
- Caregiver Leave - 80 hours in a 52-week rolling period
- Volunteer Leave - 32 hours per calendar year
- Military Spouse Time-Off - 80 hours per calendar year
For additional general information on Company benefits, please go to: https://www.careers.jnj.com/employee-benefits