Data Engineer (M/F)

Centres Unicancer

Le Kremlin-Bicêtre · On-site · Contract · €45k – €50k/yr · Posted 2d ago

About the role

About Unicancer

Unicancer is the only French hospital network 100% dedicated to fighting cancer and the only national hospital federation dedicated to oncology. It brings together 18 Centers for the Fight Against Cancer (CLCC), private non-profit healthcare facilities, spread across 20 hospital sites in France. 540,000 patients are treated annually within the Unicancer network, and more than 20,000 women and men are committed daily to a constant quest for excellence in care, research, and higher education.

Unicancer is also the leading academic sponsor of clinical trials in oncology in Europe. Recognized as a leader in cancer research in France, the Unicancer network enjoys a global reputation, producing one-third of French publications of international scope in oncology. The 18 CLCCs and Unicancer's R&D activities are ISO 9001 certified for their clinical research.

Role Purpose:

The Data Engineer plays a major role in building scalable data pipelines that process structured and unstructured real-world health data. The role develops, maintains, and improves the data solutions and infrastructure needed to collect, centralize, store, and provide access to health data gathered from contributing healthcare facilities and made available to scientific teams.

Responsibilities:

  • Design and maintain integration flows (collection, ingestion, storage) to centralize data from multiple healthcare facilities (and multiple data sources per facility) into a health data warehouse while ensuring data quality.
  • Implement secure data pipelines whose output is processed and cleaned by data managers to deliver frozen ("freeze") databases made available to scientific experts, biostatisticians, and data scientists.
  • Design and implement a process and data pipeline that automatically validates the quality of data integrated into databases and data warehouses by comparing it with manually collected data.
  • Improve and automate existing integration flows.
  • Participate in the design of platforms for efficient processing of large volumes of data while ensuring their security.
  • Support the development of tools for extracting data in a structured format.
  • Assist external service providers specializing in structuring unstructured data from medical reports, multidisciplinary consultation meetings (RCP), or EMR documents, using Natural Language Processing (NLP) and Named Entity Recognition (NER) solutions.
  • Proactively evolve the data stack of the Data Management Unit (DDP) to provide innovative solutions to the challenges of new DDP projects: federated health data warehouses (EHDs) and linking current EHDs with a child system of the SNDS (National Health Data System).
  • Drive the upskilling of the Data Engineer team.
  • Write and provide documentation (procedural guides, user documents, repositories, etc.) in compliance with the existing Quality Management System (QMS - ISO 9001 Certification).
  • Propose relevant indicators for monitoring the activity of Data Engineers and build a dashboard to visualize these indicators and their evolution.
  • Communicate and collaborate with project leaders, and report to the line manager and functional managers.
  • Attend and take part in meetings with project teams and the Data Management Unit.

Candidate Profile:

  • Computing: SQL, JavaScript, Python, Pandas, NumPy, Spark, PySpark, Elasticsearch, spaCy, Kibana, Java, Camel, Nginx, Liferay, Angular, XML, HTML, JSON, PDF/A (Text), CSS, Windows, Unix/Linux (Debian), Solaris, NLP and NER, Power BI, KNIME, Talend, SAS (a plus).
  • Databases: SQL and NoSQL (PostgreSQL, MariaDB).
  • Cloud: knowledge is a plus.
  • Continuous Integration: Git, CI/CD.
  • Cross-functional: Agile methodology.
  • Interoperability: knowledge of the OMOP, HL7 FHIR, and OSIRIS formats would be a plus.
  • Functional: health sector, health data warehouses, strong curiosity for oncology.

  • More than 2 years of experience as a Data Engineer, including a first successful experience in the health sector.
  • Good understanding of the data lifecycle, data lineage, data governance, and data privacy.
  • Ability to work in an agile, collaborative environment.

Skills

Angular, Camel, CI/CD, CSS, Databases, Elasticsearch, FHIR HL7, Git, HTML, ISO 9001, Java, Javascript, JSON, Kibana, KNIME, Liferay, Linux, Natural Language Processing, NER, Nginx, NoSQL, Numpy, OMOP, OSIRIS, Pandas, PDF/A, PostgreSQL, PowerBI, Python, SAS, Spark, SQL, Spacy, Talend, Unix, XML
