Deep Learning Engineer for Language Technologies (RE2)
Barcelona Supercomputing Center
About the role
About BSC
The Barcelona Supercomputing Center - Centro Nacional de Supercomputación (BSC-CNS) is the leading supercomputing center in Spain. It houses MareNostrum, one of the most powerful supercomputers in Europe, and hosts important European HPC initiatives such as EuroHPC JU and PRACE. BSC manages research, development and HPC services in computer and computational science across life, earth and engineering domains, employing over 1,000 staff from 60 countries.
Context And Mission
The Language Modeling Team at the newly created AI Institute at BSC has experience in building massive language models, biomedical text mining, and unsupervised learning for low‐resource languages. The team is tasked with developing core open‐source resources and technologies for Spanish and Catalan under national and EU projects. We seek talent to design and improve the team's operations and infrastructure.
Key Duties
- Collaborate with group members to design and develop solutions that achieve the group's research goals.
- Co‐create and evaluate language models using deep learning techniques (Transformers, RNNs, and other neural architectures).
- Focus on research and deployment of post‐training strategies, including instruction processing, algorithm design, and evaluation of instructed LLMs.
Requirements
Education
- Degree and MSc in Computer Science, Mathematics, or related fields.
Essential Knowledge and Professional Experience
- Python programming
- Linux environment
- Machine‐learning techniques
- Deep learning concepts
- Computer vision fundamentals
Additional Knowledge and Professional Experience
- Published research in AI with a focus on large language models.
- Mathematics and statistics applied to machine learning.
- Experience training LLMs with distributed frameworks such as NeMo and Megatron‐LM.
- Knowledge of multimodal AI (language, vision, speech), with preference for multilingual capabilities.
- Broad theoretical knowledge of AI techniques.
- Experience with data pipelines: scraping, processing, filtering.
- Leading research projects.
- Experience with synthetic data generation (text‐only and multimodal).
- Building evaluation frameworks for LLM and multimodal models.
- Planning large‐scale encoder/decoder training runs (exceeding 5 trillion tokens).
- HPC workload managers such as Slurm.
- CI/CD skills (GitHub, Singularity, or similar).
- Knowledge of C++, Matlab and/or Java.
- Machine‐learning libraries: PyTorch, TensorFlow, pandas, scikit‐learn, NumPy.
- GPU‐based computing, including multi‐GPU/multi‐node parallelisation.
- Fluency in Spanish and English (additional Spanish co‐official languages valued).
Competences
- Exploration of new research lines.
- Strong communication and presentation skills.
- Teamwork and pair programming.
Conditions
- Location: BSC, Directors Department.
- Full‐time contract (37.5 h/week) with flexible hours, extensive training plan, restaurant vouchers, private health insurance, and relocation support.
- Open‐ended contract aligned with project and budget duration.
- Holidays: 22 days + 6 personal days + 24th and 31st December.
- Competitive salary commensurate with qualifications and experience.
- Starting date: 01/05/2026.
Equity, Diversity and Inclusion
BSC-CNS is an equal opportunity employer committed to diversity and inclusion. We consider all qualified applicants without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, age, disability or any other protected characteristic.