Project

Cracking the Multilingual Code: Exploring Multilinguality in LLMs through Mechanistic Interpretability

Code
BOF/PDO/2025/001
Duration
01 October 2025 → 30 September 2028
Funding
Regional and community funding: Special Research Fund
Promoter
Research disciplines
  • Natural sciences
    • Natural language processing
  • Humanities and the arts
    • Computational linguistics
  • Social sciences
    • Artificial intelligence
    • Knowledge representation and machine learning
Keywords
fair and transparent AI, mechanistic interpretability, multilingual language modelling
Project description
This research project aims to advance the understanding of multilinguality in multilingual large language models (MLLMs), with the overarching goal of making these models more transparent, interpretable, and equitable for speakers of low-resourced languages. By investigating how these models represent and transfer meaning across languages, the project seeks to uncover the neuron-level mechanisms underlying their multilingual behavior. This includes examining the role of language-specific and polyglot neurons, the formation of typological clusters, and the impact of pre-training data and model architecture on multilingual generalization. The research also examines how MLLMs internalize translation processes and whether they employ language-pivoting strategies to facilitate cross-lingual tasks, especially for under-represented languages. Finally, the study addresses the consistency of reasoning structures, such as cause-effect relationships and logical reasoning, across languages, identifying whether systemic biases or universal patterns shape multilingual reasoning capabilities. By deepening our understanding of how MLLMs process diverse languages, this work aims to make these models more equitable and effective, ensuring their benefits extend to all linguistic communities, including those currently underserved by AI technologies.
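As a rough illustration of the neuron-level analysis mentioned above, the sketch below flags "language-specific" versus "polyglot" neurons from per-language activation statistics. It uses synthetic activation data, an arbitrary language set, and a hypothetical activation threshold; none of these specifics come from the project itself, and a real analysis would measure activations over corpora in each language.

```python
import numpy as np

rng = np.random.default_rng(0)
languages = ["en", "sw", "fi"]  # hypothetical language set
n_neurons = 8

# Synthetic per-neuron activation scores for each language.
# In practice these would be mean activation frequencies measured
# on held-out text in each language.
act = rng.random((len(languages), n_neurons))

threshold = 0.5  # hypothetical cutoff for calling a neuron "active"
active = act > threshold  # boolean mask: (languages, neurons)

# Count in how many languages each neuron fires.
n_active_langs = active.sum(axis=0)

# Active in exactly one language -> candidate language-specific neuron;
# active in all languages -> candidate polyglot neuron.
language_specific = np.where(n_active_langs == 1)[0]
polyglot = np.where(n_active_langs == len(languages))[0]

print("language-specific neurons:", language_specific.tolist())
print("polyglot neurons:", polyglot.tolist())
```

The same counting idea extends to more fine-grained questions raised in the project, such as whether neurons cluster by typologically related language groups rather than firing for one language or all of them.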