Project

Design and analysis of multilingual-learning based models for advanced natural language understanding applications

Code

1162223N

Duration

01 November 2022 → 28 February 2025

Funding

Research Foundation - Flanders (FWO)

Promotor

Chris Develder

Fellow

Karel D'Oosterlinck

Research disciplines

Natural sciences
- Machine learning and decision making
- Natural language processing

Keywords

Multilingual-learning, Meta-learning, Natural Language Processing

Project description

Thanks to recent deep learning breakthroughs, Natural Language Processing (NLP) has seen significant progress. Yet this progress mainly concerns high-resource languages (e.g., English), and many seemingly basic tasks have not been satisfactorily solved, especially for low-resource languages (e.g., Dutch). We thus observe a performance gap among languages, caused by a discrepancy in both (i) the amount of available training data and (ii) the amount of research performed on these languages. Due to this gap, a major part of the global population misses out on state-of-the-art NLP solutions. Recent multilingual language models show promise for reducing this discrepancy. These models are trained on a wide range of high- and low-resource languages simultaneously, enabling generalization of basic tasks across languages. However, it is not yet clear how to optimally exploit such multilingual models in advanced language understanding tasks (e.g., coreference resolution). Another promising direction is meta-learning, which is also gaining traction in NLP: inspired by human learning, it allows a model to adapt to a wide range of specific tasks given only a few examples. Still, meta-learning in NLP is mainly limited to monolingual applications. My PhD project will leverage the adaptive power of meta-learning algorithms to realize more efficient multilingual learning for advanced NLP tasks. To keep the scope manageable, I will focus in particular on the task of coreference resolution.