Project

Improving Neural Machine Translation Systems Through Retrieval-Based Approaches

Code
BOF/STA/202302/013
Duration
01 August 2023 → 31 July 2027
Funding
Regional and community funding: Special Research Fund
Promotor
Research disciplines
  • Natural sciences
    • Machine learning and decision making
    • Natural language processing
  • Humanities and the arts
    • Translation studies
    • Interpreting studies
Keywords
Natural Language Processing, (neural) machine translation, translation memory, data processing and machine learning, data augmentation
Project description

Retrieval-based approaches for natural language processing (NLP) aim to augment neural models by retrieving texts similar to a given input at training and/or inference time. In the context of neural machine translation (NMT), retrieval-based methods have led to substantial gains in translation quality by integrating similar translations into the NMT architecture (i) by modifying the attention mechanism, (ii) through additional network components, or (iii) via data augmentation.
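To make strategy (iii) concrete, the following is a minimal sketch of retrieval-based data augmentation, in the spirit of fuzzy-match augmentation: for each input sentence, the most similar source sentence is retrieved from a translation memory and its target-side translation is appended to the input, so a standard encoder-decoder model can attend to it. All names here (fuzzy_match, augment_source, the SEP token) and the use of a simple character-based similarity are illustrative assumptions, not the project's actual implementation; real systems typically rely on stronger retrieval such as edit-distance or dense-embedding search over large translation memories.

# Illustrative sketch of retrieval-based data augmentation for NMT (assumed names).
from difflib import SequenceMatcher
from typing import List, Optional, Tuple

SEP = " <SEP> "  # special token separating the source sentence from the retrieved translation

def fuzzy_match(query: str, tm: List[Tuple[str, str]], threshold: float = 0.6) -> Optional[Tuple[str, str]]:
    """Return the TM entry whose source side is most similar to the query,
    or None if the best similarity falls below the threshold."""
    best_entry, best_score = None, 0.0
    for src, tgt in tm:
        score = SequenceMatcher(None, query.lower(), src.lower()).ratio()
        if score > best_score:
            best_entry, best_score = (src, tgt), score
    return best_entry if best_score >= threshold else None

def augment_source(query: str, tm: List[Tuple[str, str]]) -> str:
    """Concatenate the query with the target side of its best fuzzy match,
    producing the augmented input that would be fed to the NMT encoder."""
    match = fuzzy_match(query, tm)
    return query + SEP + match[1] if match else query

if __name__ == "__main__":
    # Toy translation memory of (English, Dutch) segment pairs.
    tm = [
        ("The contract must be signed by both parties.",
         "Het contract moet door beide partijen worden ondertekend."),
        ("Please read the instructions carefully.",
         "Gelieve de instructies aandachtig te lezen."),
    ]
    query = "The agreement must be signed by both parties."
    print(augment_source(query, tm))
    # -> "The agreement must be signed by both parties. <SEP> Het contract moet ..."

The appeal of this augmentation strategy is that it leaves the NMT architecture itself untouched: only the training and inference inputs change, which is what allows retrieved translation-memory matches to be exploited by an off-the-shelf model.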

Because of their ability to exploit similar translations effectively, retrieval-based NMT (RBNMT) approaches can be seen as a leap forward in the integration of NMT with translation memory (TM) systems, which are commonly used in computer-assisted translation workflows, and they blur the distinction between the two technologies. This project aims to use state-of-the-art NLP methodologies to further improve NMT systems and TM-NMT integration strategies.