Project

Automatic detection of (potential) factors in the source text leading to gender bias in machine translation

Code

DOCT/011451

Duration

21 November 2023 → 20 September 2026 (Ongoing)

Doctoral researcher

Janica Hackenbuchner

Research disciplines

Natural sciences
- Natural language processing
- Communication networks
Humanities and the arts
- Comparative language studies
- Morphology
- Semantics

Keywords

Translation Studies, gender bias, machine translation, Language technology

Project description

With the growing use of and interest in machine translation (MT) and a growing demand for gender-inclusiveness, research on social biases (e.g., gender bias) in MT is increasing. Existing research predominantly relies on top-down methodologies that target predefined part-of-speech categories. This research proposes a novel bottom-up methodology that broadens the scope of gender bias research by focussing on source text analysis. The goal is to create a detection system that can automatically analyse source data and detect features that influence gender inflection in the target translation and thereby lead to gender bias in MT. This detection system will be a machine learning model trained on a taxonomy created as part of this research proposal, based on manually annotated data extended with morpho-syntactic information from dependency trees. The aim is to develop a comprehensive methodology that helps make AI-powered technologies (i.e., MT) more gender-inclusive for society.
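As a rough illustration of what "morpho-syntactic information from dependency trees" could look like in practice, the sketch below collects a few context features for one source word from a hand-written parse. It is not the project's method: the `Token` class, the `context_features` function, the choice of features and the example sentence are all hypothetical, and the relation labels only assume a Universal Dependencies-style annotation.

```python
from dataclasses import dataclass, field


@dataclass
class Token:
    """One word of a dependency-parsed sentence (hypothetical structure)."""
    id: int          # 1-based position in the sentence
    form: str        # surface form
    upos: str        # coarse part-of-speech tag
    head: int        # id of the syntactic head, 0 for the root
    deprel: str      # dependency relation to the head
    feats: dict = field(default_factory=dict)  # morphological features


def context_features(tokens, target_id):
    """Collect simple morpho-syntactic context features for one token."""
    by_id = {t.id: t for t in tokens}
    target = by_id[target_id]
    head = by_id.get(target.head)  # None when the target is the root
    return {
        "form": target.form,
        "deprel": target.deprel,
        "head_form": head.form if head else None,
        "head_upos": head.upos if head else None,
        # Words directly attached to the target word.
        "dependents": [t.form for t in tokens if t.head == target_id],
        # Other words in the sentence that carry an explicit Gender feature.
        "gender_cues": [
            t.form for t in tokens
            if "Gender" in t.feats and t.id != target_id
        ],
    }


# Hand-annotated parse of "The doctor finished her shift." (assumed labels).
SENTENCE = [
    Token(1, "The", "DET", 2, "det"),
    Token(2, "doctor", "NOUN", 3, "nsubj"),
    Token(3, "finished", "VERB", 0, "root"),
    Token(4, "her", "PRON", 5, "nmod:poss", {"Gender": "Fem"}),
    Token(5, "shift", "NOUN", 3, "obj"),
    Token(6, ".", "PUNCT", 3, "punct"),
]

features = context_features(SENTENCE, target_id=2)
print(features)
```

In a setup like this, the feature dictionary for a word such as "doctor" could serve as one input row for a classifier, with the manually annotated taxonomy label as the target.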