Humanities and the arts
- Corpus linguistics
- Diachronic linguistics
- Sociolinguistics
The DIRT corpus (Dutch In Reality TV) is a corpus of informally spoken Belgian Dutch and Netherlandic Dutch from reality TV. DIRT is a growing corpus that is regularly supplemented with new material. The first version was created by Ulrike Vogl and Gauthier Delaby in 2021, and the corpus currently contains approximately 200,000 words. So far, the transcripts have been made by student workers and by students in the BA3 research line “Language use in reality TV” in the 2021-22 academic year.

Over the next few years we want to expand the corpus and ensure its continuity by preparing an application for a follow-up project. The planned expansion covers three aspects: we will (1) add more metadata (e.g. language knowledge and residence abroad, in addition to the speaker's nationality, age, province or place of residence, profession and gender), (2) add older seasons of reality TV to enable diachronic linguistic research based on DIRT, and (3) transcribe more Netherlandic Dutch reality programmes to achieve a better balance across the Dutch language area.

For this purpose we are appointing a staff member, Lien Hellebaut, at 40% of full time, initially for two years. Her tasks will include transcribing, supplementing transcripts, refining the transcription protocol, annotating metadata, dissemination (website) and contributing to the project application.