Project

A Parsed Corpus of Southern Dutch Dialects

Code
31531018
Duration
01 January 2018 → 31 December 2018
Funding
Research Foundation - Flanders (FWO)
Research disciplines
  • Humanities and the arts
    • Corpus linguistics
    • Dialectology
    • Syntax
Keywords
Dutch, Dialects, Language technology, Linguistics
 
Project description

Many of the unique syntactic features of Dutch dialects spoken in Flanders only occur in very
specific discourse contexts, and therefore cannot be researched using existing databases and
linguistic atlases, as those are based on elicited data, not spontaneous speech. At Ghent University,
783 tape recordings (c. 700 hours) of spontaneous dialect speech from all Dutch-speaking provinces in
Belgium and from French Flanders (France) are available. They were recorded in the 1960s and
1970s, and the speakers were all born around the turn of the 20th century. The tapes have been
digitised (www.dialectloket.be), but not yet digitally transcribed or linguistically annotated. In
view of rapidly advancing dialect loss across Flanders, it is urgent that this wealth of data be
transcribed, annotated, and made available for linguistic research, as younger speakers are
increasingly unable to understand and transcribe these recordings. Indeed, the recordings already
represent a historical stage of the language, given that the speakers acquired their language about
100-120 years ago. Access to the data is therefore invaluable for diachronic, typological and
comparative research. Transcribing and linguistically annotating the recordings is thus a high
priority if this material at Ghent University is to become available for fundamental research.