- Humanities and the arts
- Development of methods and techniques
- Humanities and the arts not elsewhere classified
The aim of the KBR Digital Research Lab is to establish a long-term cooperation between the Ghent Centre for Digital Humanities (GhentCDH, also acting as national coordinating node of DARIAH-BE and the CLARIAH-VL Open Humanities Service Infrastructure) and the Royal Library of Belgium (KBR) in order to: a) facilitate data-level access to KBR's digitised and born-digital collections, including integration into KBR's library management system, digitisation and the (semi-)automatic creation and enrichment of metadata workflows; b) ensure that the digitised and born-digital collections are embedded into the researcher's workflow in a user-friendly manner; and c) optimise the digitised collections for use with digital humanities research methods, such as text and data mining.
The Belgian government has invested significantly in the digitisation of the collections of the Federal Scientific Institutions, with KBR's digitisation of the Belgian Press 1830-1950 being a flagship example. Although the digitisation of these valuable Belgian literary, cultural and historical heritage collections represents a major achievement, the true potential of these resources for digital humanities research is as yet underexploited. Without the assistance of advanced digital methods and distributed computing infrastructures, it is simply impossible to investigate and give meaning to this large amount of mostly unstructured data on human culture that is, to a certain extent, locked in textual documents.
What researchers in a wide range of disciplines increasingly have in common is that they want to examine the various (textual) resource types together, with a prime role for Natural Language Processing (NLP) and machine learning methods, such as deep learning. NLP as a way of analysing textual and media data has gained momentum, and many projects have been set up to meet the expectations of humanities scholars wanting to interpret the vast amount of digitised data that has become available. While current Optical Character Recognition (OCR) and Handwritten Text Recognition (HTR) technologies already facilitate the production of digital texts for literary and historical research, they need to be developed further.
The envisaged KBR Digital Research Lab (DRL) will be developed over a period of 10 years using an agile and iterative approach, in continuous dialogue with all the relevant stakeholders, e.g. KBR staff (in particular the KBR Senior Management Team, Digitisation and ICT Experts, and Collections and Cataloguing Staff, including the National Bibliography) and (digital) humanities researchers.
In view of the Belgian multilingual context and KBR's concomitant mission, multilinguality will be a central focus of this programme. This focus ties in with the ambition to provide digital access to collections in a state-of-the-art way and according to the international standards currently evolving within the field of Digital Humanities. Search functionalities that offer multilingual access to collections (and their metadata), which are by definition multilingual themselves, will add exploratory value.
Research into amending the Belgian legal framework to enable text and data mining, an exception for which has already been successfully implemented in French law, would underpin the work of the KBR Digital Research Lab.