Clarin, Clariah and Dariah: Towards a Full Infrastructure for Digital Humanities in Europe

Public Evening Lecture

4 December 2017

Academy building at Gendarmenmarkt, Einstein-Saal, Jägerstraße 22/23, 10117 Berlin

DARIAH-EU, the Berlin-Brandenburg Academy of Sciences and Humanities (BBAW), Inria (Paris, France) and the Belgrade Center for Digital Humanities (BCDH, Serbia) are co-organizing a masterclass on the management of lexical data, with the support of the German Ministry of Education and Research (BMBF), CLARIN, DARIAH-DE and the EU H2020 project Humanities at Scale (HaS). The masterclass will take place in Berlin at the BBAW from 4 to 8 December 2017. The Lexical Data Masterclass brings 20 trainees together with experts to share experiences, methods and techniques for the creation, management and use of digital lexical data. It is part of a joint French-German program supported by the BMBF and MESRI (French Ministry for Higher Education, Research and Innovation).


In this presentation we will discuss why a language community needs extensive language resources and a digital language infrastructure for the development of all types of smart computer applications. We expand on how an open-resource digital language infrastructure is made available for the Low Countries through the Dutch Language Institute. Such an infrastructure facilitates the development of a large number of technological applications.

In 2011, META, the Multilingual Europe Technology Alliance, published a number of white papers discussing the benefits offered by language technology and the actions that need to be taken to develop basic tools and data for each language, depending on factors such as the complexity of the respective language, the size of its community, and the existence of active research centers in this area. Language technology is used to develop smart software systems designed to handle human language and is therefore often referred to as “human language technology”. Human language technology (HLT) links language to various forms of knowledge. The main application areas of language technology include language checking, web search, speech interaction, and machine translation. A large number of smart computer applications rely heavily on speech and language data, for example spelling correction, authoring support, computer-assisted language learning, information retrieval, information extraction, text summarization, question answering, speech recognition, and speech synthesis.

The state of language technology for every language needs to be monitored. The Metanet consortium stresses the need to develop language technology resources continuously and to use them to drive forward research, innovation and development. The need for large amounts of data and the extreme complexity of language technology systems make it vital to develop a new infrastructure and a more coherent research organisation to support greater sharing and cooperation.


We are now moving one step ahead: building on the CLARIN centers for language infrastructure and the Clariah projects, Dariah is now expanding strongly as an infrastructure for the wider community of arts and humanities researchers working with computational methods. As a result, young researchers and PhD students will be able to train in digital humanities research, and a wealth of new applications will become possible.


Biography: Frieda Steurs (Belgium) is full professor at the KU Leuven, Faculty of Arts. She works in the fields of terminology, language technology, specialized translation and multilingual document management. She is a member of the research group Quantitative Lexicology and Variation Linguistics (QLVL). Her research includes projects with industrial partners and public institutions. She is the founder and former president of NL-TERM, the Dutch terminology association for both the Netherlands and Flanders. She is also the head of the ISO/TC 37 standardization committee for Flanders and the Netherlands. She is the president of TermNet, the International Network for Terminology (Vienna).


Frieda Steurs has coordinated a large number of research projects in the fields of translation technology, terminology management and digital language resources, e.g. TermWise (creating resources for specialized language use, IOF funding KU Leuven 2009–2013), SCATE – Smart Computer-Aided Translation Environment (IWT 2014–2018), COST e-Lex: e-lexicography (ISCH COST Action IS1305 2013–2017) and COST enetCOLLECT (Computer Assisted Language Learning and Crowdsourcing Techniques, ISCH COST Action 2017–2022).


 Further Information 

Admission is free. Registration is required.

© 2020 Berlin-Brandenburgische Akademie der Wissenschaften