CA21167 - Universality, diversity and idiosyncrasy in language technology (UniDive)

Efficient access to the constantly growing quantities of data, especially of language data, largely relies on advances in data science. This domain includes natural language processing (NLP), which is currently booming, to the benefit of many end users. However, this optimization-based technological progress poses an important challenge: accounting for and fostering language diversity.

The UniDive Action takes two original stands on this challenge. Firstly, it aims at embracing both inter- and intra-language diversity, i.e. a diversity understood both in terms of the differences among the existing languages and of the variety of linguistic phenomena exhibited within a language.

Secondly, UniDive does not assume that linguistic diversity is to be protected against technological progress but strives for both of these aims jointly, to their mutual benefit. Its approach is to: (i) pursue NLP-applicable universality of terminologies and methodologies, (ii) quantify inter- and intra-linguistic diversity, (iii) boost and coordinate universality- and diversity-driven development of language resources and tools. UniDive builds upon previous experience of European networks and projects which provided a proof of concept for language modelling and processing, unified across many languages but preserving their diversity. The main benefits of the Action will include, on the theoretical side, a better understanding of language universals, and on the practical side, language resources and tools covering, in a unified framework, a bigger variety of language phenomena in a large number of languages, including low-resourced and endangered ones.

Natural language processing - language universals - diversity - idiosyncrasy - language resources and tools

Ilia State University Professor Dr. Irina Lobzhanidze has been nominated and appointed as a member of the COST UniDive Management Committee.

The COST Consortium UniDive programme is coordinated by the University of Paris-Saclay (France), with the following members of the management committee: Epoka University (Albania), KU Leuven (Belgium), UCLouvain (Belgium), University of Tuzla (Bosnia and Herzegovina), University of East Sarajevo (Bosnia and Herzegovina), Institute of Information and Communication Technologies (Bulgaria), Bulgarian Academy of Sciences (Bulgaria), University of Zagreb (Croatia), Institute of Croatian Language and Linguistics (Croatia), Univerzita Karlova (Czech Republic), Institute of the Estonian Language (Estonia), Institute for Computer Science (Estonia), University of Helsinki (Finland), LIS - Laboratoire d'Informatique et Systèmes (France), Ludwig-Maximilians-Universität (Germany), Ilia State University (Georgia), Institute for Language and Speech Processing (Greece), Hungarian Research Centre for Linguistics (Hungary), University of Szeged (Hungary), Data Science Institute (Ireland), Dublin City University (Ireland), Jerusalem College of Technology (Israel), The Open University (Israel), "L'Orientale" University of Naples (Italy), Università degli Studi di Cagliari (Italy), Tilde (Latvia), Vytautas Magnus University (Lithuania), University of Malta (Malta), Technical University of Moldova (Moldova), Vladimir Andrunachievici Institute of Mathematics and Computer Science (Moldova), Instituut voor de Nederlandse Taal (Netherlands), Ss. Cyril and Methodius University (North Macedonia), University of Oslo (Norway), University of Warmia and Mazury in Olsztyn (Poland), Institute of Computer Science (Poland), INESC-ID (Portugal), University of Coimbra (Portugal), University of Bucharest (Romania), Research Institute for Artificial Intelligence (Romania), University of Belgrade (Serbia), Institute for Serbian Language SASA (Serbia), Ľ. Štúr Institute of Linguistics (Slovakia), University of Ljubljana (Slovenia), Jožef Stefan Institute (Slovenia), University of Murcia (Spain), Universidad de Málaga (Spain), Uppsala University (Sweden), University of Zurich (Switzerland), Istanbul Technical University (Turkey), Bogazici University (Turkey), University of Sheffield (United Kingdom), University of Cambridge (United Kingdom).

The Action starts on September 23, 2022, and provides four years of funding and collaboration, ending in 2026.

