CLARIN – Shared Language Resources and Technological Infrastructure

Last updated on Jan 7, 2025

The primary goal of the project is to enhance the CLARIN-PL research infrastructure by expanding its capabilities to support scientific research and innovative activities related to linguistic data analytics and natural language processing (NLP). Building on advancements achieved by the CLARIN-PL-Biz project, the initiative focuses on integrating Large Language Models (LLMs) to improve tools for information extraction, personalized solutions, and effective communication in natural language. The project aims to develop five virtual laboratories that will offer advanced analytical, personalized, and trustworthy dialogue systems while ensuring high-quality linguistic resources and computational services.

Key objectives include:

Expanding tools and systems for linguistic data analysis, focusing on temporal data and integration with cutting-edge LLMs to enhance the usability and efficiency of language tools.
Developing context-aware and personalized NLP tools, including systems for personalized content generation, hate speech detection, and emotion analysis tailored to user-specific needs.
Creating trusted dialogue systems that ensure reliability, transparency, and security in user interactions.
Enhancing linguistic resources for AI in line with the FAIR principles (Findable, Accessible, Interoperable, Reusable), to address challenges specific to the Polish language and provide a counterbalance to English-centric LLMs.
Improving computational capacity to meet the demands of large-scale LLM training and usage, offering flexible and efficient solutions for data preparation, model training, and inference.

The project will leverage the latest advancements in LLM technology to integrate tools into the CLARIN-PL platform, ensuring seamless access and usability for research, business, and public service users. The ultimate aim is to create a robust, scalable, and user-focused infrastructure for linguistic and NLP advancements in Poland and beyond.

Partners:

Wroclaw University of Science and Technology
Institute of Computer Science, Polish Academy of Sciences
Institute of Slavic Studies, Polish Academy of Sciences
University of Łódź
University of Wrocław

Program: European Funds for a Modern Economy 2021–2027

Duration: 01.01.2025 - 31.12.2027

Funding: 61,141,241.03 PLN

Ministry of Science and Higher Education


Maciej Piasecki

Associate Professor