Jan Kocoń

Assistant professor

Department of Computational Intelligence, Wroclaw University of Science and Technology

Jan Kocoń has been involved in the development of language technology in projects carried out at the Wrocław University of Technology since 2011. He specializes in natural language processing, in particular information extraction, sentiment analysis, document classification and building language models with machine learning methods. He has unique knowledge of natural language engineering, especially of the solutions developed over the years, such as [Liner2](https://github.com/CLARIN-PL/Liner2), which he has co-created and further developed since 2014. It is one of the key tools at the [CLARIN-PL Language Technology Centre](http://clarin-pl.eu/), where Jan Kocoń co-created methods for recognizing proper names ([PolDeepNer](https://github.com/CLARIN-PL/PolDeepNer)) and implemented his own solutions for recognizing temporal expressions and events in Polish texts. This work resulted in a useful set of methods, currently used by hundreds of scientists in the humanities and social sciences in Poland and worldwide. He is also the author of a mechanism for recognizing event attributes and modality determinants, and of a method for linking them to event triggers through relations.

Since 2012 Jan Kocoń has also co-created [Inforex](https://github.com/CLARIN-PL/Inforex), a system for annotating text corpora. It enables specialists in computational linguistics to construct training and test corpora, which are used to create different language models. He also coordinated a team of programmers developing the [Mobile Wordnet](https://play.google.com/store/apps/details?id=com.pwr.plwordnet) application, used to browse the Polish Wordnet and English Princeton Wordnet resources on mobile devices. He likewise coordinated the team's work in the [Sentimenti](https://sentimenti.com/) project, aimed at analyzing emotions and sentiment in text. In this project more than 20 thousand people were examined and over 18 million annotations about emotions were collected. He was responsible for creating a machine learning mechanism, based on deep neural methods such as BiLSTM, BERT and LASER, that automatically recognizes emotions in text using the collected data.

He has documented his experience in natural language processing in more than 30 scientific publications. He has lectured many times at workshops organized by the CLARIN-PL project, where he trained researchers in the humanities and social sciences in the use of tools and resources created by CLARIN-PL. These included the Inforex system, the Liner2 tool, the MeWeX tool for recognizing multi-word lexical units, the use of the DSpace and NextCloud repositories to store language resources, and methods for their further processing at the textual, syntactic and semantic levels. During his work at the Wrocław University of Technology he actively participated in the following natural language engineering projects: SYNAT, NEKST, CLARIN-PL, CLARIN-PL 2.0, Parthenos, AZON, Sentimenti. Currently he works on sentiment analysis using deep neural networks, cross-language transfer of knowledge and deep language models. He is also the main coordinator of the task related to the recognition of emotions within the project “CLARIN - Common Language Resources and Technology Infrastructure”, worth over 130 million PLN.

Interests

  • natural language processing
  • deep language models
  • machine learning
  • artificial intelligence
  • cryptocurrencies

Education

  • M.Sc. in Computer Science, 2012

    Wroclaw University of Science and Technology

  • Ph.D. in Computer Science, Artificial Intelligence, 2018

    Wroclaw University of Science and Technology