PSY.TXT project (IP-2020-02-8671)

This project focuses on computational models for personality analysis and prediction from text. Although NLP and personality psychology have high potential for synergy, the two fields have different goals and values and have so far remained largely disconnected. The gap can be attributed to a lack of adequate datasets, a lack of interpretable NLP models that target discourse-level linguistic phenomena, and a lack of interdisciplinary skill sets among researchers. With this project, we aim to narrow this gap and lay the groundwork for truly interdisciplinary research on text-based personality analysis.

People differ in patterns of thinking, feeling, and behaving, which influences how they interact with and adapt to their intrapsychic, physical, and social environments. These stable individual differences are studied by the field of personality psychology. Many personality differences are manifested in language and in the way language is used in social interactions. This makes language a valuable source of data for personality research, especially in today’s era of big data, with vast amounts of text generated on social media platforms. Such large quantities of text, however, call for a computational approach to data analysis, using methods from natural language processing (NLP) and machine learning.

The project brings together an interdisciplinary team of computer scientists and personality psychologists, with the following objectives: compile novel datasets of online text and interaction suitable for the development of NLP models and text-based personality research; develop NLP models for text-based personality prediction and analysis that are innovative, creative, and effective, but at the same time ensure validity and interpretability; and run a number of confirmatory and exploratory data analysis studies to investigate the links between personality traits and linguistic variables of online talk and interaction.

The PSY.TXT project (IP-2020-02-8671) is fully supported by the Croatian Science Foundation.


Project name: IP-2020-02-8671 “Računalni modeli za predviđanje i analizu ličnosti na temelju teksta” (Computational Models for Text-Based Personality Prediction and Analysis)

Duration: 01.01.2021. — 31.12.2024.

Funding: 1.412.000,00 kn


We are part of the Text Analysis and Knowledge Engineering Lab (TakeLab) at the Faculty of Electrical Engineering and Computing, University of Zagreb, and the Department of Psychology at the Faculty of Humanities and Social Sciences, University of Zagreb.

TakeLab, Faculty of Electrical Engineering and Computing

  • Jan Šnajder, Principal Investigator
  • Irina Masnikosa
  • Iva Vukojević
  • Sara Bakić
  • Mihaela Bošnjak
  • Matej Gjurković
  • Mladen Karan
  • Josip Jukić
  • Ivan Crnomarković

Department of Psychology, Faculty of Humanities and Social Sciences

  • Denis Bratko
  • Ana Butković
  • Tena Vukasović Hlupić
  • Martina Pocrnić
Personality Psychology meets Natural Language Processing