Personality Psychology meets Natural Language Processing

TakeLab, FER, University of Zagreb

About us

We are a team of computer scientists who want to be psychologists and psychologists who want to be computer scientists.

With this project, we aim to lay the groundwork for a truly interdisciplinary perspective on computational personality research by developing datasets and models for personality prediction and analysis based on online textual interactions. The overarching goal of our project is to bring the two communities closer together and ultimately increase their capacity to carry out relevant and valid research using computational text analysis methods, contributing to both research fields. To this end, the project will focus on three research objectives:

  1. Development of datasets adequate for text-based personality research.
  2. Development of comprehensive NLP models for personality prediction and analysis.
  3. Investigation of research questions that relate personality psychology to language use, by means of confirmatory and exploratory studies that leverage the developed datasets and computational models.

We are part of the Text Analysis and Knowledge Engineering Lab at the Faculty of Electrical Engineering and Computing, University of Zagreb.

psy.txt team




Irina Masnikosa

Linguist on duty


Iva Vukojević

Psychologist on duty


Ivan Crnomarković

TO-DO list master


Jan Šnajder

Martial arts guru


Josip Jukić

Cleaning enthusiast


Matej Gjurković

Mess generator


Mihaela Bošnjak

Photoshop guru


Mladen Karan

Morning bird


PSY.TXT project (IP-2020-02-8671)

The topic of this project is computational models for personality analysis and prediction from text. Although NLP and personality psychology have a high potential for synergy, the two fields have different goals and values and thus far remain largely disconnected from each other.

All publications

(2021). PANDORA Talks: Personality and Demographics on Reddit. Proceedings of the Ninth International Workshop on Natural Language Processing for Social Media, NAACL 2021.


(2018). Not Just Depressed: Bipolar Disorder Prediction on Reddit. Proceedings of the 9th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis, EMNLP 2018.


(2018). Reddit: A Gold Mine for Personality Prediction. Proceedings of the Second Workshop on Computational Modeling of People's Opinions, Personality, and Emotions in Social Media, NAACL 2018.



Find out more

Large-scale datasets of Reddit comments labeled with personality and demographics


If you are interested in our datasets, please submit a request with more information here.