SIMPA: Statement-to-Item Matching Personality Assessment from text

Abstract

Automated text-based personality assessment (ATBPA) methods can analyze large amounts of text data and identify nuanced linguistic personality cues. However, current approaches lack the interpretability, explainability, and validity offered by standard questionnaire instruments. To address these weaknesses, we propose an approach that combines questionnaire-based and text-based approaches to personality assessment. Our Statement-to-Item Matching Personality Assessment (SIMPA) framework uses natural language processing methods to detect self-referencing descriptions of personality in a target’s text and utilizes these descriptions for personality assessment. The core of the framework is the notion of a trait-constrained semantic similarity between the target’s freely expressed statements and questionnaire items. The conceptual basis is provided by the realistic accuracy model (RAM), which describes the process of accurate personality judgments and which we extend with a feedback loop mechanism to improve the accuracy of judgments. We present a simple proof-of-concept implementation of SIMPA for ATBPA on the social media site Reddit. We show how the framework can be used directly for unsupervised estimation of a target’s Big Five scores and indirectly to produce features for a supervised ATBPA model, demonstrating state-of-the-art results for the personality prediction task on Reddit.
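
The statement-to-item matching idea can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the questionnaire items, the first-person word list used to flag self-referencing statements, the 0.5 threshold, and the function names are all hypothetical. A plain bag-of-words cosine similarity stands in for the semantic similarity model, and the trait constraint is not reproduced; each item simply carries a trait label and a keying direction.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical questionnaire items: (trait, keying direction, item text).
# These are illustrative and are not the items used in the paper.
ITEMS = [
    ("extraversion", +1, "I am the life of the party"),
    ("extraversion", -1, "I am quiet around strangers"),
    ("conscientiousness", +1, "I am always prepared"),
    ("neuroticism", +1, "I get stressed out easily"),
]

# Assumed heuristic: a statement is self-referencing if it has a first-person word.
FIRST_PERSON = {"i", "i'm", "i've", "me", "my", "myself"}


def tokens(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def cosine(a, b):
    """Bag-of-words cosine similarity (a stand-in for semantic similarity)."""
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[t] * cb[t] for t in ca)
    norm = math.sqrt(sum(v * v for v in ca.values())) * math.sqrt(
        sum(v * v for v in cb.values())
    )
    return dot / norm if norm else 0.0


def is_self_referencing(statement):
    return any(t in FIRST_PERSON for t in tokens(statement))


def match_statements(statements, items=ITEMS, threshold=0.5):
    """Pair each self-referencing statement with its most similar item."""
    matches = []
    for s in statements:
        if not is_self_referencing(s):
            continue
        st = tokens(s)
        best = max(items, key=lambda it: cosine(st, tokens(it[2])))
        sim = cosine(st, tokens(best[2]))
        if sim >= threshold:
            matches.append((s, best, sim))
    return matches


def trait_scores(matches):
    """Similarity-weighted mean of keying directions per trait, in [-1, 1]."""
    sums, weights = defaultdict(float), defaultdict(float)
    for _, (trait, key, _), sim in matches:
        sums[trait] += key * sim
        weights[trait] += sim
    return {t: sums[t] / weights[t] for t in sums}


statements = [
    "I get stressed out easily at work",
    "The weather is nice today",
    "I am quiet around strangers",
]
matches = match_statements(statements)
scores = trait_scores(matches)
```

In this toy run the second statement is skipped because it contains no first-person word, and the remaining two are matched to a neuroticism item and a reverse-keyed extraversion item. A realistic implementation would replace the bag-of-words cosine with a sentence-embedding similarity and map the scores onto the questionnaire's response scale.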

Publication
Future Generation Computer Systems