Mastodawn

Oscar Kjell Jan 14, 2023

Language-based assessments can also be self-descriptive, moving beyond mere scores as the output.

For example, statistically significant descriptive words and key phrases can be visualized based on their underlying meaning along relevant dimensions.

The figure shows AI-generated summaries of the ten most negative and positive answers to: How are you feeling?

7/9

Oscar Kjell Jan 14, 2023

@handyschwartz @katarinakjell

Approaching a Theoretical Upper Limit
Language responses analyzed with #LargeLanguageModels can predict rating scales with a Pearson r of .85 🎯

This high accuracy approaches the rating scale's own reliability, i.e., a theoretical upper limit (r = .71–.84).

Rating scales are only a “proxy” and not a perfect true score of a psychological construct – so future research should move beyond only predicting rating scales.
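
As a rough illustration of the metric mentioned above, here is a minimal sketch of how a Pearson r between language-based predictions and rating scale scores could be computed. The function name and the numbers are hypothetical and made up for illustration; they are not data or code from the paper.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2 values)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squares of the deviations from the means.
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cross / math.sqrt(ss_x * ss_y)


# Hypothetical values: scores predicted from text vs. observed rating scale totals.
predicted = [12.0, 18.5, 25.0, 31.0, 40.5]
observed = [10, 20, 24, 35, 38]

r = pearson_r(predicted, observed)
print(f"Pearson r = {r:.2f}")
```

An r obtained this way would then be compared against the rating scale's own reliability, which acts as the ceiling described above.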

6/9

Oscar Kjell Jan 14, 2023

@handyschwartz @katarinakjell

#LargeLanguageModels
#LargeLanguageModels have led to unprecedented accuracy on most computerized language processing tasks (https://gluebenchmark.com/leaderboard).

They largely owe their success to their ability to statistically model words in the context they are used.

Bringing such context to psychological text analysis can more precisely quantify the specific meaning of language and yield a truer understanding of the person behind the words.

5/9

GLUE Benchmark

The General Language Understanding Evaluation (GLUE) benchmark is a collection of resources for training, evaluating, and analyzing natural language understanding systems

Oscar Kjell Jan 14, 2023

@handyschwartz @katarinakjell
Favorable Measurement Characteristics
More information comes from having favorable measurement characteristics, including high:

range (e.g., absolutely loves and hates),
resolution (e.g., cherishes, loves, adores, likes), and
multi-dimensionality (e.g., love, excitement, joy, awe).

3/9

Oscar Kjell Jan 14, 2023

@handyschwartz @katarinakjell

Information-Rich Language

We observe that language responses carry many times more information than rating scale responses. 100 participants answered "How are you feeling?" in an open response box and filled out a standard rating scale about their feelings (the PANAS):

The language-based responses included 4.8 times more information than the rating scale responses.

2/9

Oscar Kjell Jan 14, 2023

1/9
How well can #PsychologicalConstructs be measured by analyzing natural language using #AI?

Our viewpoint is that #LargeLanguageModels provide the missing piece for natural language responses to replace closed-ended rating scales.

#LargeLanguageModels can accurately transform the rich information in natural language to psychological construct scores with high validity.

#ML #NLP #Rstats #Rtext
https://psyarxiv.com/yfd8g
with @handyschwartz @katarinakjell

Oscar Kjell Dec 8, 2022

@eikofried check out this very stylish Swedish company https://www.eiko.se/ 👊 🇸🇪

website	https://oscarkjell.se
#Rtext	http://r-text.org/index.html
orcid	https://orcid.org/0000-0002-2728-6278
osf	https://osf.io/c4kdr/
