My first solo-authored publication just appeared in *Linguistic Typology*: "The over-representation of phonological features in basic vocabulary doesn’t replicate when controlling for spatial and phylogenetic effects"

Running a #Bayesian model with #Lexibank data, I show that most previously observed effects that have been attributed to sound symbolism do **not** replicate. A handful of effects emerge as highly stable, though, mostly related to body parts and the pronominal system.

#linguistics #replication #typology #science #statistics

> https://doi.org/10.1515/lingty-2025-0050

The statistical over-representation of certain phonological features in the basic vocabulary of languages is often interpreted as reflecting potentially universal sound symbolic patterns. However, most of these cases have not been tested explicitly for reproducibility and might be prone to biases in the study samples or models. Many studies on the topic do not adequately control for genealogical and areal dependencies between sampled languages, casting doubts on the robustness of the results. In this study, I test the robustness of a recent study on sound symbolism in basic vocabulary concepts which analyzed 245 languages. This paper adds a new sample of 2,864 languages from Lexibank. I modify the original model by adding statistical controls for spatial and phylogenetic dependencies between languages. The new results show that most of the previously observed patterns are not robust, and in fact many patterns disappear completely when adding the genealogical and areal controls. A small number of patterns, however, emerges as highly stable even with the new sample. Through the new analysis, it is possible to assess the distribution of sound symbolism on a larger scale than previously. The study further highlights the need for testing all universal claims on language for robustness on various levels.

De Gruyter Brill

A new blog post, written together with Olena Shcherbakova (MPI-EVA, Leipzig), appeared today on our #CALC tutorial blog via @hypothesesorg. It shows how the #Lexibank database can be queried for #colexifications of taste terms.

Retrieving and Analyzing Taste Colexifications from Lexibank

https://calc.hypotheses.org/6398/

Colexifications have enjoyed considerable popularity in recent years. However, there are still many semantic domains where little research on colexification patterns has been carried out so far. Here we show how the recently published Lexibank repository can be queried to yield data on taste colexifications, which can in turn […]

Computer-Assisted Language Comparison in Practice