This week for #TrainingTuesday, our friends over at Programming Historian have a lesson designed to get you started with #WordEmbedding models 👀

➡️ Browse this resource: https://campus.dariah.eu/resource/posts/understanding-and-creating-word-embeddings

🎃 My #HALLOWEEN treat for you is a new #PUBLICATION 🎃

In #ComputationalCommunicationResearch, we present a semantic validation of using #WordEmbedding metrics to assess implicit group #stereotyping in text corpora: https://doi.org/10.5117/CCR2025.1.14.MULL

Last but not least, this validates our own study on "differential #racism in the #news", published in #PoliticalCommunication back in 2023: https://doi.org/10.1080/10584609.2023.2193146

@commodon @communicationscholars #openaccess #preregistered #opendata #openmaterials

New publication from our project on "Implicit and Explicit #Racism in #News and #SocialMedia":

We explore the portrayal of #EthnicGroups in German legacy and #AlternativeMedia in 2022, focusing on the group labels that were most frequently mentioned in the press that year.

The analysis is based on #WordEmbedding models that contrast the group labels' implicit and explicit associations with fear and admiration.
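The contrast described above can be sketched in a few lines: compare how close a group label sits to a fear-related word set versus an admiration-related word set in embedding space. Everything here is illustrative (toy random vectors, hypothetical word lists); the actual study trains embeddings on the 2022 news corpus.

```python
import numpy as np

# Toy embeddings standing in for vectors trained on a news corpus
# (hypothetical vocabulary and dimensionality, for illustration only)
rng = np.random.default_rng(42)
vocab = ["grouplabel", "fear", "threat", "admiration", "respect"]
emb = {w: rng.normal(size=50) for w in vocab}

def cos(a, b):
    """Cosine similarity between two embedding vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def association(label, attribute_words):
    """Mean cosine similarity of a group label to an attribute word set."""
    return float(np.mean([cos(emb[label], emb[w]) for w in attribute_words]))

# Relative association: positive values = closer to fear than to admiration
bias = (association("grouplabel", ["fear", "threat"])
        - association("grouplabel", ["admiration", "respect"]))
```

Since each cosine lies in [-1, 1], the difference score is bounded in [-2, 2]; comparing such scores across group labels (and across implicit vs. explicit attribute lexicons) is the kind of contrast the analysis draws.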

https://doi.org/10.25521/mzesfokus.2023.275

@commodon @communicationscholars #Journalism

Implicit and Explicit Stigmatization of Ethnically Read Groups in the German Media Public Sphere in 2022 | MZES Fokus

#OutNow: My first co-authored article in a peer-reviewed journal. We investigate how different computational approaches fare with complex literary texts: https://doi.org/10.3389/fdata.2022.886362 (open access).

Summary in thread 🧵

@sociology @oneabstractaday #socialscience #methods #textanalysis #dictionary #wordembedding #scaling #machinelearning #sentiment #complex #literature

Examining Sentiment in Complex Texts. A Comparison of Different Computational Approaches

Can we rely on computational methods to accurately analyze complex texts? To answer this question, we compared different dictionary and scaling methods used in predicting the sentiment of German literature reviews to the “gold standard” of human-coded sentiments. Literature reviews constitute a challenging text corpus for computational analysis as they not only contain different text levels—for example, a summary of the work and the reviewer's appraisal—but are also characterized by subtle and ambiguous language elements. To take the nuanced sentiments of literature reviews into account, we worked with a metric rather than a dichotomous scale for sentiment analysis. The results of our analyses show that the predicted sentiments of prefabricated dictionaries, which are computationally efficient and require minimal adaptation, have a low to medium correlation with the human-coded sentiments (r between 0.32 and 0.39). The accuracy of self-created dictionaries using word embeddings (both pre-trained and self-trained) was considerably lower (r between 0.10 and 0.28). Given the high coding intensity and contingency on seed selection as well as the degree of data pre-processing of word embeddings that we found with our data, we would not recommend them for complex texts without further adaptation. While fully automated approaches appear not to work in accurately predicting text sentiments with complex texts such as ours, we found relatively high correlations with a semiautomated ap...
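The validation logic in the abstract can be sketched minimally: score texts with a sentiment dictionary on a metric scale, then correlate the predictions with human-coded sentiments via Pearson's r. The dictionary entries, review snippets, and human codes below are all hypothetical; the study uses full German literature reviews and established dictionaries.

```python
import numpy as np

# Tiny illustrative sentiment dictionary (hypothetical entries)
pos = {"brilliant", "moving", "masterful"}
neg = {"tedious", "flat", "contrived"}

def dict_score(text):
    """Metric sentiment: (positive hits - negative hits) / token count."""
    tokens = text.lower().split()
    return (sum(t in pos for t in tokens) - sum(t in neg for t in tokens)) / len(tokens)

reviews = [
    "a brilliant and moving novel",
    "tedious plotting and flat characters",
    "masterful prose but contrived ending",
]
predicted = np.array([dict_score(t) for t in reviews])
human = np.array([0.8, -0.7, 0.1])  # hypothetical human-coded sentiments

# Pearson correlation between dictionary predictions and human codes
r = np.corrcoef(predicted, human)[0, 1]
```

Comparing such r values across prefabricated dictionaries, embedding-based self-created dictionaries, and (semi)automated scaling methods is the comparison the paper reports.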

Frontiers