Published at #IRRJ: "Exploring Embedding Interpretability by Correspondences Between Topic Models and Text Embeddings" by Meng Yuan, Lida Rashidi, and Justin Zobel. #InformationRetrieval, #EmbeddingInterpretability, #Explanability, #TopicModelling

https://doi.org/10.54195/irrj.23703

Exploring Embedding Interpretability by Correspondences Between Topic Models and Text Embeddings | Information Retrieval Research

Although #Rstats is the poor cousin of #Python in text processing (#NLP), nothing prevents you from using it for topic modelling (#topicmodelling).

In this example, I also show how to submit the result of the analysis to a generative text #AI based on an #LLM.

Test corpus: French literature from the second half of the 18th century. Perhaps someone can explain to me the strange but quantifiable proximity between Voltaire and the Marquis de Sade...

https://ourednik.info/maps/2024/05/31/topic-modeling-la-modelisation-thematique-avec-r-et-quanteda

Playful Technology Limited ~ Latent Semantic Indexing

Reducing the dimensionality of language data

Curious about #TextAnalysis? Do you want to be able to discuss the differences between #CorpusLinguistics and #DataScience? In this #TrainingTuesday resource from #DiMPAH and #DARIAHTeach, learn how to apply text analysis techniques, such as #SentimentAnalysis and #TopicModelling, to a dataset:

https://campus.dariah.eu/resource/posts/text-analysis-linguistics-meets-data-science

Text Analysis - Linguistics Meets Data Science | DARIAH-Campus

DARIAH-Campus is a discovery framework and hosting platform for DARIAH learning resources.


Would anyone have some example code showing how to approach #semisupervised #topicmodelling with #BERTopic on a dataset of articles and book-length documents, for document-level #classification and analysis? I am interested in analyzing how document topics change over time.

https://maartengr.github.io/BERTopic/getting_started/semisupervised/semisupervised.html

Semi-supervised Topic Modeling - BERTopic

Leveraging BERT and a class-based TF-IDF to create easily interpretable topics.
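
One possible starting point, offered as a rough sketch under assumptions rather than a tested recipe: split each long document into passages, mark passages from unlabelled documents with -1 (the value BERTopic's semi-supervised mode uses for "no label"), fit on the passages, then take a majority vote per document. The helper names (`chunk_document`, `prepare_corpus`, `document_topics`, `fit_semisupervised`), the 200-word passage size, the majority vote, and the use of publication year as the timestamp are all illustrative choices, not part of BERTopic. `fit_semisupervised` is defined but not called here, since it needs `bertopic` installed and a realistically sized corpus.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Optional, Sequence, Tuple

UNLABELLED = -1  # BERTopic's semi-supervised mode reads -1 as "no label known"


def chunk_document(text: str, max_words: int = 200) -> List[str]:
    """Split one long document into passages of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def prepare_corpus(
    documents: Sequence[Tuple[str, int, Optional[int]]],
    max_words: int = 200,
) -> Tuple[List[str], List[int], List[int], List[int]]:
    """Turn (text, year, label-or-None) records into parallel passage-level lists.

    Returns passages, years, labels, and the index of each passage's source document.
    """
    passages, years, labels, doc_ids = [], [], [], []
    for doc_id, (text, year, label) in enumerate(documents):
        for passage in chunk_document(text, max_words):
            passages.append(passage)
            years.append(year)
            labels.append(UNLABELLED if label is None else label)
            doc_ids.append(doc_id)
    return passages, years, labels, doc_ids


def document_topics(topics: Sequence[int], doc_ids: Sequence[int]) -> Dict[int, int]:
    """Assign each document the most frequent topic among its passages."""
    votes = defaultdict(Counter)
    for topic, doc_id in zip(topics, doc_ids):
        votes[doc_id][topic] += 1
    return {doc_id: counter.most_common(1)[0][0] for doc_id, counter in votes.items()}


def fit_semisupervised(passages: List[str], years: List[int], labels: List[int]):
    """Fit BERTopic with partial labels, then track topics over time (not run here)."""
    from bertopic import BERTopic  # imported lazily so the helpers work without it

    topic_model = BERTopic()
    topics, _ = topic_model.fit_transform(passages, y=labels)
    over_time = topic_model.topics_over_time(passages, years)
    return topic_model, topics, over_time
```

With a real corpus, the flow would be `prepare_corpus` → `fit_semisupervised` → `document_topics(topics, doc_ids)`. Chunking matters for book-length texts because sentence-embedding models truncate long inputs, so a whole book passed as one string would mostly be ignored.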

@dajb Though I’m still sore from the overwhelming LLM chatter of the past few months, I did think about them while reading your piece…
Had neat experiences with other AI/ML approaches to classification. (#TopicModelling methods such as #LatentDirichletAllocation.)

For collaborative design, it does sound like nonhuman actors can contribute. Was also thinking about the iterative nature of design approaches. And about fluid categories.
As we say in #LinguisticAnthropology:

> All grammars leak

One week in Mastodon, a good feel, seems great for geeking about #geospatial #geoviz #dataviz #coding.

I would like to connect further with people in #NLP #NaturalLanguageProcessing #NLPTransformers #TopicModelling #BERTopic.

Also I would like to connect with #Geospatial people in #Australia.

Pass me a follow 🤓👋