🚀 TopicWatchdog – Week 3: Stable Topics with BERTopic

KMeans worked, but cluster IDs kept jumping across retrains. This week I added a Python BERTopic stage with a BigQuery registry → stable topic IDs!

🟢 UMAP + HDBSCAN
🟢 Stable IDs via registry
🟢 Auto-labels with Gemini
🟢 Looker Studio dashboards

📊 3,802 topics → 2,472 mapped. Top clusters: migration, economy, climate, politics.

👉 Blog: https://dracoblue.net/dev/topicwatchdog-stable-topics-with-bertopic/

#TopicWatchdog #BERTopic #BigQuery #Clustering #MachineLearning #FediScience

Week 3: Stable Topics with BERTopic / Articles / dracoblue.net

In Week 1 (extraction) and Week 2 (embeddings + KMeans in BigQuery ML) we laid the groundwork. This week I built a Python BERTopic stage whose IDs stay stable across runs by mapping BERTopic’s internal clusters to stable topic IDs in BigQuery. I use Go...
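The registry idea above can be sketched roughly like this. This is not the code from the blog post: it is a minimal illustration that assumes each cluster is represented by a centroid vector, that matching uses cosine similarity against a threshold, and that the registry is a plain dict standing in for the BigQuery table. The function names, the 0.85 threshold, and the integer ID scheme are all hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def assign_stable_ids(centroids, registry, threshold=0.85):
    """Map run-local cluster IDs to stable topic IDs.

    centroids: {run_local_cluster_id: vector} from the current run.
    registry:  {stable_topic_id: vector}, an in-memory stand-in for the
               registry table (assumption); new topics are added in place.
    """
    mapping = {}
    taken = set()  # each stable ID is handed out at most once per run
    next_id = max(registry, default=0) + 1

    for local_id, vec in centroids.items():
        best_id, best_sim = None, threshold
        for stable_id, ref in registry.items():
            if stable_id in taken:
                continue
            sim = cosine(vec, ref)
            if sim >= best_sim:
                best_id, best_sim = stable_id, sim

        if best_id is None:
            # No registered topic is similar enough: register a new one.
            best_id = next_id
            next_id += 1
            registry[best_id] = vec

        taken.add(best_id)
        mapping[local_id] = best_id

    return mapping
```

Whatever numbering a retrain produces, a cluster that lands close to an already registered centroid keeps its old stable ID, and only unmatched clusters get a fresh one.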

TopicWatchdog (Week 2)

Move beyond string matching: cluster political topics from German shorts with embeddings + KMeans in BigQuery ML.

642 videos → 620 topics → 20 clusters (AfD positions, debates, labor market, climate/economy, corruption …).

Blogpost with SQL & charts:
🔗 https://dracoblue.net/dev/topicwatchdog-embeddings-and-kmeans-clustering-of-topics-claims/

#TopicWatchdog #BigQuery #Embeddings #Clustering #MachineLearning #FediScience

Week 2: Embeddings & KMeans Clustering of Topics/Claims / Articles / dracoblue.net

This post documents Week 2 of the TopicWatchdog project. Last week we successfully extracted topics and claims from German political short videos and persisted them in BigQuery. However, topics often appeared under slightly different names, making aggre...

Kickoff: TopicWatchdog (Week 1)

I’m building a reproducible pipeline that collects German political shorts, transcribes them, extracts topics & claims with timestamps, and stores everything in BigQuery for transparent analysis.
First finding: topic names are unstable → next up: embeddings + clustering in BigQuery ML.

Full write-up:

https://dracoblue.net/dev/kickoff-topicwatchdog-extracting-topics-and-claims-from-german-politics-videos/

#TopicWatchdog #BigQuery #MachineLearning #LLM #OpenData #FediScience