29 years of #rstats community knowledge was sitting in hard-to-search pipermail archives. So I built a more modern home for it.

Introducing the R Mailing List Archives: 631,000+ messages from 32 lists, fully searchable and available as open data.

https://r-mailing-lists.thecoatlessprofessor.com/

Every message is parsed, threaded, and indexed. You can browse threads, see who replied to whom, and actually follow conversations that shaped the language.
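
For a rough idea of what "threaded" means here, the sketch below groups messages into reply trees by matching each message's In-Reply-To header against known Message-IDs. This is not necessarily how the site does it: the field names `message_id` and `in_reply_to` are assumptions for illustration, and real archives often need fallbacks (the References header, subject matching) that this leaves out.

```python
from collections import defaultdict


def build_threads(messages):
    """Group messages into reply trees.

    Each message is a dict with a "message_id" key and an optional
    "in_reply_to" key (both field names are assumed, not the real schema).
    Returns (roots, children): the ids that start a thread, and a mapping
    from a parent id to the ids of its direct replies.
    """
    by_id = {m["message_id"]: m for m in messages}
    children = defaultdict(list)
    roots = []
    for m in messages:
        parent = m.get("in_reply_to")
        if parent in by_id:
            # The parent is in the archive: record who replied to whom.
            children[parent].append(m["message_id"])
        else:
            # No parent, or the parent is missing: treat it as a thread start.
            roots.append(m["message_id"])
    return roots, dict(children)
```

Walking `children` from each root gives the "who replied to whom" view of a conversation.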

Here's a recent R-SIG-Mac thread about macOS 26:

https://r-mailing-lists.thecoatlessprofessor.com/lists/r-sig-mac/msg/msg-0b1a4d0c59cf/

Want to do your own analysis? The full archive is available as Apache Parquet files, updated nightly via GitHub Actions.

One-liner to load any list in R or Python. No cloning required.

https://github.com/r-mailing-lists/data

If you've ever wished you could grep through R-help, find that one Brian Ripley reply about CRAN policy from 2009, or just see who the top contributors to R-SIG-Finance were... now you can.
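
As a hedged sketch of the "top contributors" idea in Python: the loader below takes whatever Parquet path or URL you copy from the data repository (none is hard-coded here), and the column name `from_name` is an assumption, so check the actual schema before relying on it. `load_records` and `top_contributors` are hypothetical helper names, not part of the project.

```python
from collections import Counter


def load_records(parquet_source):
    """Read one list's Parquet file into a list of dict records.

    `parquet_source` is a local path or URL supplied by the caller.
    Needs pandas plus a Parquet engine such as pyarrow.
    """
    import pandas as pd  # imported lazily so the counting helper works without it

    return pd.read_parquet(parquet_source).to_dict("records")


def top_contributors(records, n=5, author_field="from_name"):
    """Return the n most frequent senders as (name, message_count) pairs."""
    # "from_name" is an assumed column name, not taken from the real schema.
    names = (r.get(author_field) for r in records)
    # Keep only non-empty strings so missing values are not counted.
    counts = Counter(name for name in names if isinstance(name, str) and name)
    return counts.most_common(n)
```

Usage would look like `top_contributors(load_records(parquet_source), n=10)`, with `parquet_source` pointing at the file for the list you care about.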

Blog post with all the details: https://blog.thecoatlessprofessor.com/posts/r-mailing-list-archives/

@coatless I noticed that a few contributors are duplicated. How can we help with those?
In the blog post I only found: "The current alias file covers 82 groups spanning 928 email hashes, focusing on prolific contributors where fragmented identities are most visible. This is ongoing work and contributions are welcome."
The alias file link there points to a 404 page: "The main branch of data does not contain the path aliases.json." Great work!

@coatless this is fantastic! It's great to see that all of the mailing lists, including some of the really obscure ones, are here!

Incidentally, the AI ads are _terrifying_