RE: https://mastodon.social/@glyph/116220498263137455

The thing that would ameliorate my daily crash-outs is not so much a dozen enthusiasts posting "that's it I'm done with AI" but to have a single one post something like "in my considered opinion the risks *are* worth it, but I recognize that they exist, and here's the safety protocol I use to make sure it's not getting the better of me".

Despite feeling quite intensely about all of this I try to include caveat after caveat and qualification after qualification because I strongly suspect that, one day, LLMs *will* be a normal technology. Someone will develop a safety protocol, someone will find ways to measure the output, we'll get local models that are actually competitive and the moral equivalent of a dosimeter for token exposure. And on that day someone is going to say to me "See glyph? It's fine! They're not *useless*!"
@glyph the disconnect I see here: LLMs (the underlying models themselves) probably are a normal technology. LLM-powered chatbots, on the other hand, aren't and never will be. There are plenty of other ways to use the models, but they don't hack the brain to seem magic, so the industry doesn't care about them.

@danielleigh @glyph Agreed. The "brains" of an LLM haven't changed much since 2017's "Attention Is All You Need." I can spot some training-efficiency boosts, but in terms of theoretical capability, transformers are no better than traditional neural networks. What has changed is the training data (much larger), the availability (OpenAI burning money to allow free access), and the presentation (chatting person-to-person).

Early LLMs like BERT claimed to be trained on public-domain material (unlikely), weren't accessible by the general public, and weren't treated as persons. Is that LLM "good," in an ethical sense? I think you could make a case.

https://en.wikipedia.org/wiki/BERT_(language_model)

@hjhornbeck @danielleigh @glyph

I do think it's possible to train models ethically, and I appreciate Swiss AI's approach to developing the Apertus model (https://www.swiss-ai.org/apertus), released in October 2025:

> … the entire development process, including its architecture, model weights, and training data and recipes, is openly accessible and fully documented.

(I recognize that being ethically trained isn't solving the same problem as being safe from hacking your brain.)

@kcase @danielleigh @glyph Here's why I said "unlikely" above:

Bandy, Jack, and Nicholas Vincent. "Addressing 'Documentation Debt' in Machine Learning: A Retrospective Datasheet for BookCorpus." https://openreview.net/forum?id=Qd_eU1wvJeu

BookCorpus was billed as "free books written by yet unpublished authors," but it contained published, copyrighted material. Swiss AI seems to be falling into the same trap: its training set is a web scrape "filtered to respect machine-readable opt-out requests from websites," but there's a nontrivial chance that still includes copyrighted material. The bigger the dataset, the tougher it is to verify everything is legit, and they claim a training set of 15 trillion tokens. BookCorpus was under a trillion tokens and collected from one or two websites.
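For a sense of what "machine-readable opt-out requests" means mechanically, here's a minimal sketch using Python's standard-library robots.txt parser. The crawler name and robots.txt contents are made up for illustration; a real pipeline like Swiss AI's is presumably far more involved, and note this only catches sites that publish an opt-out, not copyrighted material in general:

```python
from urllib.robotparser import RobotFileParser

def allowed_for_training(robots_lines, user_agent, url):
    """Return True if the given crawler user agent may fetch the URL,
    according to the site's robots.txt contents (one string per line)."""
    rp = RobotFileParser()
    rp.parse(robots_lines)
    return rp.can_fetch(user_agent, url)

# Hypothetical robots.txt: opts out of one AI crawler, allows everyone else.
ROBOTS = [
    "User-agent: ExampleAICrawler",  # made-up crawler name
    "Disallow: /",
    "",
    "User-agent: *",
    "Allow: /",
]

allowed_for_training(ROBOTS, "ExampleAICrawler", "https://example.com/page")  # False
allowed_for_training(ROBOTS, "OtherBot", "https://example.com/page")          # True
```

The catch, of course, is that this check is entirely voluntary on the scraper's side, and silence (no robots.txt at all) gets treated as consent.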

@hjhornbeck That's a good qualifier!

To clarify my own comment, I think it's _possible_, but I don't know if it's actually been done successfully (by Swiss AI or anyone else). I feel like a model could be trained pretty well just using older material explicitly in the public domain (e.g. from Project Gutenberg), along with material explicitly made available for it to use. (E.g. our business welcomes models being trained using our reference manuals and support articles and MIT-licensed code.)