Mastodawn

sidereal Mar 1

LLM boosters: This is trained with all of the text and code on the public internet!

People who can fucking think: So it's extremely low quality, then?

sidereal Mar 1

"The average of everything on GitHub" isn't badass code, it's unfinished student projects that never worked.

sidereal Mar 1

We hear "all the text on the internet" and we might think "well, they've sure digitized a lot of classic books, that must be good, right?" but then we realize that all of the books are maybe like 1% of the data. Most of it is like Facebook Messenger breakup arguments and semi-literate emails.

prom™️

@sidereal Do they even tell us what they put in our mashed potatoes? For food, that's mandatory.

cmw Mar 1

@promovicz
Has anyone made analogies yet between large language models and Soylent Green?
@sidereal