LLM boosters: This is trained with all of the text and code on the public internet!

People who can fucking think: So it's extremely low quality, then?

"The average of everything on github" isn't badass code, it's unfinished student projects that never worked.
We hear "all the text on the internet" and might think, "Well, they've sure digitized a lot of classic books, that must be good, right?" But then we realize that all of the books are maybe like 1% of the data. Most of it is like Facebook Messenger breakup arguments and semi-literate emails.
Humans are relying on computers to correct their grammar when humans don't even agree on what proper English grammar is smdh
@sidereal
I was always taught that people make grammar, so I was really surprised to see people ask the machine how to do grammar
@sidereal There are major differences within the same country (for instance, between Southern and Northern England).

@sidereal

Even if it was just a bunch of literature, I hate to tell the LLM-pilled folks, but nobody goes to the bathroom in a novel, so 🤷‍♂️

@sidereal That's kind of the thing though. What the existentialists called "bad faith", the human social fallibility that Kafka was satirizing.