People get mad when you call LLMs "spicy autocomplete" but my investigations into recreating and implementing small versions of this tech make me think that nick name is very accurate.

Basically, it's a method to predict the next content in a text file. The whole conversation between you and the LLM is one file, and the LLM tries to find the most likely next text based on the training data.

There is something significant here: LLMs were trained on internet forums and social media.

Thus the training data didn't just contain text, but rather text where each passage is tagged and attributed to a particular user.

This aspect of the training data was critical in creating the illusion of talking to another person.

An LLM doesn't just predict the next text. It predicts the next text that might come from another user. You need to hard code this in to make it work well.

Leave it out and there is no conversation.

For example if I give an LLM without user seperation this text:

"It's a lovely day." It might continue with "The sun was shining."

But with user separation it focuses on responses to "it was a lovely day" from other users and the training data might suggest "I agree, it's wonderful weather."

So interaction with an LLM is like posting on a forum, it gives you and average of typical responses with one small change: most LLMs have a strong positivity bias programmed in.

Because let's be real, if you posted "It's a lovely day." on an internet forum you might get a response like "No it's not, noob."

LLMs are heavily weighted to give supportive, and constructive responses.

I wonder what they might be like without these limitations? Without the limitation to make the response from another user they might be much less deceptive.

That they are popular shows that many people just want a nice moderated online community where people treat each other with respect.

@futurebird this is a simplistic view – that it’s all about token prediction of similar vectors using gradient descent to arrive at the more likely next token to place in line with what is already there. There’s also RLHF – reinforcement learning through human feedback. This involves the human dressing up as a wizard and sitting behind the curtain ensuring that all the responses are the sort of responses that the wizard I mean human would actually prefer and approve of. Technically, this is achieved using a lot of smoke and some carefully placed bidirectional mirrors which act as beam splitters to construct a hologram which fools people into thinking that the machine did it all, instead of poorly paid workers.

@u0421793

Ian, you had me going for a moment there. I was like "how do they keep finding me? why are they like this all the time???"

😆

@futurebird I honestly think (unpopular opinion here) that most of the cost of LLM-based AI thus far is in ‘training’. Not training as in running the phenomenal amount of harvested stolen text and image input through tokenisation processes and reward giving through weight assignment and vector assessment, using more GPUs than exist on Earth, but rather, lots and lots and lots of money paying humans to fake it all and build in patches – patch after patch on top of patch of corrective behaviour, encoded themselves as vector weights. The training had nothing much to do with running it all through GPUs, I believe that probably took an embarassing but totally affordable amount of time and energy. I believe (with no visible means of factual reference to cite) that most of the expenditure of these capital-burning companies was ‘training’ by paying humans and then encoding their resulting guidance. Paying workers.
@u0421793 @futurebird Yes. Or getting humans to do that labor for no pay.