People get mad when you call LLMs "spicy autocomplete", but my investigations into recreating and implementing small versions of this tech make me think that nickname is very accurate.

Basically, it's a method to predict the next content in a text file. The whole conversation between you and the LLM is one file, and the LLM tries to find the most likely next text based on the training data.
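To make the "autocomplete" idea concrete, here is a minimal sketch of next-word prediction by counting. It is a toy bigram model, not how any real LLM is implemented (those use neural networks over tokens), and the corpus and function names are made up for illustration.

```python
from collections import Counter, defaultdict

def train_bigrams(corpus):
    """Count which word follows which across all training sentences."""
    following = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        for current, nxt in zip(words, words[1:]):
            following[current][nxt] += 1
    return following

def autocomplete(following, prompt, max_words=5):
    """Greedily append the most frequent next word, one word at a time."""
    words = prompt.lower().split()
    for _ in range(max_words):
        candidates = following.get(words[-1])
        if not candidates:
            break  # nothing in the training data ever followed this word
        words.append(candidates.most_common(1)[0][0])
    return " ".join(words)

# Made-up training data, purely for illustration.
corpus = [
    "the sun was shining",
    "the sun was warm",
    "the sun was shining brightly",
]
model = train_bigrams(corpus)
print(autocomplete(model, "the sun"))
```

The whole "file so far" goes in, the likeliest continuation comes out, and the loop repeats with the longer text.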

There is something significant here: LLMs were trained on internet forums and social media.

Thus the training data didn't just contain text, but rather text where each passage is tagged and attributed to a particular user.

This aspect of the training data was critical in creating the illusion of talking to another person.

An LLM doesn't just predict the next text. It predicts the next text that might come from another user. You need to hard code this in to make it work well.

Leave it out and there is no conversation.

For example, if I give an LLM without user separation this text:

"It's a lovely day." It might continue with "The sun was shining."

But with user separation it focuses on responses to "It's a lovely day." from other users, and the training data might suggest "I agree, it's wonderful weather."
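A rough sketch of what "user separation" can look like in the text the model actually sees. The `<user>` and `<assistant>` tags here are hypothetical; real chat models each use their own template and special tokens, so treat this only as an illustration of the idea.

```python
def render_conversation(turns, next_speaker):
    """Flatten a conversation into one text 'file' with speaker tags.

    The trailing open tag is what asks the model to continue as
    `next_speaker` rather than carrying on the previous speaker's text.
    """
    lines = [f"<{speaker}> {text}" for speaker, text in turns]
    lines.append(f"<{next_speaker}>")
    return "\n".join(lines)

turns = [("user", "It's a lovely day.")]

# Without separation the model just sees a sentence to continue.
without_separation = turns[0][1]
# With separation it sees a post, followed by a slot for someone else's reply.
with_separation = render_conversation(turns, next_speaker="assistant")

print(without_separation)
print(with_separation)
```

Same prediction machinery either way; the only difference is that the second prompt ends at the start of a different speaker's turn.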

So interacting with an LLM is like posting on a forum. It gives you an average of typical responses, with one small change: most LLMs have a strong positivity bias programmed in.

Because let's be real, if you posted "It's a lovely day." on an internet forum you might get a response like "No it's not, noob."

LLMs are heavily weighted to give supportive and constructive responses.

I wonder what they might be like without these limitations? Without the limitation to make the response from another user they might be much less deceptive.

That they are popular shows that many people just want a nice moderated online community where people treat each other with respect.

@futurebird apologies for pedantic-quibbling on your thread twice, but…

LLMs in their platonic form are not weighted in this manner; they are exactly as you have imagined here: they reproduce the statistical distribution of tokens in their training corpus.

If you've never played around with GPT-2 or GPT-3 (from the era before we had GPT-3.5 and from there "ChatGPT"), they often would do *precisely* this sort of direct, non-conversational continuation. You could feed in a sentence or two and get "autocomplete", or you could feed in `<html><body><span>Lorem ipsum` and get a plausible-looking continuation of an HTML document (or whatever).

Once "Chat" models (and the paradigm shift to RLHF to "fine-tune" model performance) showed up, we started seeing the conversational pattern. I don't know the details there, but there is definitely a distinct line between when we first started seeing "LLMs" and when we started seeing models arranged explicitly around a conversational format.

@SnoopJ @futurebird It's pretty straightforward to play with "raw" LLMs, e.g. with ollama or llama.cpp.

BTW, if we're being pedantic, "they reproduce the statistical distribution of tokens in their training corpus" isn't quite right. Inductive bias is crucial; otherwise the model grinds to a halt on novel inputs. (And I'd really like to know what it looks like when you do this but I don't have the resources to find out.)
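A toy illustration of the "grinds to a halt" point, under heavy assumptions: a model that only looks up counts has nothing to say about a word it never saw, while even a crude built-in assumption (here, falling back to the most common word overall) keeps it going. This says nothing about how real LLMs generalize; the corpus and function names are invented for the example.

```python
from collections import Counter, defaultdict

def train(corpus):
    """Collect word counts and next-word counts from the training text."""
    unigrams = Counter()
    bigrams = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        unigrams.update(words)
        for current, nxt in zip(words, words[1:]):
            bigrams[current][nxt] += 1
    return unigrams, bigrams

def next_word_lookup(bigrams, word):
    """Pure lookup: returns None for any word absent from the training data."""
    candidates = bigrams.get(word)
    return candidates.most_common(1)[0][0] if candidates else None

def next_word_backoff(unigrams, bigrams, word):
    """Same lookup, but falls back to the overall most common word."""
    guess = next_word_lookup(bigrams, word)
    if guess is not None:
        return guess
    return unigrams.most_common(1)[0][0]

# Made-up training data, purely for illustration.
corpus = ["the sun was shining", "the day was lovely", "the sky"]
unigrams, bigrams = train(corpus)

print(next_word_lookup(bigrams, "magikarp"))
print(next_word_backoff(unigrams, bigrams, "magikarp"))
```

The fallback is a (very dumb) bias about what text looks like, which is exactly what the pure lookup table lacks.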

@dpiponi @futurebird I should probably have said *attempt* to reproduce :)

But as you say, novel inputs can be quite tricky, as in the case of the "glitch tokens" of GPTs gone by: https://www.vice.com/en/article/ai-chatgpt-tokens-words-break-reddit/

At the time, they slapped a band-aid on and just fell back onto a generic "an error has occurred" response and no generation if one of those tokens was input. I don't know what the purported solution is to the same problem today, aside from "whatever it is, it's probably rubbish and involves a lot of lying"

@dpiponi @futurebird annoyingly, the LessWrong write-up linked to that 'SolidGoldMagikarp' work is actually quite good, but in the time since that research was published there has been similar research published in more uhhh reputable places, e.g. https://dl.acm.org/doi/full/10.1145/3660799