Mastodawn

eevee 🦊Jan 11, 2023

chatgpt is predictive text.
chatgpt is predictive text.
chatgpt is predictive text.
chatgpt is predictive text.
chatgpt is predictive text.

it's not even answering questions. it just pattern-matches that the next text after something that looks like a question is most often something that looks like an answer

Show thread

SnowDerg ❄️

🪶Jan 11, 2023

@eevee so it's just markov chains on steroids?

Show thread

Graham Sutherland / Polynomial Jan 11, 2023

@SnowDerg @eevee a bit like a markov chain glued to a content recommendation algorithm.

Show thread

Reboot/Fitz Jan 11, 2023

@gsuberland @SnowDerg @eevee glued as in two models or a larger neural network implementing both concepts?

Show thread

Graham Sutherland / Polynomial Jan 11, 2023

@reboot @SnowDerg @eevee essentially both concepts smushed together in a stateful manner as a single model, at least in terms of external behaviour.

the actual architecture isn't the same as those individual components but that's a separate conversation.

Show thread

Reboot/Fitz Jan 11, 2023

@gsuberland @SnowDerg @eevee but, I kinda want to have that conversation... (as a person who does a lot of general CS work.)

Show thread

Graham Sutherland / Polynomial Jan 11, 2023

@reboot @SnowDerg @eevee I'm probably not the right person to talk about it in deep detail since the gory innards of LLMs and autoregressive models aren't my wheelhouse. Someone did post a good article on LLMs the other day but I can't spot it.

I'd recommend searching online for an explainer but unfortunately every single one I found on the first page of Google was a bust due to gushingly anthropomorphising the model in a way that falsely implied an ability to develop understanding.

Show thread

Reboot/Fitz Jan 11, 2023

@gsuberland @SnowDerg @eevee makes sense as any sense that it developed a greater "understanding" would be context kicking in rather than actual learning. Mind expanding L.L.M. for me?

Show thread

Graham Sutherland / Polynomial

@reboot @SnowDerg @eevee LLM = large language model

I find it important to not use the term "understanding" in this context because the model is *not* understanding anything. It can *mimic* understanding through correlative processes, for sure, but the distinction is important.

Show thread

Graham Sutherland / Polynomial Jan 11, 2023

@reboot @SnowDerg @eevee this is why I call it a word recommendation algorithm. you give it a prompt and it correlates that with wording that it has seen before from its massive training corpus of human conversations. from there it can launder those words through a generative language model that "follows" the linguistic structures it has been trained on, to produce a paragraph of text.

Show thread

Reboot/Fitz Jan 11, 2023

@gsuberland @SnowDerg @eevee make sense. In marginally related terms, the responses rejecting generation of text (due to violation of rules, or general non-understanding) would be due to training to generate responses rejecting the given task, right? (This also explains why I can't ask it to, for example, play music.)

Show thread

Graham Sutherland / Polynomial Jan 11, 2023

@reboot @SnowDerg @eevee it cannot draw from anything it hasn't been trained on. it can't create new information or concepts.

the rules limitations are largely artificial, and the general mechanism is to "prime" the context of the session with a list of terms it shouldn't respond to. you can always break out of the restrictions with clever linguistic tricks because it doesn't actually understand the rules, you're just tweaking the context enough to negate the originally provided words.

@reboot @gsuberland @SnowDerg @eevee Yep, it can also be quite easily tricked into doing anything, you just have to hijack the flow of the conversation and "hypnotize" it.

The important thing is, it doesn't have any internal distinction between its own words and what you say besides the context being given to it (probably some header/prefix, which is also how it knows the time lol), so you can quite literally put words in its mouth.

Show thread

Graham Sutherland / Polynomial Jan 11, 2023

@awooo @reboot @SnowDerg @eevee indeed. it's essentially a correlative prediction/recommendation engine for "if these are the words that have been said so far, with these logical delimiters between participants, what comes next?"

@gsuberland @reboot @SnowDerg @eevee Yeah, it's definitely impressive how it can keep the separation between different participants and objects, search for context and follow a logical flow (as in the text, not necessarily in the "thinking" part). But throw enough at it and it will lose track of that and that's when you can inject whatever you want.

@gsuberland @reboot @SnowDerg @eevee I think the question about the fastest marine mammal gives a lot of insight about what it actually does internally. I tested it on GPT-3 (not chatGPT) and tried confronting it about its mistake, instead of apologizing it decided to change the definition of a mammal to fit its wrong answer instead of the other way around.

It's a bit like a human with confirmation bias cranked up to 11.

Show thread

Anca Buit Jan 11, 2023

@awooo @gsuberland @reboot @SnowDerg @eevee Let's put it in charge of everything then. Nothing can possibly go wrong.

Show thread

Graham Sutherland / Polynomial Jan 11, 2023

@reboot @SnowDerg @eevee if you did this with a small corpus it would be laughably bad. but because it has been trained on a metric crapton of human conversation on a huge variety of topics, there's a disparity of sourcing data between what it has "seen" and what you have seen. so even though the actual correlation and generation processes are ultimately simplistic, the huge amount of data it can draw from makes it *seem* convincing.