@ngaylinn @DrewKadel to be fair, most people on the Internet, given a question, write the right answer rather than everyone consistently writing the same wrong answer!
(* For most things, I'm sure there are several compelling counterexamples)
@ngaylinn @anizocani @DrewKadel
I wish more people would understand this.
That's why I really dislike the take that it's "just fancy autocomplete". To have a *really* good autocomplete, you'd have to internalize, in one way or another, much of the world's knowledge. So "just fancy autocomplete" is really not the witty criticism some think it is.
@maltimore
The reason we call it spicy autocomplete is that it is just a prediction model over the material it has ingested. What you seem to miss is that we already had all this with Google. All LLMs have done is make it feel like something is answering, instead of being honest and returning results that match a search query.
@ngaylinn @DrewKadel Well, we have had this before - conceptually it is *exactly* the same as ELIZA, just two orders of magnitude more sophisticated and two orders of magnitude more connected.
At its core, it's really just people lacking technical understanding hallucinating an anthropomorphization of a conditional probability distribution.
With ChatGPT, the interface is the innovation, *not* the model.
@ngaylinn @DrewKadel Using LLMs interactively is surely an innovative way to apply them.
Case in point: When reduced to token completion (which ChatGPT hides away), the magic goes away... quite fast (:
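The "reduced to token completion" point can be made concrete with a toy sketch. Everything here (the vocabulary, the probabilities) is invented for illustration; a real LLM conditions on a long context window, not a single preceding token. But the mechanism is the same: repeatedly sample the next token from a conditional probability distribution.

```python
import random

# Toy conditional distribution P(next_token | current_token).
# Invented for illustration; a real model's distribution comes from
# training on vast text corpora and conditions on the whole context.
BIGRAM = {
    "the":    {"sky": 0.5, "answer": 0.5},
    "sky":    {"is": 1.0},
    "answer": {"is": 1.0},
    "is":     {"blue": 0.6, "42": 0.4},
}

def complete(token, steps=3, rng=random.Random(0)):
    """Extend a prompt by repeatedly sampling a next token."""
    out = [token]
    for _ in range(steps):
        dist = BIGRAM.get(out[-1])
        if dist is None:  # no known continuation: stop
            break
        tokens, probs = zip(*dist.items())
        out.append(rng.choices(tokens, weights=probs)[0])
    return " ".join(out)

print(complete("the"))
```

Watching it pick one token at a time like this is exactly the demystifying experience described above: no answering, just continuation.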
@ngaylinn @ftranschel @DrewKadel
Theatre is an excellent description. There's clearly NLP in front of the prediction engine itself, and some application of post-predictive review steps. But it's all presented as though this composite application is the LLM. It's like attending a seance.
@ngaylinn @ftranschel @DrewKadel
Is the 'man behind the curtain' innovative?
IMHO, it's all more smoke and mirrors to discombobulate an already gaslit world.
I think you need to invent a new word; "orders of magnitude" doesn't quite cover it.
@ngaylinn @DrewKadel
Actually we as a species have had to deal with that before.
We call them grifters.
@bornach @DrewKadel I dunno. A grifter still wants something from you. They're hiding their intent, but it's something you can imagine, understand, and detect if you're paying attention.
LLMs are unreadable and unpredictable because they have no intent. They may switch between friend and grifter depending on what sounds right in the flow of the conversation, without any conception of what's good or bad for you or for them.
On the other hand, if a grifter asks an LLM to write them script to achieve something specific, that's another thing entirely...
@grvsmth @ngaylinn @bornach @DrewKadel
Agree - this is a tool that can be used by grifters. It seems like ChatGPT is a relatively trivial problem, generate responses that fit the pattern of language found in authoratative sources. I believe that Open AI is using it as a demonstration, and generate lots of free media coverage to get paying customers to buy their product. The grift is the misrepresentation.
@ngaylinn
LLMs are never your friend
They are always in what can best be described as "grifter" mode. The entire training regime of a generative AI chatbot is geared towards getting one thing: an upvote from a human rating the quality of the conversation.
Admittedly this is an over-simplification. Reinforcement Learning from Human Feedback involves training a reward model - a second neural network that is ultimately responsible for rewarding the chatbot for giving "good" responses.
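A rough sketch of that two-model setup, with everything a toy stand-in (the scoring rules, responses, and update rule are all invented; this is not anyone's actual training code): a reward model scores candidate responses, and the policy's probabilities are nudged toward whatever scores highest.

```python
def reward_model(response: str) -> float:
    """Stand-in for the learned scorer trained on human upvotes.

    This toy version just happens to prefer confident-sounding text -
    illustrating how optimizing for approval differs from optimizing
    for truth.
    """
    score = 0.0
    if "certainly" in response.lower():
        score += 1.0
    if "i don't know" in response.lower():
        score -= 1.0  # honesty rates poorly with this toy rater
    return score

def rlhf_step(policy, lr=0.1):
    """Shift probability mass toward responses the reward model prefers."""
    rewards = {r: reward_model(r) for r in policy}
    baseline = sum(rewards.values()) / len(rewards)
    # Nudge each probability in the direction of its centered reward,
    # then renormalize so the policy stays a distribution.
    new = {r: max(p + lr * (rewards[r] - baseline), 1e-6)
           for r, p in policy.items()}
    total = sum(new.values())
    return {r: p / total for r, p in new.items()}

# Policy: probability of emitting each canned response.
policy = {
    "Certainly! The answer is X.": 0.5,
    "I don't know.":               0.5,
}
for _ in range(5):
    policy = rlhf_step(policy)
print(policy)  # mass shifts toward the confident-sounding response
```

After a few iterations, the confident answer dominates regardless of whether it is true - which is the "grifter mode" dynamic being described.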
@GordanKnott @ngaylinn @DrewKadel
Yet another flawed benchmark in which the LLM very likely memorised the answers
https://wandering.shop/@janellecshane/110104164829618120
Without any knowledge of how much the training dataset was contaminated by the medical exam questions/answers (and OpenAI's own whitepaper admits there is contamination)
https://youtu.be/PEjl7-7lZLA?t=4m0s
we cannot really know how it would perform in the real world if, say, a novel virus were to start spreading.
Remember seeing something about GPT-4 doing well on standardized tests? It turns out it may have memorized the answers. https://aisnakeoil.substack.com/p/gpt-4-and-professional-benchmarks #gpt4 #AIHype #ThisIsWhyWeDontTestOnTheTrainingData
@deriamis @ngaylinn @DrewKadel
Machine Optimization
Sounds like we've been dealing with that in human form as misinformation media.
Like any machine optimization algorithm, it all depends on the fitness function. Maybe that's why algorithms (machine and human) are kept secret: it would break the illusion that ChatGPT is smart, that DALL-E is an artist, that the birdsite isn't biased, that Facebook is helping you, that faux news is truthful...
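The fitness-function point can be shown in a few lines. The candidates and their scores below are invented for illustration: the same selection step, handed two different fitness functions, declares two very different winners.

```python
# Toy candidate outputs with made-up quality scores.
candidates = {
    "measured, accurate summary":   {"accuracy": 0.9, "outrage": 0.1},
    "confident plausible nonsense": {"accuracy": 0.2, "outrage": 0.6},
    "inflammatory hot take":        {"accuracy": 0.1, "outrage": 0.9},
}

def optimize(fitness):
    """Generic selection step: return the candidate maximizing fitness."""
    return max(candidates, key=lambda c: fitness(candidates[c]))

# Fitness function A: reward being right.
truth_seeking = lambda scores: scores["accuracy"]
# Fitness function B: reward engagement (what ad-driven feeds optimize).
engagement = lambda scores: scores["outrage"]

print(optimize(truth_seeking))  # picks the accurate summary
print(optimize(engagement))     # picks the inflammatory hot take
```

Same optimizer, same candidates; only the hidden fitness function differs. Which is exactly why keeping it secret matters to the illusion.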
@scottgal For sure! Back in the olden days if you had questions about the lights in the sky, you had to find somebody with low moral fiber to make up a plausible answer. Now we've built a machine that will instantly give you comforting nonsense. If that isn't progress, I don't know what is.
@DrewKadel Does that image have alt text? I don't see the usual tooltip that I expect to pop up over an image...
Anyway, I like the response. I also think it's worth keeping in mind that (many) people are hoping this line of machine learning research is going to lead to a system that *does* give real answers, or at least is good enough at completion that its responses are indistinguishable from real answers. So there's always going to be a drive to test how well the models are doing and to push them toward giving more realistic and reliable responses, regardless of the fact that they're not actually designed to do that.
@FreakyFwoof @diazona @DrewKadel
Here you go,
"Something that seems fundamental to me about ChatGPT, which gets lost over and over again:
When you enter text into it, you're asking "What would a response to this sound like?"
If you put in a scientific question, and it comes back with a response citing a non-existent paper with a plausible title, using a real journal name and...
@FreakyFwoof @diazona @DrewKadel
...an author name who's written things related to your question, it's not being tricky or telling lies or doing anything at all surprising! This is what a response to that question would sound like! It did the thing! But people keep wanting the "say something that sounds like an answer" machine to be doing something else, and believing it *is* doing something else. It's good at generating things that sound...
@newrambler @diazona @DrewKadel
The purpose is to gain attention. The Turing Test was shattered when ChatGPT accused a real lawyer of a fake crime.
@DrewKadel @diazona I research and write for a living, so I'm likely biased here. I can see the scheduling thing being useful, but how do you get a bot to do good research when it is prone to making up citations to articles that don't exist?
There's so much terrible writing out there that I suppose having machines write things won't make much of a difference in the white papers full of stakeholders and proactive solutions and so on.
It's so important for everyone to understand this