Researchers claim GPT-4 passed the Turing test
The Study
The interrogators seem completely lost and clearly haven't talked with an NLP chatbot before.
That said, this gives me the feeling that eventually they could use it to run scams (or more effective robocalls).
It took them this long?
E: There are way too many people ITT that think a Turing test is hard to pass, and don’t seem to understand what it means for something to pass one. It’s such a low fucking bar, it might as well be meaningless.
Most renditions have a strict set of rules on how questions must be asked and what they can be about. Pretty sure the response times also have a fixed delay. Scientists ain’t stupid. The Turing test has been passed so many times that the news stopped covering it.
Yes, “scientists” aren’t stupid enough to fail. I’m sure it’s super easy to “pass” the “turing test” when you control the questions and time.
Turing tests aren’t done in real time exactly to counter that issue, so the only thing you could judge would be “no human would bother to write all that”.
However, the correct answer to seem human, and one which probably would have been prompted to the AI anyway, is “lol no.”
Turing tests aren’t done in real time exactly to counter that issue
To counter the issue of a completely easy and obvious fail?
it’s not a good test.
Of course you can’t use an old set of questions. It’s useless.
The Turing test is an abstract concept. The actual questions need to be adapted with every new technology. Maybe even with every execution of a test.
Turing test? LMAO.
I asked it simply to recommend me a supermarket in our next bigger city here.
It came up with a name and told me a few of its qualities. Easy, I thought. Then I found out that the name does not exist. It was all made up.
You could argue that humans lie, too. But only when they have a reason to lie.
That’s not what LLMs are for. That’s like hammering a screw and being irritated it didn’t twist in nicely.
The Turing test is designed to see if an AI can pass for human in a conversation.
Turing test is designed to see if an AI can pass for human in a conversation.
I’m pretty sure that I could ask a human that question in a normal conversation.
The idea of the test was to have a way of telling humans and computers apart. It is NOT meant for putting some kind of ‘certified’ badge on that computer, and …
That’s not what LLMs are for.
…and you can’t cry ‘foul’ if I decide to use a question for which your computer was not programmed :-)
In a normal conversation sure.
In the Turing tests you would be disqualified as a jury for asking that question.
Good science demands controlled areas and defined goals. Anyone can organize a homebrew Turing test, but there are also real, proper ones with fixed response times and lengths.
Some Turing tests may even have a human pick the best of 5 to provide to the jury. There are so many possible variations depending on test criteria.
you may be disqualified as a jury for asking that question.
You might want to read up again on the basics of the Turing test.
To clarify:
People seem to legit think the jury talks to the bot in real time and can ask about whatever they want.
It’s rather insulting to the scientists who put a lot of thought into organizing a controlled environment to properly test defined criteria.
It’s rather insulting to the scientists who put a lot of thought into organizing a controlled environment to properly test defined criteria.
lmao. These “scientists” are frauds. 500 people is not a legit sample size. 5 minutes is a pathetic amount of time. 54% is basically the same as guessing. And most importantly the “Turing Test” is not a scientific test that can be “passed” with one weak study.
Instead of bootlicking “scientists”, we should be harshly criticizing the overwhelming tide of bad science and pseudo-science.
The reporting is big clickbait, but that doesn’t mean there is nothing left to learn from the old Turing tests.
I don’t know what goal they had in mind. It could just as well be “testing how overhyped the Turing test is when manipulated tests are shared with the media”.
I sincerely doubt it, but I do give them the benefit of the doubt.
The participants judged GPT-4 to be human a shocking 54 percent of the time.
ELIZA, which was pre-programmed with responses and didn’t have an LLM to power it, was judged to be human just 22 percent of the time
Okay, 22% is ridiculously high for ELIZA. I feel like any half sober adult could clock it as a bot by the third response, if not immediately.
Try talking to the thing: web.njit.edu/~ronkowit/eliza.html
I refuse to believe that 22% didn’t misunderstand the task or something.
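For anyone wondering why a bot "pre-programmed with responses" falls apart so fast, here is a rough sketch of the keyword-plus-canned-reply idea. It is not Weizenbaum's actual ELIZA script; the rules, the `reply` function and the fallback line are all made up for illustration.

```python
import re

# Hypothetical rules for illustration only (not the original ELIZA script).
# Each rule pairs a keyword pattern with a canned reply template.
RULES = [
    (re.compile(r"\bI am (.+)", re.IGNORECASE), "Why do you say you are {0}?"),
    (re.compile(r"\bI feel (.+)", re.IGNORECASE), "How long have you felt {0}?"),
    (re.compile(r"\bno\b", re.IGNORECASE), "You are being a bit negative."),
]
FALLBACK = "Please tell me more."


def reply(message: str) -> str:
    """Return the canned reply of the first rule whose pattern matches."""
    for pattern, template in RULES:
        match = pattern.search(message)
        if match:
            # Echo the captured text back, minus trailing punctuation.
            captured = [group.rstrip(".!?") for group in match.groups()]
            return template.format(*captured)
    # Nothing matched: fall back to a generic prompt.
    return FALLBACK
```

With these toy rules, `reply("I am tired.")` echoes the user's own words back, and anything that matches no keyword gets the same generic fallback, which is usually what gives the game away within a message or two.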
Okay, 22% is ridiculously high for ELIZA. I feel like any half sober adult could clock it as a bot by the third response, if not immediately.
I did some stuff with Eliza back then. One time I set up an Eliza database full of insults and hooked it up to my AIM account.
It went so well, I had to apologize to a lot of people who thought I was drunk or had gone crazy.
Eliza wasn’t thaaaaat bad.
14% of people can’t do anything more complicated than deleting an email on a computer.
26% can’t use a computer at all.
www.nngroup.com/articles/computer-skill-levels/
So right off the bat, 40% probably don’t even know what a chatbot is.
The public versions of the AIs used in Turing tests usually have less computing power. The test itself is often also highly specific in what and how questions can be asked.
It’s hardly new, because models have passed the test before and, as a result, the test was made more difficult. It says nothing about intelligence and only about the ability to convincingly simulate a human conversation.
This is the same bot. There’s no way this passed the test.
Forget psychotherapy, it can barely understand anything:
> Hello, I am Eliza. I'll be your therapist today.
* Hello! How are you?
> Would you prefer if I were not ?
* Not what?
> Have you asked such questions before?
* No
> You are being a bit negative and that won't help us move forward.
* What are you smoking?
> Perhaps in your fantasies I am smoking?
In the 1970s, psychiatrists couldn't distinguish between PARRY, a chatbot simulating paranoia, and a human patient suffering from schizophrenia. So this 'bot convinces judges that it's a human' phenomenon is not new and tells us more about how humans think.
It was a 5 minute test. People probably spent 4 of those minutes typing their questions.
This is pure pseudo-science.
Yeah, it took me one message lol
So it’s good enough to fool people into thinking it’s a human?
Cool. Anyway…
Meanwhile, me:
(Begin)
[Prints error statement showing how I navigated to a dir, checked a file's permissions, ran whoami, triggered the error]
Chatgpt4: First, make sure you’ve navigated to the correct directory.
cd /path/to/file
Next, check the permissions of the file
ls -la
Finally, run the command
[exact command I ran to trigger the error]
Me: Stop telling me to do stuff that I have evidently done. My prompt included evidence of me having done all of that already. How do I handle this error?
(return (begin))
Each conversation lasted a total of five minutes. According to the paper, which was published in May, the participants judged GPT-4 to be human a shocking 54 percent of the time. Because of this, the researchers claim that the large language model has indeed passed the Turing test.
That’s no better than flipping a coin and we have no idea what the questions were. This is clickbait.
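A quick sanity check on the coin-flip comparison, as a rough sketch only. It assumes 500 independent verdicts on GPT-4, which is a guess: 500 is the participant count mentioned in this thread, not a confirmed number of GPT-4 conversations. The function name is made up, and it uses a plain normal approximation to the binomial.

```python
import math


def two_sided_p_vs_coin_flip(successes: int, trials: int) -> float:
    """Normal-approximation p-value for the null 'true rate is 50%'."""
    observed_rate = successes / trials
    # Standard error of a proportion under the null p = 0.5.
    standard_error = math.sqrt(0.25 / trials)
    z = (observed_rate - 0.5) / standard_error
    # Two-sided tail probability of a standard normal beyond |z|.
    return math.erfc(abs(z) / math.sqrt(2))


# ASSUMPTION: 500 verdicts, 54% of them "human". The real count may differ.
assumed_trials = 500
p_value = two_sided_p_vs_coin_flip(round(0.54 * assumed_trials), assumed_trials)
print(f"p-value vs. 50%: {p_value:.3f}")
```

Under that assumption the p-value lands a little above the usual 0.05 cutoff, so 54% would not be clearly distinguishable from guessing. The result moves a lot with the number of verdicts, so it only means something once the real count is plugged in.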