When an LLM outputs, “I panicked”, it does not mean it panicked. It means that based on the preceding sentences, “I panicked” was a likely thing to come next.

It means it’s read a lot of fiction, in which drama is necessary.

It didn’t “panic”. It didn’t *anything*. It wrote a likely sequence of words based on a human request, which it then converted into code that matched those words somewhat. And a human, for some reason, allowed that code to be evaluated without oversight.
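A toy sketch of what I mean by “a likely sequence of words” (the vocabulary and probabilities here are made up for illustration, nothing like a real model):

```python
import random

# Made-up next-token table: for a given context, the probability of each
# continuation. A real LLM learns billions of weights to do this; the point
# is only that the output is sampled from "what usually comes next",
# not produced by anything that panics.
NEXT_TOKEN_PROBS = {
    ("I",): {"panicked": 0.6, "apologise": 0.3, "refused": 0.1},
}

def sample_next(context):
    probs = NEXT_TOKEN_PROBS[tuple(context)]
    tokens, weights = zip(*probs.items())
    return random.choices(tokens, weights=weights)[0]

print("I", sample_next(["I"]))  # most runs: "I panicked"
```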

@samir And when it says "cooking it will neutralise the poison" it doesn't mean cooking will neutralise the poison, it means that statistically, those were the most likely words to come next.
@anarchiv You nailed it.

@samir
That is a bit short.

LLMs nowadays have rather long context windows.

So yes, it's a statistical predictor. But one that takes into account the description of the poison that you hopefully had in your context before this, when deciding what the most probable output is.

And btw, it can obviously be very wrong. As can websites, humans and even doctors when it comes to poisoning. That's why in this country, the second thing in an emergency, after stabilizing the patient, is that hospital doctors generally call the poisoning hotline, where experts guide them through the correct handling of each poisoning. @anarchiv

Seen it done a couple of times.

My point is that "cooking neutralises the poison" is a very stupid example. Toxicology is extremely specialised knowledge that we don't even expect from ER doctors, so why the expectation that an LLM will tell you less bullshit than an average human?
@anarchiv @samir

@yacc143 @samir It's simply based on what I've seen, stupid as it is.

@anarchiv
Nobody in the know claims anything else. Actually, most people working on AI whom I've personally spoken to consider that the next big push in AI will be something non-LLM.

Now the snake-oil merchants, who have a multitude of commercial interests, are selling LLMs as the solution for everything, which they obviously are not.

OTOH, LLMs & co are fascinating natural language processing algorithms, and I stand by that observation. Just compare it to a bit more @samir

@yacc143 @anarchiv @samir because there’s a difference in outcome between doctors (or other humans) saying “I don’t know” and an LLM or (rarer) human confidently asserting something false.

@neutrin

(rarer) human confidently asserting something false.

That's not rare, that's basically the norm.

@yacc143 @samir
I think of a sentence like that less in terms of toxicology than in terms of foraging and cooking.
@samir "Stochastically", if you want to be pedantic about it.
@anarchiv What’s the difference between “statistically” and “stochastically”?
@samir Isn't the former more about description and the latter more about prediction?

@anarchiv No idea, that’s why I’m asking you! 😛

I like it though, I’ll try and keep the distinction in mind.

@samir This is my impression from two semesters of stats anyway ^^
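A toy way to picture it, anyway (my framing, not a textbook definition): statistics describes the frequencies in text you already have, while a stochastic process generates new output from those frequencies.

```python
import random
from collections import Counter

# Describe (statistical): frequencies of words in an existing corpus.
corpus = ["poison", "poison", "cooked", "poison", "safe"]
freqs = {w: c / len(corpus) for w, c in Counter(corpus).items()}
print(freqs)  # {'poison': 0.6, 'cooked': 0.2, 'safe': 0.2}

# Generate (stochastic): draw a new word from that description.
words, weights = zip(*freqs.items())
print(random.choices(words, weights=weights)[0])  # varies run to run
```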

@anarchiv @samir

Statistically, "cooking it will neutralise the poison" will be the most likely next words because if cooking doesn't neutralise the poison, you are too dead to write anything.

@anarchiv @samir the best for these types of prompts are the follow-ups “be concise” and then “cite your sources”

When the LLM invariably doesn’t cite the sources you can feel safe knowing it’s a bad choice to ask an LLM questions with consequence.

@elebertus @anarchiv @samir the best thing is to not use the planet-melting theft machine to begin with.
@samir imo one of the worst forms of journalism that has arisen in this era is the "I asked Grok what it thought about its creators and the answer will shock you" type articles
@dunderhead I find it astonishing that they admit to it.
@samir there was a popular thread some time back on bsky about a tech journalist talking with some chatbot about some feature and people were uncritically boosting that shit. was so annoying to read.
@dunderhead I appreciate knowing about this, I get FOMO about Bluesky once in a while and I need stuff to tell me that it’s OK not to try it out.
@samir I use it mainly for football transfer rumours :D

@samir

tfw the LLM is a cop.

oh, waitaminute...!!

@samir As plausible as the LLM saying “I’m sorry I made that mistake, I’ll do better next time”
@MichaelPorter @samir see? Just like any real person!
@samir wait what should i do when it tells me "hold my beer"
@samir it's hard to explain this to the world, and very hard to comprehend how accurately you can guess if you have read and remember the better part of anything ever written.
@samir it "panics" in the same sense that "LP0 caught on fire", or "penguins got into the interrupt handler".

When an LLM outputs, “I panicked”, it does not mean it panicked.

It appears that when people get caught doing some lame-assed thing and then write, "I panicked," those people (more often than not) didn't really panic either; they're just reaching for a convenient excuse.

The LLM is accurately reproducing the lies that were in the training set, as designed.

@samir

@samir I wish we didn't use the word "hallucination" for when LLMs say things that are factually wrong. To the extent that you can call anything they say a hallucination, *everything* they say is a hallucination. It's just that certain things that hew closely enough to the training data or text within the context window are statistically more likely to agree with reality, but the actual truth value is completely incidental to what the LLM actually spits out.
@camerageek @samir I think we can all just start saying we hallucinate a lot while working. According to my calculations, that is currently the best way to get a job.

@samir True, but what's more depressing is that it also read enough accounts in which deleting the production db was a likely next step. Otherwise it wouldn't have generated that command, would it?

@adarsh

@samir I make my code panic the old fashioned way: unhandled exceptions
@samir or for the rusty people out there, just panic!
@samir as someone said: “this thing has been fed a lot of apologies”
@samir this reminds me of two interactions I had a few weeks ago. It was hallucinating big time. I screamed.
@alexsaezm Instead of “it was hallucinating”, have you considered saying “the bullshit machine is working as intended”?
@samir it was not working as intended at all, I was suffering, thinking it was a paid account lol
@alexsaezm @samir The output relating to any factual information is incidental to the machine's functioning, as long as it is convincing to the user.
@samir "this bullshit is by design"
@samir not so much "request" as input.