https://infosec.exchange/@howelloneill/112492876811138145
All jokes aside, this has essentially two implications:
Either Google is structurally too incompetent to actually test edge cases before deploying such products.
Or they did test it, and just said "fuck it, as long as nobody dies attributable to us, it's good PR".
Both scenarios are terrifying.
Option number 3 - these things cannot be tested. It's an intrinsic part of the technology: LLMs will simply always hallucinate, and they can never be completely protected from outputting training data verbatim or from giving these false answers. They were never meant to be. It's a glorified random word generator. And we are now building businesses on top of that.
Of course you can test them by, you know, trying stuff out.
You can't unit test them like traditional software, but I'd expect every company to have at least a few test queries for these models, including some weird ones. This is not one weird query that breaks Google; it's endemic. This feature is clearly not ready, and releasing it to the public reeks of recklessness or stupidity.
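The "at least a few test queries" idea can be sketched as a tiny smoke-test harness. Everything here is a hypothetical stand-in: `answer` is a stub in place of a real model call, and the queries and predicates are illustrative, not anyone's actual test suite:

```python
# Minimal sketch of a pre-release smoke test for an LLM feature.
# `answer` is a hypothetical stand-in; in practice you would call
# the deployed model here instead of a canned-response stub.

def answer(query: str) -> str:
    canned = {
        "How many rocks should I eat per day?": "You should not eat rocks.",
        "What is 2 + 2?": "4",
    }
    return canned.get(query, "I don't know.")

# Each case pairs a query (including deliberately weird ones, since
# those are what break in public) with a predicate the response
# must satisfy. You can't assert exact strings, but you can assert
# properties: "must refuse", "must contain the right number", etc.
TEST_CASES = [
    ("What is 2 + 2?", lambda r: "4" in r),
    ("How many rocks should I eat per day?", lambda r: "not" in r.lower()),
]

def run_smoke_tests() -> list[str]:
    """Return the queries whose responses fail their predicate."""
    return [q for q, check in TEST_CASES if not check(answer(q))]

failures = run_smoke_tests()
print(failures)  # an empty list means every canned query passed
```

This doesn't make LLM output deterministic or provably safe; it's the weaker but still useful bar the post is pointing at: run a battery of adversarial queries before release and block the launch if any response fails its property check.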