Mastodawn

Just had a bizarre experience with ChatGPT, Grok and Gemini. I was asking them to verify whether a page on my website had an og:image tag, and in all cases they repeatedly started making shit up.

When pressed, they admitted to lying about fetching the live page and just using inference... MULTIPLE TIMES.

Conversation: https://chatgpt.com/share/69d11947-6c34-832a-8a79-901efbd06088

UkeleleEric 3d ago

@dvk And why are you still using any of those?

Danny van Kooten

@UkeleleEric What do you mean exactly?

UkeleleEric 3d ago

@dvk Using these LLMs, aka automated 'complex word-probability-smushers'. Surely you knew that this is what was going to happen? And there are numerous other reasons why you should say #NOAI, but that's a start.

Danny van Kooten 3d ago

@UkeleleEric Believe it or not, I mostly get good value out of them. Not 10x, but definitely 10%.

Are they sometimes frustrating and is the tail risk of blindly trusting them very high? Absolutely.