This is one of the things that frustrates me about these LLM-based coding tools: too much wrongheaded certainty. I've been using these classes and their ancestors for going on 30 years now, and I sure as hell don't know off the top of my head. Saying "I don't know" is tons better than hallucinating an incorrect answer.
@paul Programming is both an art and a science. LLM-based tools are capable of neither. Considering that the most important part of code is communicating intent to a human, that's three strikes against the bots.
@paul unfortunately, people are often the same damn way
@paul This is what kills me with these tools. When they’re correct, they can be quite helpful. But they’re incorrect with enough frequency that I find it hard to trust any but the most basic of answers/output from it.
@paul Have you tried Opus, preferably with extended thinking? It is much more likely to (a) do the verification by itself, and (b) refuse if it cannot verify. Benchmarks suggest reasoning models are much better at avoiding hallucinations.
@EshuMarneedi Opus and Codex said the same thing, but I didn't turn on any special settings on either; I think both were set to the equivalent of Medium.
@paul that’s the problem with how the model is trained: it breeds out the concepts of uncertainty and not knowing. As a model, you might or might not get a cookie if you hallucinate, but you are guaranteed to get nothing if you refuse to answer.
@paul Sorry, but this is what Perplexity gave back to me.
@Kgault they all seem to get this one wrong, which I guess is fair because it's a bit confusing.
@paul Yeah, I’ve had to reword what I was asking a few times to get a correct result on some work I’ve done in the recent past using AI.

@paul Well, I’m a human and a developer (but in other languages) and I don’t even get the question 😅
But I’m not an LLM, so I can clearly say: "What is the question? What is the problem you’d like to solve?"

My experience so far with these LLMs: write more prose. Explain your task, describe the problem and what you have tried. Ask for a (better) solution or let it "analyze" it, etc. Short (trick) questions are not what LLMs are good at.

@nSonic it wasn't a trick question, it was a question based on what I was working on. The docs for the API are confusing about what should happen. I asked the AI assuming it'd write a couple of lines of code to see what happens, which is exactly what I'd do without AI.

@paul I was recently given info on and links to "the code in a GitHub repo" for an Ansible module.

It didn't exist. The link 404'd and the git history never contained the entry described.

When questioned, it told me my Internet connection must be to blame.