Whenever I wonder what cool application I would personally build with #GPT models, I keep coming back to the problem that there's zero guarantee of "worst-case" performance.

When I put on my software engineering hat, my first thought is always to ask "what could go wrong, and how?" With #LLMs, the answer is that things can go wrong in completely unpredictable ways; we just try to make that statistically unlikely.

1/3
#nlproc #gpt4 #chatgpt

Using #GPT to build a product is like programming a calculator that gives the right answer 90% of the time, fails in subtle and hard-to-notice ways in 8% of cases, and in the remaining 2% claims to be a potato farmer, insults the user, or deletes your hard drive.

Yet somehow people are okay with that, because when it's in the 90%, it's a really, really awesome calculator?

2/3

Of course that's not a fair analogy (none is) — #GPT models can do impressive things that no other software could do before. But the problem of worst-case behavior remains, and I personally am totally put off by it.

I love the potential of #AI models for creative purposes; I just don't see myself wanting to build any other kind of serious application with them at this point. And I'm surprised that so many people don't seem to care.

3/3

@mbollmann Just as an assistant, helping write pieces of code or suggesting them, it works quite well. That is how many people use it, I think?
@ErikJonker Oh, absolutely. But even there you have to stay alert and carefully check the suggestions, because they can be wrong in very subtle ways. I feel many people have too much blind faith in the output.