The biggest question for me about large language model interfaces - ChatGPT, the new Bing, Google's Bard - is this:

How long does it take for regular users (as opposed to experts, or people who just try them once or twice) to convince themselves that these tools frequently make things up?

And assuming they do figure this out, how does knowing it affect the way they use these tools?

@simon Given the argument I had over the weekend with a "regular user" who absolutely refused to accept my suggestion that the code ChatGPT was "helping" him with had problems, I don't hold out much hope.