I have a real-world example of the limitations of ChatGPT, specifically for code, or more precisely, in case the distinction matters here, shell scripting.
This is about my greeter for Fish, although most of it could've been done in exactly the same way in Bash or anything else. The important things here are that I wrote the whole thing myself with no help from any LLM, and that it consists of one-liners that are all quite long (by my standards, at least).
AFTER everything was already working correctly, at which point no LLM had ever seen it, I showed it to ChatGPT and asked it to abstract away whatever redundancy I hadn't been able to eliminate myself, if possible. I had already pulled out a function for every redundancy I could find, anything that would've made the whole thing harder to read if left inline. One of those functions was only called once, which means I wouldn't normally have extracted it, but I did in this case because of how long it was.
Well, it tried to do what I asked, but the new version it gave me didn't work at all, and it was missing a LOT of what was already there and needed to be.
I'd already done the hard part myself; it, apparently, couldn't hold enough of what I'd written in its context at once to be able to rewrite it.
I hadn't done anything that anyone else couldn't have; plenty of people have far more experience with this stuff than I do, and could've done it even better. So, frankly, I don't know that the ability of the average LLM will ever exceed that of the average human in this regard.
