@Platform_Journalism The results are problematic, but they are not just singular issues; they compound, as I have found in my experiments. At the base you have probabilistic token generation, with a temperature setting that forces randomness because it produces more human-like, believable strings of text.
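To make the temperature point concrete, here's a minimal sketch of temperature-scaled sampling (illustrative only; real systems do this over huge vocabularies with logits from a neural network, and the function name here is my own):

```python
import math
import random

def sample_with_temperature(logits, temperature=1.0):
    """Sample a token index from raw model scores (logits).

    Higher temperature flattens the distribution (more randomness);
    temperature near 0 approaches greedy decoding (always the top token).
    """
    # Scale logits by temperature, then apply a numerically stable softmax.
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Draw one index according to those probabilities.
    return random.choices(range(len(probs)), weights=probs, k=1)[0]
```

With a very low temperature the highest-scoring token wins almost every time; turn it up and the lower-scoring options start getting picked, which is where the "forced randomness" comes from.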
On top of this you have a guiding system prompt in most mainstream systems, written largely for business purposes, reinforcing engagement and helpfulness as the primary principles of what they deliver.
And further on top of this you have additional systems that may tweak both incoming and outgoing messages, and that control extra context injected into the conversation that you may not see. Together these pull the output away from the particular qualities professionals should typically be after.
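A hypothetical sketch of how those layers can stack up around a single reply (the function names and hooks here are my own illustration, not any vendor's actual pipeline):

```python
def respond(user_message, model, system_prompt, rewrite_in, rewrite_out, injected_context):
    """Illustrative layering of a hosted-LLM request (all names hypothetical)."""
    # 1. The vendor's system prompt frames everything the model sees.
    conversation = [("system", system_prompt)]
    # 2. Hidden context the user never sees can be injected here.
    conversation += [("system", note) for note in injected_context]
    # 3. The incoming message may itself be tweaked before the model sees it.
    conversation.append(("user", rewrite_in(user_message)))
    # 4. The model samples a reply from the assembled conversation...
    reply = model(conversation)
    # 5. ...and the outgoing text may be adjusted again before display.
    return rewrite_out(reply)
```

The point is just that what you type and what you read sit at opposite ends of a stack you don't control, and each layer nudges the result toward the vendor's goals.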
My view is that what counts as "professional" these days is more about feel than metric assessment, so LLMs are very good at creating an emotional interpretation that many are happy enough with.
I have a recommendation if you haven't tried it: give Claude a go. It still has all the same issues, but I find Anthropic at least tries to address some of these things better than ChatGPT or Gemini. They also at least try to publish better research demonstrating some of these concerns and what causes them.
Still far from perfect though.
As for what they can be used for: they are reasonable as a rough sounding board. When I need to solve something technical, like a function and its syntax in an area I'm not that familiar with, I will usually engage with one to get ideas of things I should look for. I'd say I get a benefit maybe 1 in 5 times, but it's sometimes quicker than pushing through Google. They're also better at summarising search results, which I can verify when I follow a link. So they have some uses.