I think people really underestimate how fragile LLMs are for auto-translation. You can feed one complete garbage, where none of the words are real words, and still get plausible-sounding "translations" out, just because the LLM sees the input as "close" to a real sentence. It then translates whatever it thinks that "close" sentence is, which is based, once again, on what seems "close".
The whole benchmarking approach really does not help with this, since benchmarks rarely test for failure cases. You need to test that garbage-in is recognized as garbage, otherwise you get garbage-out too.
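To make that concrete, here's a rough sketch of what a garbage-in test could look like. Everything in it is made up for illustration: `make_garbage`, `garbage_acceptance_rate` and the two toy "translators" are hypothetical stand-ins, and I'm assuming the system under test signals "this isn't a sentence" by returning `None`. A real harness would call the actual translation system instead.

```python
import random

def make_garbage(rng, n_words=6):
    """Build a fake 'sentence' from consonant-only tokens that are not real words."""
    letters = "qxzjkvw"
    return " ".join(
        "".join(rng.choice(letters) for _ in range(rng.randint(5, 9)))
        for _ in range(n_words)
    )

def garbage_acceptance_rate(translate, n_samples=50, seed=0):
    """Fraction of garbage inputs for which `translate` returned text instead of None."""
    rng = random.Random(seed)
    accepted = sum(
        translate(make_garbage(rng)) is not None for _ in range(n_samples)
    )
    return accepted / n_samples

# Two toy stand-ins for a real system (hypothetical, for illustration only).
KNOWN_WORDS = {"the", "cat", "sat"}

def overconfident_translate(text):
    # Always produces something, which is the failure mode described above.
    return "A plausible-sounding sentence."

def cautious_translate(text):
    # Refuses (returns None) when it recognises none of the words.
    if not any(word in KNOWN_WORDS for word in text.lower().split()):
        return None
    return "Le chat."

if __name__ == "__main__":
    print("overconfident:", garbage_acceptance_rate(overconfident_translate))
    print("cautious:", garbage_acceptance_rate(cautious_translate))
```

The number you'd want in a benchmark is that acceptance rate on known-garbage input, reported alongside the usual quality scores rather than instead of them.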
@endrift this is the problem with LLMs for transcription too. They do ok, but they MAKE SHIT UP. A more specific transformer-based model does a great job! But the chatbot is more convenient.
I'm disconcerted they're putting the LLM "I know what you mean" thing into Google Translate, the original showcase for transformers. I mean, it's obvious they think that's helpful. But still, urgh.
My favourite example actually involved you! A post containing pivot-to-ai.com was auto-translated (I can't remember the source language) and the translation replaced the domain name with 'pineapple.com'.
It wasn't allowed to touch the HTML, so this ended up with a link that showed 'pineapple.com' but went to 'pivot-to-ai.com'.
I strongly suspect that there are some neat ways of exploiting this to sneak malicious links past existing email filters.
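For what it's worth, the pineapple.com case is easy to check for mechanically once you know to look. Here's a minimal sketch using only the Python standard library: it flags any link whose visible text looks like a domain but whose `href` points somewhere else. The names `LinkCollector` and `mismatched_links` are mine, the "looks like a domain" regex is a crude heuristic, and this is not a claim about how any real mail filter works.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urlparse

# Crude heuristic: visible text that starts with something shaped like a hostname.
DOMAIN_RE = re.compile(r"^(?:https?://)?([a-z0-9-]+(?:\.[a-z0-9-]+)+)", re.I)

class LinkCollector(HTMLParser):
    """Collect (href, visible text) pairs for every <a href=...> element."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None

def _host(value):
    """Lowercase a hostname and drop a leading 'www.' before comparing."""
    value = value.lower()
    return value[4:] if value.startswith("www.") else value

def mismatched_links(html):
    """Return (visible text, href) pairs where the shown domain differs from the target."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    suspicious = []
    for href, text in parser.links:
        shown = DOMAIN_RE.match(text)
        target = urlparse(href).hostname
        if shown and target and _host(shown.group(1)) != _host(target):
            suspicious.append((text, href))
    return suspicious

if __name__ == "__main__":
    post = '<p>See <a href="https://pivot-to-ai.com/">pineapple.com</a></p>'
    print(mismatched_links(post))
```

Running the same check on a post before and after translation would also show whether the translation step introduced the mismatch.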