LLM translation models are going great
The word for finger isn't even in this sentence

I think people really underestimate how fragile LLMs are for auto-translation. You can feed one complete garbage, where none of the words are real words, and still get a plausible-sounding "translation" out, just because the LLM sees the input as "close" to a real sentence. It then translates what it thinks that "close" sentence is, based once again on what seems "close".

The whole benchmarking approach really doesn't help with this, since benchmarks rarely test for failures. You need to test that garbage in is recognized as garbage; otherwise you get garbage out too.
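
To make the "test for failures" idea concrete, here is a rough Python sketch of what a negative test could look like. Everything in it is made up for illustration: the toy `KNOWN_WORDS` list, the 0.5 threshold, the `UntranslatableInput` error, and the `translate` callable standing in for whatever model you use. A real check would need a proper dictionary or language-identification step, so treat this as one possible shape, not a recommended implementation.

```python
import re

# Hypothetical toy vocabulary; a real check would use a proper dictionary
# or a language-identification model instead.
KNOWN_WORDS = {"the", "cat", "sat", "on", "mat", "my", "finger", "hurts"}


class UntranslatableInput(ValueError):
    """Raised instead of guessing at some 'close' sentence."""


def known_word_ratio(text: str) -> float:
    """Fraction of words in `text` that appear in KNOWN_WORDS."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    return sum(w in KNOWN_WORDS for w in words) / len(words)


def guarded_translate(text, translate, min_ratio=0.5):
    """Call translate(text) only if enough of the input is recognized."""
    ratio = known_word_ratio(text)
    if ratio < min_ratio:
        # Enter an error state rather than returning a made-up translation.
        raise UntranslatableInput(f"only {ratio:.0%} of words recognized")
    return translate(text)


def rejects_garbage(translate) -> bool:
    """Negative test: garbage in must produce an error, not a translation."""
    try:
        guarded_translate("blorf snagle quimbix frotz", translate)
    except UntranslatableInput:
        return True
    return False
```

The point is only that a benchmark suite can include cases like `rejects_garbage`, where the passing result is a refusal and any fluent output counts as a failure.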

@endrift garbage in -> garbage out
@TheOneDoc @endrift except even a basic 3€ calculator will recognize when you type in something invalid and go into an error state instead of just tossing a random number at you. Software just making up some garbage that isn't even meaningfully related to the garbage you entered is a very recent invention, and we should not just accept it.
@ratsnakegames @endrift yes I want my garbage generator to be at least deterministic.