Robin Syl 🌸

May 27

@zzt Machine translation has never been good enough for prod. It was only for personal use. I use websites in English because the Chinese translations are usually awkward and baffling. What blows my mind is that LLM translations are worse, but now companies are bragging about using them. They didn't brag about using machine translation because that was embarrassing.

Most "freely" available machine translation engines have never been good enough, but purpose-built language pair engines (like say DeepL's "classic" backend, RWS, Lionbridge) have been "good enough" for years (well before this current LLM craze).

Yes, I wouldn't trust ChatGPT to translate one of my technical documents into Chinese; that would be a bunch of gibberish. But the good engines are the ones you still pay comparable-to-humans $/word for.

And we DO check, IF we have the luxury of $ and time for a human proofreader.
If we ARE lucky enough to get downstream review in the target language, we sometimes do a blind test: translate the same thing with the machine and a human. Frequently the two are at a similar level of "not perfect, but good enough; technically correct, conveys the right information." They just make different mistakes.
2/

The machine tends to always fall back on a literal translation, so it fumbles on colloquialisms and commonly understood phrases. Also, if we're using dictionaries or translation glossaries to overrule default literal translations, it tends to mangle the surrounding grammar, like tenses or genders. A human wouldn't do that, but in particularly technical content we have to explain to the humans what a whidgaflam is.
3/

Third spruce tree on the left

And we may not always get the same human next time, so we have to explain _again_ (this too can be assuaged by glossaries, but those come with their own problems).

But yeah, fundamentally, whether you use cheap LLMs, GOOD LLMs or humans, if you're not proofreading with a target language specialist in your field you're gonna get questionable results no matter what.

Sometimes we don't care - if a customer's country has language laws saying you MUST have a manual in French, but everyone knows the operators all know English, maybe we'll skip the proofreaders because no one is gonna ever read that manual. But manual_fr-FR.pdf checks a box.
IF we're trying to land our first customer in Japanese and it's a big system sale, you bet your ass we're having a proofreader review all 400 pages.
5/