I’m glad people are finally noticing LLM translators will just make plausible sentences the fuck up when they’re fed anything but a perfect source to translate, which makes them exhausting and damaging for language learning and a variety of other situations where you’re expected to actually, you know, use the fucking language for anything but a highly inaccurate skim

thank fuck all the language learning companies didn’t jump onto the LLM train, right?

…right?

this post brought to you by the language learning app I’m using semi-consistently telling me that names, formatting artifacts, and words in languages other than the one being learned, all have fabricated definitions related to the sentence they appear next to

this is a very hard error to catch if you aren’t paying attention and it’s actively slowing me down

“LLMs might be horrible at almost everything but at least they’re good at translating and accessibility”

are they though? or are they fucking terrible and you didn’t notice because the output looked good and you didn’t check it?

@zzt LLMs: good at everything that you are not equipped to verify.
@ainmosni @zzt ai is great at things that are very cheap to validate, because it learns to get the person that gets 3¢ per validation to click the green button
@kabel42 @zzt sadly, while the validation is cheap, the generation isn't, but that's not their problem.
@ainmosni @zzt Is it? I can't validate a translation is good that fast, I can't even decide which squares are bicycles and not mopeds
@kabel42 @zzt I meant in cases where validation is cheap, like the insane tight loop in Claude Code that validates whether Claude generated valid JSON with an open JSON Schema and then lets Claude try again if it doesn't validate.
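
For anyone who hasn't seen that kind of loop, here is a minimal sketch of the general idea. It is not how Claude Code actually does it: `SCHEMA`, `is_valid`, `generate_until_valid` and `fake_model` are all made-up names, the "schema" is a toy stand-in for a real JSON Schema, and the generator is a hard-coded fake rather than an LLM call.

```python
import json
from typing import Callable, Optional

# Hypothetical toy "schema": required keys mapped to the Python type expected.
# A real setup would use an actual JSON Schema validator instead.
SCHEMA = {"word": str, "definition": str}


def is_valid(raw: str, schema: dict) -> bool:
    """Return True if raw parses as a JSON object with the required typed keys."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return False
    if not isinstance(data, dict):
        return False
    return all(isinstance(data.get(key), typ) for key, typ in schema.items())


def generate_until_valid(
    generate: Callable[[int], str], schema: dict, max_tries: int = 3
) -> Optional[dict]:
    """Call the generator until its output validates, or give up after max_tries."""
    for attempt in range(max_tries):
        raw = generate(attempt)
        if is_valid(raw, schema):
            return json.loads(raw)
    return None


def fake_model(attempt: int) -> str:
    """Stand-in for a model: truncated JSON first, then well-formed JSON."""
    outputs = [
        '{"word": "hund"',  # cut off, fails to parse
        '{"word": "hund", "definition": "a small boat"}',  # parses, but the definition is wrong
    ]
    return outputs[min(attempt, len(outputs) - 1)]


result = generate_until_valid(fake_model, SCHEMA)
print(result)
```

The second attempt passes, even though the definition is fabricated. The check is cheap precisely because it only looks at the shape of the output, not at whether it is correct.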

@kabel42 @ainmosni @zzt
> [ai] learns to get the person that gets 3¢ per validation to click the green button

in this setup, ai learns to produce plausible text rather than correct

@zzt LLMs are always really good at whatever type of output you don't know how to validate.

If you don't know how to code, it is great at code.

If you don't know a language, it is great at translating it.

If you don't know the law, it is a great lawyer.

Every fucking time... Yet somehow, some people aren't seeing the fucking pattern.

@minego @zzt Exception - I know how to bullshit managers, it's great at bullshitting managers

@etchedpixels
@zzt It's great at bullshitting managers who don't understand the work their employees do.

My manager used to be a coder, and he gets it. He sees how bad it is, and listens. Sadly his manager doesn't.

@etchedpixels @minego @zzt

Ah, is that an exception, or just an extension of the principle?

LLMs are great at whatever output you don't know how to validate.

Managers and C-suites generally don't know anything about anything.

Therefore, they consider LLMs universally effective and highly desirable.

@TheEntity @etchedpixels @minego @zzt Why don't we just replace all the managers and executives with LLMs then?
@LordCaramac @TheEntity @minego @zzt Certainly for the management consultants I'm not sure anyone would notice
@etchedpixels @minego @zzt Which isn't that hard, just phrase it in that weird corpobollocks pidgin language they speak and they'll be convinced.

@nini @minego @zzt 'embracing the latest upward debt leveraged technofuckery'

Or you can use AI against itself, as Kagi translate will happily let you set the other language to anything, including LinkedIn, which is a hoot, or even Perl

@minego @zzt I stopped reading The Economist when I noticed a similar pattern.
@RogerBW @minego @zzt
rare to see Gell-Mann amnesia outside of journalism, but there we are
@minego @zzt The solution to that problem is already being implemented: fire the people who know how to validate.
@minego @zzt hence the reason people whose only job is to tell other people to do things think it's good at everything
@minego @zzt I've spent my whole life having that same feeling about journalism. Everything seems true until you master the field the article is about, and then everything seems wrong. Misunderstood, badly explained and biased. So I'm not going to notice that much of a difference. I'll keep believing almost nothing of what I'm told and applying seven layers of skepticism.

@minego @zzt I think it's called Gell-Mann Amnesia.

It's the same thing with the warehouse robot, by the way. If I worked at the speed that robot (or its human) worked, I'd be fired. If I worked in conditions a tenth as simple, uniform and reproducible, I'd buy myself a lottery ticket.

@minego @zzt

if you are not into hallucinogens, it is great at hallucinating.

@zzt I translated something to English yesterday, and the translator made up an “English” word

@scm @zzt That's...technically...in line with how english-speaking humans handle english.

If we aren't under pressure to be 'generative' we often just yoink the other language's word and call it good.

I suspect that the bot is not driven by a commitment to idiomatic perfection, however.

@fuzzyfuzzyfungus @zzt it wasn’t a word in any language, as far as I know, I think it was a mangled English word

@scm @zzt

You resist its work to embiggen your vocabulary with a perfectly cromulent word!

😉

@zzt So true. Also related, I reckon, is how they groom these same people – those who haven't tried doing the actual task – into thinking the task to be easy / below them.

Like, people who don't bake or dance, watching Bake Off or Strictly, seeing people make mistakes and judging them super harshly for it.

Keeping the users from checking the output and trying to do the task themselves, is intrinsic to the con, imo.

@zzt A week or two ago I had one of those funny coincidence moments where I saw a post like "It's sad that AI companies are doing this because machine translation is one of the only things LLMs are good for," followed about an hour later by "I had to retranslate this entire passage because the machine translation was nonsense."

It's almost like LLMs aren't good at anything

@awmwrites @zzt Come now, let's be reasonable. They're good at generating intelligible strings of text in many different languages. Otherwise, we wouldn't be in this situation.
@zzt I have complained about machine translation for about a decade by now, usually what I meet is "What am I supposed to do? Learn the language?" yeah, just like I learned english to be able to communicate with people who don't bother.
@sotolf right! and for my purposes (learning the damn language with my human brain) the LLM translations are worse for me, because it’s very obvious when an old school translator fucks up and I can sometimes figure out how it got there and retranslate by hand, but the LLM will just extrude its fuckups into unrelated text
@zzt Yeah, I can imagine that's annoying. I speak 3 languages daily, and sometimes I just need a nudge for a word, so I pop a sentence into one of the online translators, and even when I use rather clear example sentences the result is at times really weird.
@zzt The worst thing though is things like applications or games where the developer thinks that machine translation is "good enough", so they just pushed it through without caring at all. I have had to reset the language of my phone from norwegian to english, and set every other program to english, because the text bits in programs are so bad I can't even puzzle out what they were supposed to mean. It's annoying.
@sotolf @zzt sotolf, this was a huge thing in the early web in France when people would ask their 16 year old niece who knew a little English to translate the site. That would be obvious on the home page that read, "Welcome in our web site". AI is like your 16 year old niece who knows a little English!
@randulo @zzt Just that your 16 year old niece is less likely to objectively lie to you, and if they don't understand something completely you can look it up together :)
@sotolf @zzt But similarly, unless she's paid and taken seriously, she'll do the best she can and not mention if she's not sure. AI would be cool if answers began, "I don't know for sure, but I think..."

@randulo @zzt

No, I don't think AI would be cool then. There are still so many issues that aren't helped by the averaging-out-o-meter. I'd much rather read something where I can see that someone tried: broken norwegian that I can puzzle out because I see what the person tried to do. I'm okay with puzzling it out, and sometimes helping someone get better if they want to. But someone just putting something into the "good enough for me" box and leaving me hanging doesn't deserve any cookies for trying. I'd much rather have to puzzle out the meaning of a french sentence (and believe me, I utterly suck at french) than have to read it machine translated. At least if there is something wrong in how I did it, it's a learning moment and I can get better at something, even though it's not something I was really prepared to get better at that day.

Of course the anglos aren't really going to care, because they are never going to have to use the result, don't have to suffer from not having the nuances there, and aren't going to notice the context that goes missing. There are whole conversations I have had that I've put through a translator for fun, and 70% of what comes out makes no sense any more. That's not something that is going to be solved by "I'm not sure, but". Almost all LLM chatbots have the "Always check the result, never use the result without verifying" thing under them, and we all know how much attention people pay to that.

@sotolf @zzt
Even though I have lived here for over four decades and have a pretty high level of the language, I still need a boost from time to time. Nowadays, I just type what I want in the address bar and whatever AI the search is using will provide the answer. Actually I sometimes need to do it in English too, as I sometimes forget correct spellings. I am not ok with shoving AI everywhere and in everything, but I find some of it useful in day to day trivial tasks that don't reveal much.

@randulo @zzt

Sorry, but I'm unable to parse this sentence:

I am not ok with shoving AI everywhere and in everything, but I find some of it useful in day to day trivial tasks that don't reveal much.

@sotolf @zzt Is your niece available?

I'm saying I don't hate AI itself, just the fact that we are fed it very much like geese are fed to make foie gras. I think the result is similar, too.

And that I use it for some simple tasks that do not reveal bodily illness or existential issues.

@randulo @zzt

Sorry, she's only 2 and barely speaks norwegian :p

So you have no issue with the plagiarism, the excessive resource use, piracy or anything? I don't want even less effort spent on things that actually matter to me, like my native tongue, which is spoken by probably fewer people than live in Paris, and it's still one of the less endangered languages around. Personally I don't think outsourcing thinking was ever a good idea; so much context and unsaid knowledge that people "just know" is getting lost for something that is not really worth consuming in any way.

@sotolf @zzt Without entering into a long back and forth, I don't think asking how to say a word in French uses a whole city's water, replaces anyone's job, or plagiarizes.

As we were discussing translation, I was thinking about it today. We worked with specialized translators in our business a few years ago. Translators mostly will NOT be replaced by AI, as it will never be good enough for serious translation, i.e., perfectly conveying human thoughts.

@randulo @zzt

If you're not up to a back and forth, I will give you a tip about something I've used for so long, that's really good and quick and costs next to no resources: it's called "a dictionary" and it's magical. It has so many cool words in there, often with an etymology and pronunciation guide and everything :)

@sotolf @zzt Yes, I do use the online versions. Very handy and nice of them to make them available free. In fact, typing in the address bar usually brings one up. The changes announced in Google search mean the answer will just pop up instead of those links. It has been said that those results are licensed to the rights holders, which is an interesting development.
@sotolf @randulo @zzt depends on the language tho. for chinese, for example, i wouldn’t know how to look a word up unless i already know how it’s written

@xarvos @randulo @zzt

For my japanese learning at the university, I used hiragana, stroke-count, or hand writing recognition to find stuff, I also had a really cool electronic dictionary that was driven by two AAA batteries, I would guess that would be similar in chinese, just that they use pinyin and stuff for lookup.

@sotolf @randulo @zzt you learned japanese so i guess you also know there are a bunch of characters that sound the same but mean entirely different things. and you can’t rule them out from the context unless you already know the language. so, it’s useful for people who already know the language looking up a few new words, but not if you’re a beginner and need to look up almost every word

@xarvos @randulo @zzt Yeah, no, but our teachers were really clever and taught us some nice tricks, like how to ask for a simpler explanation if we don't understand or something like it, if it's spoken you can always ask the person to write down the kanji/hanzi, and then it's usually easier to guess, or look up, in a conversation I would mostly not bother with a dictionary, and guess from context unless it was a load bearing word for the sentence, as with so much of language learning, it's daring to show that you don't understand, daring to be silly or stupid, and just asking helps so much. Even now, having spoken german almost daily for a decade now there are times that I ask a coworker "how do you say (simple stupid explanation of something)" and mostly people are very helpful if they see you make an effort.

If it's written down it's usually quite quick to look up, and if it is something I want to say, and there are homonyms that I can't really keep apart I will use a 2 language dictionary to make sure it's the one I want. Sure it's more work, but at least personally when I put some work in it it sticks in my mind a lot better than if I just get it handed to me.

@sotolf @randulo @zzt
Another good one is Wikipedia; you can look up a term in one language, then it has links to the corresponding pages in other languages, where the term is used in a sentence (many sentences) along with related words

Especially handy for specialist terms or realia

Also good for showing people, because it'll often have a picture along with the word (and an explanation, if needed)

@sabik @randulo @zzt Yeah, it's really awesome for that too, I forgot, but I do that rather often :) It's a really good tip :D

@randulo @sotolf > Without entering into a long back and forth, I don't think asking how to say a word in French uses a whole city's water, replaces anyone's job, or plagiarizes.

and you’d be fucking wrong, it’s the same fucking LLM

“without entering into a long back and forth” too late for my fucking notifications I guess

@sotolf I miss my last LWT app, which defaulted to using classical bilingual dictionaries for definitions. that isn’t perfect, but it fails in predictable ways and gives me the possibility to look up the original source for more context from the original linguists if available.

my current app just turned a couple of bilingual dictionaries and an LLM into a slurry of definitions and doesn’t tell you its source for anything. the top result always gets a ✨ regardless of its origin of course

@zzt Yeah, I'm old enough that most of my language learning was with paper dictionaries and tons of hand writing :p Text books and a classroom setting are still the best I think, but that's not really likely to be achievable if you just want to learn a language as a hobby.

My learning german was going way faster working in it every day, and not having any way out of it if I want to eat :p but that also is not something that I would suggest anyone do :p

@sotolf @zzt All your base are belong to us.
@zzt Machine translation has never been good enough for prod. It was only for personal use. I use websites in English because the Chinese translations are usually awkward and baffling. What blows my mind is that LLM translations are worse, but now companies are bragging about using it. They didn't brag about using machine translation because that was embarrassing.

@robinsyl @zzt

Most "freely" available machine translation engines have never been good enough, but purpose-built language pair engines (like say DeepL's "classic" backend, RWS, Lionbridge) have been "good enough" for years (well before this current LLM craze).

Yes, I wouldn't trust ChatGPT to translate one of my technical documents into Chinese, that would be a bunch of gibberish. But the good engines are the ones you still pay comparable-to-humans $/word for.

1/

@robinsyl @zzt

And we DO check, IF we have the luxury of $ and time for a human proofreader.
If we ARE lucky enough to get downstream review in the target language we sometimes do a blind test; translate the same thing with the machine and a human; frequently the two are similar level of "not perfect, but good enough; technically correct, conveys the right information." They just make different mistakes.
2/