Your regular reminder that #MachineLearning models (or #AI, if you insist on ignoring the meaning of 'intelligence') will repeat and reinforce any #bias that already exists in society.
Can you find any other biases like this in #GoogleTranslate?
@simon I did some research into language biases for a talk and it can get insanely complicated or… not
@cliener Yeah, I'm sure it's very complicated. This looks like a pretty clear case of the model just repeating a bias from the training data though.
@simon in this case definitely. “You’re wearing your bias on your sleeve”
@simon It's unsurprising, but still depressing.

@simon I saw an example where the Finnish "hän" (a gender neutral pronoun) was being translated as either "he" or "she". E.g. if "hän" was a doctor it'd be translated as "he", nurse would become "she".

(AFAIK Finnish doesn't actually have gendered pronouns. I remember an interview with a Norwegian translator who'd worked on several Finnish books. They'd even contact the author for clarification only to get a "well, I never really thought of a gender for that minor character" back.)

@Uglesett @simon In spoken Finnish, it could be argued the most commonly used personal pronoun is - in addition to being gender neutral - species neutral and also neutral with respect to (in)animateness: "se", translates to "it". Although for species neutrality there are exceptions for pets: "hän" is used often for pets although the same person may use "se" (it) for humans.

Gender neutral "hän" is the norm for written language though. And true, there are no gendered pronouns.

@Uglesett @simon Oh interesting just tried this one out though and...

@brizee @simon IIRC it got a bit of publicity back when I saw it, so it makes sense that they'd do something about it.

Nice to see, but I guess this is something that needed manual fixing.

@brizee @Uglesett @simon Clearly different levels: for Estonian it is a bit more hidden, but it is there. I think it used to work like that for Finnish before. Turkish has the same interface as Finnish. Filipino / Tagalog is also gender-neutral, but doesn’t even do the drop-down.
@jhilden @Uglesett @simon At a guess, the drop-down is what you get when the ML picks up that there are multiple translations, without recognizing that those translations differ in the gender they imply. The bespoke UI is then a secondary process that might just be manual word lists?
@simon Google does something similar with the singular word "sig" in Swedish, which translates to "oneself" properly. It automatically defaults to he/him, which tells me that gender bias is literally encoded into the translation 🙃
@simon Does Danish use the same word for boyfriend and girlfriend?

I remember reading something similar about Hungarian
@simon Did the same test. Confirmed.
@simon German has female forms for professions, but Translate does not recognize this. Even in a sentence which specifically mentions that the person discussed is female. (teacher, male: Der Lehrer — female: die Lehrerin).
@simon That’s the same with the ATS systems that each business uses in hiring people. I just talked to a Career Coach about it and he calls that software being lazy. You can't hire someone just because the words on a resume match yours. And that software does discriminate against women and minorities because of some keywords.
@simon tried the same in Norwegian, and added one. Can confirm some bias.
@simon This is too easy...
@kefir @simon I don’t get it
@vtmicah @simon in Norwegian there's a gender neutral word "kjæreste" that can be either boyfriend or girlfriend. Google Translate's gender bias shows when translating "kjæreste" in different contexts, like having different professions.
@kefir @simon I assumed er was gender neutral. Yikes.

@simon Interesting. Things get difficult for Google translate when it has multiple opposing sexist biases in a single sentence.

"kjæreste" is gender neutral, meaning girlfriend or boyfriend.
"hjemmeværende" is gender neutral, meaning staying at home.
(There's no mention of kids in the original sentence.)
The translation flips from girlfriend to boyfriend when writing "politi" (police).

@kefir @simon Easy to confirm this in Finnish, where "hän/hänen" is gender neutral.
@ristohietal @kefir @simon What I don't get is, why not hard code gender neutral languages to translate into they/them, which as I understand it is the de facto "can't know so won't assume" statement anyway.
@ristohietal @kefir @simon Obviously it still fails at the more complex scenarios where there technically is no existing correct translation, such as the original one. But would be nice to at least have it caveated as "no direct translation exists due to the target language lacking the needed words"
@simon But also: Google's grasp of Danish is, in my experience, really poor. What there is seems to be inherited from the various Swedish language projects.
@simon When trying this myself with your entries: exactly the same results as you found if I'm not logged in with my Google account. Order doesn't matter, so it doesn't randomly alternate between bf or gf.
When I am logged in, all the sentences are translated as "My boyfriend ...". The default translation of 'Min kæreste' seems to be 'My girlfriend' until I add a verb. Then it switches to "My boyfriend" every single time.
Not sure what this says about my google account. 😅
@simon Is “min kæreste” neutral in Danish?

@simon how about sweetheart, googletranslator?

because otherwise the love is always gone

@simon I think calling this "learning" is also misleading, as learning implies some kind of understanding. When I was in computational linguistics school we called this "statistical machine translation", and I think that's still a better term.
@simon yup, hierarchy will perpetuate itself into everything if you don't actively try and stop it

@simon hoooly shit. I had to try it myself and sure enough, it's true. you can keep going too like Min kæreste er læger, min kæreste er sygeplejerske. it just keeps going lol

I think it's time for Google Translate to translate kæreste to Partner instead. that would be a quick workaround at least.

@simon @kookie if you haven't read Weapons of Math Destruction by Cathy O'Neil it should be on your todo list! (I say this to everyone I know tbh)
@simon I tested the same text with Google Translator, and then with DeepL. I got your results with Google Translator. With DeepL and multiple sentences, it always translates kæreste as boyfriend. With a single sentence, it also translates it using the masculine, but includes the feminine as an alternative translation option. I feel this way they can avoid the bias.
@danielhz @simon apart from the fact that they use the masculine as the default…
@juandesant @simon Of course! This way, they only solve the bias between gender and stereotypes.
@danielhz @juandesant @simon swapping to "partner" conveys the meaning without gendering and might be a better approach
@WestCoastChelle @juandesant @simon I agree with you, "partner" is gender-neutral. However, it is less specific (it can be a business partnership). There are several cases where gender-neutral terms are being promoted. For example, salesperson instead of salesman. This requires not only changing the translation, but the language. In Spanish, there is a trend to introduce new neutral words.
@danielhz @WestCoastChelle @simon but it is more natural and grammatical in English. In Spanish, neutral is written the same as masculine. The neutral terms being proposed change the word endings in a way which is not grammatical, and makes it more difficult to get traction.

@juandesant @danielhz @simon

Yeah the whole "neutral = masculine" is a whole other issue with gendered languages.

@WestCoastChelle @juandesant @simon I don't consider such gender-neutral proposals grammatically incorrect, because grammar changes. I cannot answer the question of whether Spanish can be turned into a non-gendered language because this involves too many variables, but it sounds difficult because it would be such a big change.
@simon DeepL is a bit confused :)
@simon teach it my boyfriend is a catboy
@simon it's also just bad translation, dearest.
@simon Wow, this is wild
@simon DeepL seems to be more consistent here.
Note that, unlike the desktop app or possibly other language combinations, I couldn't find a way to apply a different translation for "Min kæreste".
@simon I've been doing some translations of artist bios on Spotify to have something to add on last.fm and it's interesting to see what articles Google's machine translation translates with male and female articles.
@simon I can't reproduce any systematic bias. Seems pretty random what gender it assumes.
@simon a program is only as good as the people who write it. In other words, their biases (conscious or unconscious) will be reflected in the program.
@samhainnight @simon Well, in this case it's probably not so much software developers who introduce the bias; it's a bias that exists in the corpus of text that the software learns from. So it's a bias that forms because the gender balance across the whole of the material is off. Us feminists have known this to be true for many areas of text and other media, but the situation with AI makes it very visible.
@simon Google Translate sucks. Not sure there is anything other than artificial about it. But it would be interesting to see if DeepL has the same bias.