Mastodawn

mcc Feb 28

Wait hold on I just realized. Is

八人入

A reasonable Chinese sentence

mcc Feb 28

…Also waaaa why did the character rendering change so much when I copied from Pleco to Tusky. Who gave eight a hat

mcc Feb 28

In Pleco they look like this. I don't know if this is a different but regular hanzi font or if the CJK unification is messing me up somehow

EDIT: I currently think Tusky is showing me Japanese character variants https://social.mildlyfunctional.gay/@artemist/116146010272716935

mcc Feb 28

This is what Tusky looks like.

mcc Feb 28

WAIT WTF this is an actual Chinese IME and it seems to be showing me Japanese characters. Ok I think Lenovo is fucking with me, one minute

mcc Feb 28

Okay I now believe the problem is neither Tusky nor Lenovo but rather that Android is not a serious product and never has been. It seems Android may outright refuse to show scripts unless you've whitelisted the language. Problem: I think this menu is asking me which version of Chinese I want but the menu is in Chinese. I want to look at Chinese text so I can learn Chinese. I don't know it yet. I feel like I'm playing an adventure game.

* I may explore a PR later anyway.

mcc Feb 28

Actually I'm pretty sure 简 already means simplified, so I selected simplified at the top level, and this second menu is asking… I don't know. Locale? TTS dialect?!

mcc Feb 28

Update: I solved the problem, not by adding Chinese as an alternate language for my Android, but by deleting Japanese as an alternate language. Not sure when I added Japanese in the first place or what I was trying to accomplish but I question Google's decision that informing it I may look at text in Japanese makes it conclude I DEFINITELY won't be looking at Chinese!

abadidea Feb 28

@mcc unfortunately there’s not really a good solution to this problem and Android, like everyone else, just has to pick a resolution method and stick with it. If you’ve heard of “Han Unification,” well it sounds like something that happened violently in 2200 BC but actually it happened quite recently in a Unicode meeting room and it causes this exact specific intractable issue
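
A minimal Python sketch of what Han unification means at the code point level. It assumes the three characters from the first post are the ordinary unified ideographs U+516B, U+4EBA and U+5165 (written as escapes here so the example does not depend on how this page was copied). Nothing in the string records whether it was typed as Chinese or Japanese, so the renderer has to guess from something else, such as the system language list.

```python
import unicodedata

# Assumed code points for the three characters in the first post.
text = "\u516b\u4eba\u5165"

# Each character is one "unified" code point shared by Chinese and Japanese.
code_points = [f"U+{ord(ch):04X}" for ch in text]
names = [unicodedata.name(ch) for ch in text]

for cp, name in zip(code_points, names):
    print(cp, name)

# The string carries no language tag: it is only these three code points.
print(len(text))
```

Which glyph shape appears for each code point is decided by the font the system picks, not by the text itself.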

groxx Feb 28

@0xabad1dea @mcc I suppose the only actually reliable approach would be to store the IME locale per character or something so that it can be accurately rendered as it was written... or are these truly identical graphemes, and there's no chance of confusion in context? Even when people use multiple languages simultaneously?

(late edit after reading a lot more: ah, I see they DID just add a variant-selector character to effectively specify the locale... that seems a bit unlikely to gain major use, but technically I like it I guess)

Maybe one day we'll have UTF-8-2 and it'll just be infinitely extendable, rather than using a limited length prefix.
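
On the "limited length prefix" point, a rough Python sketch of how the first byte of a UTF-8 sequence announces the sequence length. The helper name `utf8_length_from_lead` is made up for illustration, and it does not validate input (it ignores overlong forms and lead bytes that UTF-8 forbids). UTF-8 as standardized today stops at four bytes, which covers code points up to U+10FFFF.

```python
def utf8_length_from_lead(byte: int) -> int:
    """Return the sequence length implied by a UTF-8 lead byte.

    Hypothetical helper for illustration only: no validation beyond
    recognizing the four lead-byte bit patterns.
    """
    if byte < 0x80:            # 0xxxxxxx: ASCII, one byte
        return 1
    if byte >> 5 == 0b110:     # 110xxxxx: two-byte sequence
        return 2
    if byte >> 4 == 0b1110:    # 1110xxxx: three-byte sequence
        return 3
    if byte >> 3 == 0b11110:   # 11110xxx: four-byte sequence
        return 4
    raise ValueError("continuation byte or unsupported lead byte")


# One sample character per sequence length.
samples = ["A", "\u00e9", "\u516b", "\U000E0100"]
lengths = [utf8_length_from_lead(ch.encode("utf-8")[0]) for ch in samples]
print(lengths)
```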

mcc Feb 28

@groxx @0xabad1dea There are various existing solutions but just because the solutions exist does not mean people follow them correctly

groxx

@mcc @0xabad1dea definitely agreed. even technically, it seems very unlikely to me that any IME is going to choose to, like, add variant selectors *to every single character* and confuse their users when it's blended with other text or in a size-limited scenario. those characters already take up a ton of space, making it worse won't go over well.
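
To put a number on the space concern, a small Python sketch that appends a variation selector to every character. It uses U+E0100 (VARIATION SELECTOR-17) purely as a stand-in; whether that particular pairing is a registered variation sequence for these characters is not checked here, and `tag_every_char` is a hypothetical helper, not something any IME is known to do.

```python
# Assumed code points for the three characters in the first post.
text = "\u516b\u4eba\u5165"
vs17 = "\U000E0100"  # VARIATION SELECTOR-17, used only as an example selector


def tag_every_char(s: str, selector: str = vs17) -> str:
    """Append the same selector after every character (illustration only)."""
    return "".join(ch + selector for ch in s)


tagged = tag_every_char(text)

plain_bytes = len(text.encode("utf-8"))     # 3 bytes per ideograph
tagged_bytes = len(tagged.encode("utf-8"))  # plus 4 bytes per selector

print(plain_bytes, tagged_bytes)  # 9 21
print(len(text), len(tagged))     # 3 6 code points
```

Under these assumptions the encoded text more than doubles in size, and a post counted in code points would hit a character limit twice as fast.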