Mastodawn

This is a very funny page https://unicode-explorer.com/list/large

List of Super-Wide Symbols - Unicode Explorer

@mcc the glyph '﷽' was mentioned in a discussion I saw recently about computing character widths as *rendered* in a terminal, and the fundamental futility of this task
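
To illustrate the width problem mentioned above, here is a minimal Python sketch of the usual heuristic: count two columns for East Asian Wide or Fullwidth characters and one for everything else. The helper name `naive_width` is hypothetical and this is not how any particular terminal measures text. It only shows that the Unicode character properties available to a program carry no information about how wide a font will actually draw '﷽' (U+FDFD).

```python
import unicodedata

def naive_width(text: str) -> int:
    """Estimate terminal columns from Unicode properties alone (a heuristic)."""
    width = 0
    for ch in text:
        if unicodedata.combining(ch):
            continue  # assumption: combining marks occupy no column of their own
        # Wide (W) and Fullwidth (F) characters are assumed to take two columns.
        width += 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
    return width

basmala = "\ufdfd"
print(unicodedata.name(basmala))
print(unicodedata.east_asian_width(basmala), naive_width(basmala))
print(naive_width("漢字"), naive_width("abc"))
```

Whatever number the heuristic reports for the ligature, the glyph a terminal font draws can span many more cells, which is the mismatch the post describes.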

ROTOPE~1 ⭐️ Feb 20

@SnoopJ @mcc it might have been a mistake to put "an entire prayer, or perhaps a dozen angels dancing on the head of a pin" as a single code point

SnoopJ Feb 20

@rotopenguin @mcc alas, the Unicode Consortium has limited authority over the evolution of human language in general

mcc Feb 20

@SnoopJ @rotopenguin in my opinion, they have much more than they should

SnoopJ Feb 20

@mcc @rotopenguin oh? Anything in particular?

mcc Feb 20

@SnoopJ @rotopenguin Well, for example, if the people of China decide to invent a new hanzi, effectively now they just can't

Or they can, but they have to ask someone for permission. They'd have to do some complex set of steps with a PUA codepoint. Before computer encoding they could just draw it

SnoopJ Feb 20

@mcc @rotopenguin nothing stops them from doing it and not encoding it (e.g. seal forms) but sure the reality is that someone's gonna want to put the thing on the computer at some point, and someone's gonna be in charge of that encoding. Not sure that problem has any solution other than "fuck it all text is purely graphical now"

I'd point to U+32FF SQUARE ERA NAME REIWA as an example of UTC acting in good faith here, but I don't follow along very closely with the massive volume of communication with their colleagues working on standards bodies in China. What I have read makes it seem like a pretty good working relationship

trystimuli Feb 20

@SnoopJ @mcc @rotopenguin there is the option of looking at how folks actually go about composing new characters out of existing ones (already a thing people study) and construct an encoding for that.

that wouldn’t give complete flexibility, but it could be similar to what alphabetic language users have.

SnoopJ Feb 20

@tryst @mcc @rotopenguin it's also something Unicode already has, for languages where it is clear how to specify it without making a mess (e.g. the Hangul Jamo block and associated combiner semantics)

As the ligature above demonstrates, this is a hard problem, and almost always harder than one thinks
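
As a rough illustration of the Hangul Jamo composition mentioned above, the Python sketch below composes three conjoining jamo into one precomposed syllable, first with `unicodedata.normalize` and then with the arithmetic composition defined in the Unicode Standard. The helper name `compose_syllable` is hypothetical, and it assumes its inputs are modern conjoining jamo without checking their ranges.

```python
import unicodedata

# Constants from the Hangul syllable composition algorithm in the Unicode Standard.
S_BASE, L_BASE, V_BASE, T_BASE = 0xAC00, 0x1100, 0x1161, 0x11A7
V_COUNT, T_COUNT = 21, 28

def compose_syllable(lead: str, vowel: str, trail: str = "") -> str:
    """Compose conjoining jamo arithmetically (no range checks in this sketch)."""
    l_index = ord(lead) - L_BASE
    v_index = ord(vowel) - V_BASE
    t_index = ord(trail) - T_BASE if trail else 0
    return chr(S_BASE + (l_index * V_COUNT + v_index) * T_COUNT + t_index)

# HANGUL CHOSEONG HIEUH + JUNGSEONG A + JONGSEONG NIEUN
jamo = "\u1112\u1161\u11ab"
composed = unicodedata.normalize("NFC", jamo)

print(len(jamo), len(composed))
print(composed == compose_syllable(*jamo))
print(unicodedata.normalize("NFD", composed) == jamo)
```

The composition works here because the set of jamo and the rules for stacking them are closed and regular, which is the property hanzi components lack.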

mcc

@SnoopJ @tryst @rotopenguin so for the record this exists except (1) arguably it doesn't exist and (2) you can't add new radicals (although arguably you can co-opt radicals from anything that exists in unicode already) https://social.treehouse.systems/@rcombs/116101096263014337

Ridley @ WATCH LYCORECO (@rcombs@social.treehouse.systems)

@Elizafox @[email protected] I regret to inform you, https://en.wikipedia.org/wiki/Chinese_character_description_languages#Ideographic_Description_Sequences though afaik no implementation actually renders these sequences composed

Treehouse Mastodon

SnoopJ Feb 20

@mcc @tryst @rotopenguin oh yes def not for hanzi

trystimuli Feb 20

@mcc @SnoopJ @rotopenguin yep :) i was thinking something that captures a bit more than that (like the example in wikipedia of ⼟ vs ⼠).