This is a very funny page https://unicode-explorer.com/list/large
List of Super-Wide Symbols - Unicode Explorer

@mcc the glyph '﷽' was mentioned in a discussion I saw recently about computing character widths as *rendered* in a terminal, and the fundamental futility of this task
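
To make the width problem concrete, here is a minimal Python sketch, assuming the common heuristic of guessing terminal columns from the East Asian Width property alone. `naive_cell_width` is a hypothetical helper written for illustration, not any terminal's actual algorithm.

```python
import unicodedata

def naive_cell_width(ch: str) -> int:
    """Guess terminal columns for one code point (a heuristic, not a standard)."""
    # Combining marks and format characters are assumed to take no column.
    if unicodedata.combining(ch) or unicodedata.category(ch) in ("Mn", "Me", "Cf"):
        return 0
    # Wide and Fullwidth characters are assumed to take two columns.
    if unicodedata.east_asian_width(ch) in ("W", "F"):
        return 2
    # Everything else is assumed to take one column.
    return 1

for ch in ("a", "漢", "\uFDFD"):
    print(f"U+{ord(ch):04X}", unicodedata.east_asian_width(ch), naive_cell_width(ch))
```

The heuristic assigns U+FDFD a single column because the property data carries no notion of how wide a font actually draws the glyph, which is roughly the futility being described.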
@SnoopJ @mcc it might have been a mistake to put "an entire prayer, or perhaps a dozen angels dancing on the head of a pin" as a single code point
@rotopenguin @mcc alas, the Unicode Consortium has limited authority over the evolution of human language in general
@SnoopJ @rotopenguin in my opinion, they have much more than they should
@mcc @rotopenguin oh? Anything in particular?

@SnoopJ @rotopenguin Well, for example, if the people of China decide to invent a new hanzi, effectively now they just can't

Or they can, but they have to ask someone for permission. They'd have to do some complex set of steps with a PUA codepoint. Before computer encoding they could just draw it

@mcc @rotopenguin nothing stops them from doing it and not encoding it (e.g. seal forms) but sure the reality is that someone's gonna want to put the thing on the computer at some point, and someone's gonna be in charge of that encoding. Not sure that problem has any solution other than "fuck it all text is purely graphical now"

I'd point to U+32FF SQUARE ERA NAME REIWA as an example of UTC acting in good faith here, but I don't follow along very closely with the massive volume of communication with their colleagues working on standards bodies in China. What I have read makes it seem like a pretty good working relationship

@SnoopJ @mcc @rotopenguin there is the option of looking at how folks actually go about composing new characters out of existing ones (already a thing people study) and constructing an encoding for that.

that wouldn’t give complete flexibility, but it could be similar to what alphabetic language users have.

@tryst @mcc @rotopenguin it's also something Unicode already has, for languages where it is clear how to specify it without making a mess (e.g. the Hangul Jamo block and associated combiner semantics)

As the ligature above demonstrates, this is a hard problem, and almost always harder than one thinks
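
For the Hangul Jamo case mentioned above, a small Python sketch of what the combiner semantics look like in practice. It relies on `unicodedata.normalize` from the standard library; `compose_syllable` is a hypothetical helper that reproduces the arithmetic for precomposed syllables, and it assumes well-formed modern jamo as input.

```python
import unicodedata

# Base code points and counts for the arithmetic Hangul syllable composition.
S_BASE, L_BASE, V_BASE, T_BASE = 0xAC00, 0x1100, 0x1161, 0x11A7
V_COUNT, T_COUNT = 21, 28

def compose_syllable(lead: str, vowel: str, trail: str = "") -> str:
    """Combine leading consonant, vowel, and optional trailing consonant jamo."""
    l_index = ord(lead) - L_BASE
    v_index = ord(vowel) - V_BASE
    t_index = ord(trail) - T_BASE if trail else 0
    return chr(S_BASE + (l_index * V_COUNT + v_index) * T_COUNT + t_index)

# Three conjoining jamo: CHOSEONG HIEUH, JUNGSEONG A, JONGSEONG NIEUN.
jamo = "\u1112\u1161\u11AB"
composed = unicodedata.normalize("NFC", jamo)

print(composed, f"U+{ord(composed):04X}")   # the single precomposed syllable
print(compose_syllable(*jamo) == composed)  # the arithmetic agrees with NFC
```

This works because the set of jamo is small and closed, so every combination has a computable slot. Nothing comparable is defined for arbitrary hanzi components.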

@SnoopJ @tryst @rotopenguin so for the record this exists except (1) arguably it doesn't exist and (2) you can't add new radicals (although arguably you can co-opt radicals from anything that exists in unicode already) https://social.treehouse.systems/@rcombs/116101096263014337
Ridley @ WATCH LYCORECO

@Elizafox I regret to inform you, https://en.wikipedia.org/wiki/Chinese_character_description_languages#Ideographic_Description_Sequences though afaik no implementation actually renders these sequences composed
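
A rough Python sketch of how an Ideographic Description Sequence is structured, for anyone who has not seen one. `parse_ids` is a hypothetical helper; it assumes only the original twelve description characters (U+2FF0 to U+2FFB), of which U+2FF2 and U+2FF3 take three operands and the rest take two. It only parses the prefix notation into a tree and does no rendering.

```python
# The two description characters that take three operands.
TERNARY = {"\u2FF2", "\u2FF3"}

def is_idc(ch: str) -> bool:
    """True for the twelve original Ideographic Description Characters."""
    return "\u2FF0" <= ch <= "\u2FFB"

def parse_ids(s: str, i: int = 0):
    """Parse one description starting at index i; return (tree, next_index)."""
    if i >= len(s):
        raise ValueError("truncated description sequence")
    ch = s[i]
    if not is_idc(ch):
        # An ordinary character or radical is a leaf.
        return ch, i + 1
    arity = 3 if ch in TERNARY else 2
    parts = []
    j = i + 1
    for _ in range(arity):
        part, j = parse_ids(s, j)
        parts.append(part)
    return (ch, *parts), j

# Left-to-right layout of two components.
print(parse_ids("⿰氵木")[0])
# Descriptions nest: a top component above a left-right pair.
print(parse_ids("⿱艹⿰氵木")[0])
```

The sequence describes a layout rather than encoding a character, which fits the point that nothing seems to render these composed.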

@mcc @tryst @rotopenguin oh yes def not for hanzi
@mcc @SnoopJ @rotopenguin yep :) i was thinking something that captures a bit more than that (like the example in wikipedia of ⼟ vs ⼠).