Mastodawn

Marcin Wichary Sep 15, 2024

I built this proof of concept of a tool called https://text.makeup. It is meant to be a friendly Unicode explainer – not just for Unicode nerds, but for nerds of any kind. Useful for debugging, but also for learning.

You can go there now to play (much more fun on desktop!), but I also recorded a 5-minute video that explains it further.

I am curious: Does this feel like fun? Is it worth building out for real? What would you like to see in it if so?

Nelson Minar Sep 15, 2024

@mwichary the thing I most want to see is a per-codepoint index into the history of the writing systems. The Unicode proposals contain a remarkable amount of scholarship on writing systems, mostly buried in hard-to-find PDFs. There was a lot of excitement about ꙮ, but articles about things like Linear B or Deseret are also fascinating and hard to access.

Ted Mielczarek Sep 16, 2024

@nelson @mwichary I have long wanted Unicode charts that contain a "how did this glyph wind up in Unicode?" explainer. Even being able to track things back to "it was part of WingDings" is useful information!

Marcin Wichary Sep 16, 2024

@tedmielczarek @nelson I would actually love that part, too! I was always wondering how hard it is to research that from meeting files etc.

Ted Mielczarek

@mwichary @nelson the documents are basically all there, from when I've dug into this at times, but it's an entirely manual process to track them down. This doesn't account for symbols that came in as part of a bulk import from elsewhere, like the mysterious angzarr ⍼.

Ted Mielczarek Sep 16, 2024

@mwichary @nelson For new additions in the upcoming version of Unicode, the Pipeline page is great, as it links to sources: https://www.unicode.org/alloc/Pipeline.html

The versioned charts page for each release version is nice as it highlights additions/changes, but doesn't link sources: https://www.unicode.org/charts/PDF/Unicode-16.0/

Ted Mielczarek Sep 16, 2024

@mwichary @nelson What I don't know (and I guess I should ask someone actually involved in the Unicode standards work) is whether they maintain historical copies of that Pipeline page or whether fetching the Wayback Machine copy is all we've got. For example, https://web.archive.org/web/20240526110216/https://www.unicode.org/alloc/Pipeline.html shows the additions for Unicode 16 with references: you can follow the trail of references for "Graphic shapes for legacy computing" all the way to https://www.unicode.org/L2/L2021/21235-terminals-supplement.pdf

Nelson Minar Sep 16, 2024

@tedmielczarek @mwichary That pipeline page is nice! I recall seeing similar references in past communications going back as far as 20 years. Maybe in the details of Unicode releases, or in mailing list archives.
It is still a manual process to assemble it all. Would be a really useful scholarly project but a lot of work. (And I hear ya Marcin, if it doesn't interest you...)

Ted Mielczarek Sep 16, 2024

@nelson @mwichary it's amazing how many fractal dimensions of rabbit holes there are to fall down just within the domain of Unicode!

Marcin Wichary Sep 16, 2024

@tedmielczarek @nelson It’d be super cool to automate it so you can plug in a codepoint and it would just spit out all the results… unless that exists already.

Nelson Minar Sep 17, 2024

@mwichary @tedmielczarek I asked an expert about this back in 2015 and got this reply:

I have long wanted to create an index of all UTC/WG2 documents, which would go part way to achieving what you want, but I have not had the time to do so unfortunately. For most characters the paper trail is quite straightforward, but for some characters the history is convoluted, and you have to delve into the minutes and resolutions of UTC and WG2 meetings to determine what happened.

Marcin Wichary Sep 17, 2024

@nelson @tedmielczarek I’m interested in a) good stories, and b) stuff that helps understand other things (“how we got there”).

For example, very curious what made “symbols for legacy computing” make it in this year in particular!

Marcin Wichary Sep 16, 2024

@tedmielczarek @nelson Yeah, so far in my tool I did focus on stories that help you understand particular issues rather than stories of glyphs in general. But I’d be interested in digging deeper.