| Work | https://lontar.eu |
We just dropped two super useful articles:
- Addresses look wildly different around the world (order, details, even postal codes). This one explains the chaos and how to build forms/apps that don't frustrate international users.
- Easy guide to JavaScript's built-in Intl API, with real examples for formatting dates, numbers, currencies, etc., the right way for any language/region.
🔗 Addresses: https://www.w3.org/International/questions/qa-address-formats
🔗 Intl guide: https://www.w3.org/International/articles/intl/index
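For a quick taste of what the Intl guide covers, here is a minimal sketch (not taken from the linked article) that formats the same number, price, and date for a few locales. The helper name `formatFor`, the sample values, and the choice of locales are assumptions made for illustration; the exact output for each locale depends on the locale data shipped with your runtime.

```typescript
// Sample values, chosen only for illustration.
const amount = 1234.5;
const date = new Date(Date.UTC(2025, 0, 15)); // 15 January 2025, UTC

// Hypothetical helper: format the sample values for one locale.
function formatFor(locale: string) {
  return {
    number: new Intl.NumberFormat(locale).format(amount),
    price: new Intl.NumberFormat(locale, {
      style: "currency",
      currency: "EUR",
    }).format(amount),
    date: new Intl.DateTimeFormat(locale, {
      year: "numeric",
      month: "long",
      day: "numeric",
      timeZone: "UTC", // keep the date stable regardless of the machine's time zone
    }).format(date),
  };
}

// Grouping separators, decimal marks, currency placement, digits, and
// date order all come from the locale; nothing is hard-coded here.
for (const locale of ["en-US", "de-DE", "hi-IN"]) {
  console.log(locale, formatFor(locale));
}
```

The point of the design is that the locale tag is the only input that changes; the formatting rules themselves stay inside the API.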
I have been living inside LinkedIn and got distracted, but I'm here now to share the new project that Paley Dreier and I started at the end of last year and are taking up in full force this year. We would love to hear from you, and about the challenges you face in every area of running a foundry. We are here to make the path clearer and more exciting.
I might have called your “shaping order” the “initial glyph sequence”, but as long as we know what we mean, the name doesn’t matter that much to me. And yes, this point in shaping is important because it’s the first time the font gets to see and influence what’s going on.
No, “encoding order” is not tied to any particular step. I use the term to specify how text for a given script *should be* encoded, what we consider “correct”. An actual character sequence may not conform to the specified encoding order. That’s why the USE and some other shaping engines validate their input and insert dotted circles where text does not conform to the specified encoding order.
“The moment when layout passes from character-level processing to glyph-level processing” isn’t well-defined in the Universal Shaping Engine – see USE bug 270.
An interesting point to look at glyph order is just before application of lookups starts. By then shaping systems have processed the input character sequence in several steps:
• (optional) apply full or partial Unicode normalization
• (USE) apply specified partial Unicode normalization
• (some shaping engines, incl. USE) insert dotted circles to ensure valid clusters
• (some shaping engines, not USE) decompose, insert, reorder specific characters
• map characters to glyphs per cmap, possibly applying (de-)composition
Would that be your shaping order?
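The steps listed above can be sketched as a toy pipeline. This is only an illustration of the ordering, not the USE's actual logic or data: full NFD stands in for whatever normalization a real system applies, the validity check knows only a handful of Balinese code points, and `toyCmap`, `isBase`, `isDependentSign`, and `toyShape` are hypothetical names invented for this sketch.

```typescript
const DOTTED_CIRCLE = 0x25cc;

// Hypothetical toy cmap (code point -> glyph ID); a real font's cmap is far larger.
const toyCmap = new Map<number, number>([
  [0x25cc, 1], // dotted circle
  [0x1b13, 2], // BALINESE LETTER KA
  [0x1b3e, 3], // BALINESE VOWEL SIGN TALING
  [0x1b35, 4], // BALINESE VOWEL SIGN TEDUNG
]);

// Toy classification, limited to the Balinese block.
function isBase(cp: number): boolean {
  return cp >= 0x1b05 && cp <= 0x1b33; // independent vowels and consonants
}
function isDependentSign(cp: number): boolean {
  return cp >= 0x1b34 && cp <= 0x1b44; // rerekan, vowel signs, adeg adeg
}

function toyShape(text: string): number[] {
  // Step 1: normalization (NFD here as a stand-in, an assumption).
  const chars = Array.from(text.normalize("NFD"), (ch) => ch.codePointAt(0)!);

  // Step 2: insert a dotted circle before a dependent sign that has no base.
  const validated: number[] = [];
  for (const cp of chars) {
    const prev = validated[validated.length - 1];
    const attached =
      prev !== undefined && (isBase(prev) || isDependentSign(prev));
    if (isDependentSign(cp) && !attached) validated.push(DOTTED_CIRCLE);
    validated.push(cp);
  }

  // Step 3: map characters to glyph IDs; glyph 0 is .notdef.
  return validated.map((cp) => toyCmap.get(cp) ?? 0);
}

console.log(toyShape("\u1B13\u1B40")); // KA + TALING TEDUNG
console.log(toyShape("\u1B40")); // lone vowel sign, gets a dotted circle
```

The glyph sequence returned by `toyShape` corresponds to the point just before lookups would be applied.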
I think there should be well-defined encoding orders for Brahmic scripts shared between all stages of processing. Sadly the Unicode Standard doesn’t define such encoding orders, or defines them incompletely, or (for Khmer) defines one that’s unworkable. So for the scripts supported by the Universal Shaping Engine the encoding orders defined by the USE are currently your best bet.
I wrote in much more detail about this topic in “Order and disorder in Unicode”:
I do mean “encoding order”, the order in which characters should appear within a Unicode character sequence. The Universal Shaping Engine defines its cluster model in terms of Unicode character properties, and validates before any glyph-level processing, so it seems to agree with that.
One issue is the USE’s lack of compatibility with Unicode normalization. The USE applies some decompositions, and rendering systems may apply more steps (decomposition, reordering, composition) before passing text to the USE. However, the USE is not designed to guarantee that text is rendered identically independent of normalization. See USE bugs 905 and 568.
Do you know of other issues?
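To make the normalization point concrete, here is a minimal sketch showing that canonically equivalent text can reach a shaping engine as different character sequences. The Balinese example and the helper name `codePoints` are my own choices for illustration; the sketch says nothing about how any particular engine renders either form.

```typescript
// List the code points of a string as "U+XXXX" labels.
function codePoints(text: string): string[] {
  return Array.from(
    text,
    (ch) => "U+" + ch.codePointAt(0)!.toString(16).toUpperCase()
  );
}

// BALINESE LETTER KA + BALINESE VOWEL SIGN TALING TEDUNG (precomposed)
const precomposed = "\u1B13\u1B40";

// NFD splits the vowel sign into TALING (U+1B3E) + TEDUNG (U+1B35).
const decomposed = precomposed.normalize("NFD");

// NFC recombines the two parts into the precomposed vowel sign.
const recomposed = decomposed.normalize("NFC");

console.log(codePoints(precomposed)); // 2 code points
console.log(codePoints(decomposed)); // 3 code points
console.log(recomposed === precomposed);
```

Both sequences are canonically equivalent, so a rendering system that wants identical output has to handle either form on its own.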
I’m not familiar with “shaping order” – can you point me to a definition?
Updated article: Encoding orders of Brahmic scripts
Documents the encoding orders that the OpenType Universal Shaping Engine assumes for the Brahmic scripts it supports. Understanding encoding orders is necessary when rendering or otherwise interpreting text in these scripts, as well as when entering text using input methods or otherwise generating text.
Updated for Unicode 17.0 and latest USE data.
https://lontar.eu/en/notes/encoding-orders-of-brahmic-scripts/index.html
A smart component trick for adjusting spacing in Devanagari conjuncts while keeping automatic alignment intact in Glyphs
https://muthunedumaran.com/2025/12/30/a-spacer-trick-for-conjunct-spacing-in-glyphs/