Mastodawn

King Naga Calyo Lucere-Delphi Aug 11, 2025

Last night thanks to @orman I had the epiphany of why:

XML parsers don't use or even need regex to parse XML in the first place.

XML parsers go through the text one char at a time. When they encounter a <, >, </, or />, those chars act as flags that tell the parser whether it's entering or leaving a tag, and whether that tag is a closing or self-closing one. Each of those changes the parsing rules, and the parser builds a node tree on the fly.
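
A minimal sketch of that char-at-a-time idea in Python, under loud assumptions: `Node` and `parse` are hypothetical names made up for illustration, and attributes, comments, CDATA, entities, and error reporting are all ignored. A real XML parser does far more; this only shows the flag-driven state switching and the tree being built as the scan goes.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    # Hypothetical tree node: a tag name plus child Nodes and text strings.
    name: str
    children: list = field(default_factory=list)


def parse(text: str) -> Node:
    """Scan one char at a time; '<', '>', '</' and '/>' switch parser state."""
    root = Node("#document")
    stack = [root]   # open elements; the last one is the current parent
    buf = []         # chars collected for the current text run or tag
    in_tag = False   # True between '<' and '>'

    for ch in text:
        if not in_tag:
            if ch == "<":                    # entering a tag
                if buf:                      # flush pending text first
                    stack[-1].children.append("".join(buf))
                    buf = []
                in_tag = True
            else:
                buf.append(ch)               # ordinary text
        elif ch == ">":                      # leaving a tag
            tag = "".join(buf)
            buf = []
            in_tag = False
            if tag.startswith("/"):          # '</name>': closing tag
                if len(stack) > 1:           # name is NOT checked here
                    stack.pop()
            elif tag.endswith("/"):          # '<name/>': self-closing tag
                stack[-1].children.append(Node(tag[:-1].strip()))
            else:                            # '<name>': opening tag
                node = Node(tag.strip())
                stack[-1].children.append(node)
                stack.append(node)
        else:
            buf.append(ch)                   # chars inside the tag

    if buf and not in_tag:                   # trailing text after the last tag
        stack[-1].children.append("".join(buf))
    return root


tree = parse("<a>hi<b/><c>x</c></a>")
```

No regex appears anywhere: the only state is the `in_tag` flag and the stack of open elements. Note that this sketch pops the stack on any closing tag without comparing names, so it accepts mismatched tags.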

This will be useful for the UTC.

Spring Jo 🥚🍀 Aug 11, 2025

@dragonarchitect @orman regular grammar vs context-free grammar

King Naga Calyo Lucere-Delphi

@ShadowJonathan @orman I am actually too sleep-deprived this morning to figure out which is which. Can you clarify please?

Spring Jo 🥚🍀 Aug 11, 2025

@dragonarchitect @orman regular expressions = regular grammar, while HTML needs at least a context-free grammar to parse, or something more powerful than that :3

https://en.m.wikipedia.org/wiki/Chomsky_hierarchy
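
A small Python illustration of why nesting is the sticking point, offered as a sketch rather than a proof. It uses only Python's standard `re` module; `inner_of_first_div` is a hypothetical helper written for this example, and the `<div>` input is made up. The regex has no way to count how many tags are open, while the hand-written scan keeps an explicit depth counter.

```python
import re

nested = "<div><div>x</div></div>"

# A lazy regex stops at the first "</div>" it sees, which belongs to the
# inner element, so the captured content is cut short.
lazy = re.search(r"<div>(.*?)</div>", nested).group(1)


def inner_of_first_div(s: str) -> str:
    """Return the content of the first <div>, tracking nesting depth."""
    start = s.index("<div>") + len("<div>")
    depth, i = 1, start
    while i < len(s):
        if s.startswith("<div>", i):
            depth += 1                 # one more element is open
            i += len("<div>")
        elif s.startswith("</div>", i):
            depth -= 1                 # one element closed
            if depth == 0:             # this close matches the first open
                return s[start:i]
            i += len("</div>")
        else:
            i += 1
    raise ValueError("unbalanced <div>")


counted = inner_of_first_div(nested)
```

Here `lazy` comes out as `<div>x`, while `counted` is the full `<div>x</div>`. The counter is the extra memory that a regular grammar lacks.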

Orman Aug 11, 2025

@ShadowJonathan @dragonarchitect well-formed trees with matching end tags are actually context-sensitive IIRC, but since that's even more hairy, you just cheat: parse the tags first, then check that each end tag matches its start tag
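
One way that "cheat" could look, sketched in Python under stated assumptions: the tags have already been recognised by some earlier pass and arrive as `("open", name)` or `("close", name)` pairs. That event format and the name `tags_match` are hypothetical, chosen only for this example. The grammar stays simple, and the name comparison happens as a separate check with a stack.

```python
def tags_match(events) -> bool:
    """Check that every close event names the most recently opened tag.

    `events` is an iterable of ("open", name) or ("close", name) pairs,
    a made-up format standing in for whatever the parser emits.
    """
    stack = []
    for kind, name in events:
        if kind == "open":
            stack.append(name)             # remember which tag is open
        elif kind == "close":
            if not stack or stack.pop() != name:
                return False               # stray or mismatched end tag
    return not stack                       # leftover opens are unclosed tags


good = tags_match([("open", "a"), ("open", "b"), ("close", "b"), ("close", "a")])
bad = tags_match([("open", "a"), ("open", "b"), ("close", "a"), ("close", "b")])
```

With these inputs `good` is `True` and `bad` is `False`, since `</a>` arrives while `<b>` is still the innermost open tag.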