Last night thanks to @orman I had the epiphany of why:

XML parsers don't use or even need regex to parse XML in the first place.

XML parsers go through the text one char at a time. When they hit a <, >, </, or />, those chars act as flags: < and > tell the parser it's entering or leaving a tag, while </ and /> mark the tag as a closing or self-closing one. Each of those changes the parsing rules, and the parser builds a node tree on the fly.
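
A minimal sketch of that char-at-a-time loop, not how any real XML library does it. The names `Node` and `parse` are made up for illustration, and it assumes bare tag names only (no attributes, comments, CDATA, or entities) and does not yet check that end tag names match.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    text: str = ""
    children: list = field(default_factory=list)


def parse(src: str) -> Node:
    """Hypothetical toy parser: walk the text one char at a time."""
    root = Node("#document")
    stack = [root]      # the last item is the node we are currently inside
    in_tag = False      # flag flipped by '<' and '>'
    buf = []            # chars collected since the last flag

    for ch in src:
        if ch == "<" and not in_tag:
            # Entering a tag: whatever we collected was text content.
            stack[-1].text += "".join(buf).strip()
            buf = []
            in_tag = True
        elif ch == ">" and in_tag:
            # Leaving a tag: decide which kind it was.
            tag = "".join(buf).strip()
            buf = []
            in_tag = False
            if tag.startswith("/"):
                # Closing tag: step back up (assumption: name is not checked here).
                if len(stack) > 1:
                    stack.pop()
            elif tag.endswith("/"):
                # Self-closing tag: add a leaf, stay at the same depth.
                stack[-1].children.append(Node(tag[:-1].strip()))
            else:
                # Opening tag: add a child and descend into it.
                node = Node(tag)
                stack[-1].children.append(node)
                stack.append(node)
        else:
            buf.append(ch)

    return root


doc = parse("<a><b>hi</b><c/></a>")
a = doc.children[0]
print(a.name, [child.name for child in a.children], a.children[0].text)
```

The stack is what lets the tree be built on the fly: opening tags push, closing tags pop, and self-closing tags do neither.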

This will be useful for the UTC.

@dragonarchitect @orman regular grammar vs context-free grammar

:3

@ShadowJonathan @orman I am actually too sleep-deprived this morning to figure out which is which. Can you clarify please?

@dragonarchitect @orman regular expressions = regular grammars, while HTML can only be parsed by context-free grammars, or something more complex than that :3

https://en.m.wikipedia.org/wiki/Chomsky_hierarchy
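
A small sketch of why the distinction matters for nested tags, assuming a single hard-coded `<b>` tag for simplicity. The helper `matching_close` is a made-up name. Python's `re` has no recursion construct, so a plain pattern has no way to count nesting depth, while a one-variable counter can.

```python
import re

nested = "<b>outer <b>inner</b> tail</b>"
siblings = "<b>one</b> and <b>two</b>"

# Lazy match stops at the first </b>, cutting the outer element short.
lazy = re.search(r"<b>.*?</b>", nested).group(0)

# Greedy match runs to the last </b>, swallowing both sibling elements.
greedy = re.search(r"<b>.*</b>", siblings).group(0)


def matching_close(src: str, start: int = 0) -> int:
    """Return the index just past the </b> that closes the <b> at `start`.

    Hypothetical helper: tracks nesting depth with a counter, which a
    plain regular expression cannot do. Returns -1 if nothing closes it.
    """
    depth = 0
    i = start
    while i < len(src):
        if src.startswith("<b>", i):
            depth += 1
            i += 3
        elif src.startswith("</b>", i):
            depth -= 1
            i += 4
            if depth == 0:
                return i
        else:
            i += 1
    return -1


print(lazy)
print(greedy)
print(nested[:matching_close(nested)])
print(siblings[:matching_close(siblings)])
```

Neither the lazy nor the greedy pattern gets both inputs right, but the counter does.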

@ShadowJonathan @dragonarchitect well-formed trees with matching end tags are actually context-sensitive IIRC, but since that's even more hairy you just cheat and check if the end tag matches the start after parsing
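
A rough sketch of that cheat, under the assumption that the input is already split into bare tag strings like `a`, `/a`, and `br/`. The names `Element`, `parse_loose`, and `mismatches` are invented for illustration. The first pass nests elements purely by position and just records whatever name the end tag carried; a second pass compares the two names.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Element:
    open_name: str
    close_name: Optional[str] = None   # filled in when an end tag is seen
    children: list = field(default_factory=list)


def parse_loose(tags: list) -> Element:
    """Pass 1: build the tree by nesting only, ignoring end tag names."""
    root = Element("#document", "#document")
    stack = [root]
    for tag in tags:
        if tag.startswith("/"):
            # Any end tag closes the innermost open element, whatever its name.
            if len(stack) > 1:
                stack.pop().close_name = tag[1:]
        elif tag.endswith("/"):
            name = tag[:-1]
            stack[-1].children.append(Element(name, name))
        else:
            element = Element(tag)
            stack[-1].children.append(element)
            stack.append(element)
    return root


def mismatches(element: Element) -> list:
    """Pass 2: collect (open_name, close_name) pairs that disagree."""
    found = []
    if element.close_name != element.open_name:
        found.append((element.open_name, element.close_name))
    for child in element.children:
        found.extend(mismatches(child))
    return found


print(mismatches(parse_loose(["a", "b", "/b", "br/", "/a"])))
print(mismatches(parse_loose(["a", "b", "/c", "/a"])))
```

An element that never gets an end tag keeps `close_name` as `None`, so it shows up in the same mismatch list.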