Writing my first bottom-up parser. I want my XML lexer to emit the doctype as one token, but to do that I need to parse the internal subset -> markup decl -> element decl -> content spec -> children, which has
[47] children ::= (choice | seq) ('?' | '*' | '+')?
[48] cp ::= (Name | choice | seq) ('?' | '*' | '+')?
[49] choice ::= '(' S? cp ( S? '|' S? cp )+ S? ')'
[50] seq ::= '(' S? cp ( S? ',' S? cp )* S? ')'
as its grammar. Notice the recursion: cp can contain choice or seq, which in turn contain cp. I would normally use a recursive descent parser, but since I'm using Rust's coroutines I can't have recursive coroutines (as far as I am aware).
I'm using coroutines because this is a streaming parser meant for embedded systems with very little memory. At any point I could run out of input, which is when I yield back up to get more. My previous iteration of this was a massive state machine, essentially implementing coroutines from scratch.
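To make the problem concrete, here is a minimal sketch of one way the recursion in productions [47] to [50] could be replaced by an explicit, fixed-size stack, so that a single non-recursive routine can stop whenever input runs out and resume later. It uses plain stable Rust rather than coroutines, and everything in it is an assumption for illustration: the names `Children`, `State`, `Error`, `feed` and `MAX_DEPTH` are hypothetical, the nesting limit of 8 is arbitrary, and Name matching is simplified to ASCII. It only recognizes the `children` production; it does not build a tree.

```rust
const MAX_DEPTH: usize = 8; // assumed nesting bound, chosen arbitrarily

#[derive(Debug, PartialEq)]
enum Error {
    TooDeep,
    Unexpected(u8),
}

#[derive(Clone, Copy)]
enum State {
    Start,     // expecting the opening '(' of children
    ExpectCp,  // after '(' or a separator: S?, then Name or '('
    InName,    // inside a Name
    AfterItem, // right after a Name or ')': '?', '*' or '+' may follow
    AfterCp,   // after a complete cp: S?, then a separator or ')'
}

struct Children {
    state: State,
    seps: [u8; MAX_DEPTH], // separator used by each open group, 0 = undecided
    depth: usize,
}

// Simplification: ASCII-only approximation of the XML Name productions.
fn is_name_start(b: u8) -> bool {
    b.is_ascii_alphabetic() || b == b'_' || b == b':'
}

fn is_name_char(b: u8) -> bool {
    is_name_start(b) || b.is_ascii_digit() || b == b'-' || b == b'.'
}

impl Children {
    fn new() -> Self {
        Children { state: State::Start, seps: [0; MAX_DEPTH], depth: 0 }
    }

    // Push a new group onto the explicit stack instead of recursing.
    fn open(&mut self) -> Result<State, Error> {
        if self.depth == MAX_DEPTH {
            return Err(Error::TooDeep);
        }
        self.seps[self.depth] = 0;
        self.depth += 1;
        Ok(State::ExpectCp)
    }

    /// Ok(None): chunk exhausted, call again with more input.
    /// Ok(Some(n)): `children` ended after n bytes of this chunk.
    fn feed(&mut self, chunk: &[u8]) -> Result<Option<usize>, Error> {
        let mut i = 0;
        while i < chunk.len() {
            let b = chunk[i];
            let ws = matches!(b, b' ' | b'\t' | b'\r' | b'\n');
            let occ = matches!(b, b'?' | b'*' | b'+');
            let state = self.state;
            self.state = match state {
                State::Start if b == b'(' => self.open()?,
                State::ExpectCp if ws => State::ExpectCp,
                State::ExpectCp if b == b'(' => self.open()?,
                State::ExpectCp if is_name_start(b) => State::InName,
                State::InName if is_name_char(b) => State::InName,
                State::InName => {
                    // Name ended: look at this byte again as "after an item".
                    self.state = State::AfterItem;
                    continue;
                }
                State::AfterItem if self.depth == 0 => {
                    // Top-level group is closed; consume an optional indicator.
                    return Ok(Some(if occ { i + 1 } else { i }));
                }
                State::AfterItem if occ => State::AfterCp,
                State::AfterItem => {
                    self.state = State::AfterCp;
                    continue;
                }
                State::AfterCp if ws => State::AfterCp,
                State::AfterCp if b == b'|' || b == b',' => {
                    // A group may not mix '|' and ','.
                    let sep = &mut self.seps[self.depth - 1];
                    if *sep != 0 && *sep != b {
                        return Err(Error::Unexpected(b));
                    }
                    *sep = b;
                    State::ExpectCp
                }
                State::AfterCp if b == b')' => {
                    self.depth -= 1; // pop the group
                    State::AfterItem
                }
                _ => return Err(Error::Unexpected(b)),
            };
            i += 1;
        }
        Ok(None)
    }
}
```

With this sketch, feeding `(a, (b | c)*, d` returns `Ok(None)`, and a second call with `)+ >` returns `Ok(Some(2))`, meaning the production ended after the `+`. The whole parse state is one enum value, `MAX_DEPTH` bytes and a depth counter, which is the same information a recursive descent parser would keep on the call stack. The trade-off is that this is still a hand-written state machine, just a small one scoped to the recursive part of the grammar.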
#rust #embedded #coroutines #xml