New blog post by Hannes Mühleisen and Mark Raasveldt:
Runtime-Extensible SQL Parsers Using PEG
https://duckdb.org/2024/11/22/runtime-extensible-parsers
This post, a shortened version of a CIDR 2025 paper, discusses how DBMS parsers could be re-designed using Parsing Expression Grammars (PEG) to make them extensible and to improve error reporting.
Despite their central role in processing queries, parsers have not received any noticeable attention in the data systems space. State-of-the-art systems are content with decades-old parser generators. These generators create monolithic, inflexible and unforgiving parsers that hinder innovation in query languages and frustrate users. Instead, parsers should be rewritten using modern abstractions like Parsing Expression Grammars (PEG), which allow dynamic changes to the accepted query syntax and better error recovery. In this post, we discuss how parsers could be re-designed using PEG, and validate our recommendations using experiments for both effectiveness and efficiency.
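
To make "dynamic changes to the accepted query syntax" concrete, here is a minimal sketch of the idea in Python. It is not DuckDB's implementation or grammar: the rule names (`statement`, `select`, `unpivot`, `ident`), the toy syntax, and the helpers `match` and `accepts` are all hypothetical. It only illustrates that a PEG can be held as plain data and interpreted at runtime, so adding a rule needs no parser regeneration.

```python
import re

# Hypothetical toy grammar: rule name -> ordered list of alternatives.
# Each alternative is a sequence of items; an item is either another
# rule's name or a regex terminal written as "/.../".
GRAMMAR = {
    "statement": [["select"]],
    "select": [["/SELECT/", "ident", "/FROM/", "ident"]],
    "ident": [["/[a-z_]+/"]],
}

def match(grammar, rule, text, pos=0):
    """Return the position after `rule` matches at `pos`, or None."""
    for alternative in grammar[rule]:  # PEG ordered choice: first match wins
        cur = pos
        for item in alternative:
            if item.startswith("/"):
                # Terminal: skip leading whitespace, then match the regex.
                m = re.compile(r"\s*" + item[1:-1]).match(text, cur)
                cur = m.end() if m else None
            else:
                # Non-terminal: recurse into the referenced rule.
                cur = match(grammar, item, text, cur)
            if cur is None:
                break  # this alternative failed; try the next one
        if cur is not None:
            return cur
    return None

def accepts(grammar, text):
    """True if `statement` matches the whole input."""
    end = match(grammar, "statement", text)
    return end is not None and text[end:].strip() == ""

print(accepts(GRAMMAR, "UNPIVOT t"))  # False: syntax not known yet

# Extend the grammar at runtime, as an extension might do when loaded.
GRAMMAR["unpivot"] = [["/UNPIVOT/", "ident"]]
GRAMMAR["statement"].append(["unpivot"])

print(accepts(GRAMMAR, "UNPIVOT t"))  # True: new rule is picked up
```

Because the grammar is only consulted while parsing, the second call sees the new alternative immediately. A parser produced ahead of time by a generator would instead have to be regenerated and recompiled to accept the new statement.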