Usually, when faced with a performance problem, you would look for a smart way to do less work. Today we do not do that. Today we sprinkle #[inline] all over the code until it tastes good.
Welp, that was the trick. The GAT-enabled version of nom gets a separate function for each combinator unless I explicitly tell rustc to inline them (which happened automatically with nom 7).
Now perf is on par with or a little better than nom 7 for most parsers, and far better for those with more complex error types.
The point of that work was to see what happens if we just avoid generating output values or errors that we know will be ignored. Right now it seems viable, and I expect it will bring huge gains with a bit more work.
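To make that idea concrete, here is a minimal sketch of my own (hypothetical names, not nom's actual API): a GAT-based "mode" type parameter decides at compile time whether a value gets built at all.

```rust
// Hypothetical sketch, not nom's real API: the mode decides whether
// a value is materialized or skipped entirely.
trait OutputMode {
    // GAT: the produced type depends on the mode.
    type Output<T>;
    // Only run the closure when the mode actually needs the value.
    fn wrap<T>(f: impl FnOnce() -> T) -> Self::Output<T>;
}

// Emit: the caller wants the value, so build it as usual.
struct Emit;
impl OutputMode for Emit {
    type Output<T> = T;
    fn wrap<T>(f: impl FnOnce() -> T) -> T {
        f()
    }
}

// Check: the caller will ignore the value, so never construct it.
struct Check;
impl OutputMode for Check {
    type Output<T> = ();
    fn wrap<T>(_f: impl FnOnce() -> T) {}
}

fn main() {
    // In Emit mode the value is built...
    let v: Vec<u32> = <Emit as OutputMode>::wrap(|| (0..3).collect());
    assert_eq!(v, vec![0, 1, 2]);
    // ...in Check mode the closure is never even called, so the
    // allocation below never happens.
    let _: () = <Check as OutputMode>::wrap(|| vec![0u8; 1_000_000]);
    println!("ok");
}
```

Since `Check` never runs the closure, a combinator can ask its child parser "did you match?" without paying for an output or error value it would throw away.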
(a few more #[inline(always)] later...) I just noticed a very interesting result. This is a screenshot of JSON benchmark results vs nom 7. The non-verbose (small error type) version can be slightly faster, and the verbose version is way faster than in nom 7, great.
But what's incredible here is that the verbose version is now on par with the fast one!
You can get a complex error type with minimal overhead!!
Now I just need to get a more ergonomic error type, and that new version should be amazing.
I think I have a rough plan for how errors will change; it's inspired by suggestions from here and there.
The main point: nom's errors are there for parser control flow, that's not the right place to build nice error messages.
So it will accumulate errors elsewhere, likely in a wrapper for the input type.
So we need:
- spans (nom_locate, etc)
- statefulness
- catch and convert parser errors to user friendly errors
- error recovery
- a good example error type leveraging it
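To make the "statefulness" and "accumulate errors elsewhere" points concrete, here is a hypothetical sketch (names made up, not nom's API) of an input wrapper that collects user-friendly diagnostics on the side, while parser errors stay lightweight control flow:

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Hypothetical stateful input wrapper: the remaining input plus a
// shared accumulator for user-facing diagnostics.
#[derive(Clone)]
struct Stateful<'a> {
    input: &'a str,
    errors: Rc<RefCell<Vec<String>>>,
}

impl<'a> Stateful<'a> {
    fn new(input: &'a str) -> Self {
        Stateful {
            input,
            errors: Default::default(),
        }
    }

    // Record a user-friendly diagnostic without failing the parse,
    // which is what error recovery needs.
    fn report(&self, msg: impl Into<String>) {
        self.errors.borrow_mut().push(msg.into());
    }
}

fn main() {
    let s = Stateful::new("{ \"a\": 1, }");
    // A recovery step could note the trailing comma and keep parsing.
    s.report("trailing comma before '}'");
    assert_eq!(s.errors.borrow().len(), 1);
    println!("ok, input left: {}", s.input);
}
```

Because every clone of the wrapper shares the same accumulator, combinators deep in the parser can report diagnostics that survive even when the local parse branch fails and backtracks.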
@geal The more time is spent on parsing, the less impact the error type has? Hmm, that's interesting. Can you try with an even larger file?
@hywan what happens here is that errors that would end up ignored, like in the opt combinator, are simply not generated. That accounts for a lot of the work nom performs with complex error types. And here the canada.json file is 2.15MB, which is already large.
So with this change, complex error types have minimal overhead when there's no error
@geal How does it compare to serde_json now?
@hywan way slower. That benchmark is there to compare nom versions against each other, not really against other tools. Some parts of the parser are intentionally slow.

@geal are the codegen issues/fixes (like "separate function for each combinator" and the various inline attributes) avoidable with different compiler flags? e.g. with LTO, a single CGU, etc.

If that GAT-enabled pattern becomes popular outside of chumsky and nom, it would be nice if rustc's codegen were at the very least not subpar. Or if the issues could be confined to a single point, like the crates themselves, so that none of their downstream users run into them as well.

@lqd I have not tried other compilation options; I should test them. Here I mainly tag combinators that should obviously be inlined, like opt or map.
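For illustration, here is a toy, much-simplified `opt` (my own sketch; in real nom the inline attributes sit on the `Parser` trait impls) showing why such a thin wrapper wants `#[inline(always)]`: once inlined it disappears entirely, and it is also a combinator whose child's error is always discarded.

```rust
// Toy parser signature: input in, (rest, output) out, or the input
// back on failure. This is a simplification, not nom's real types.
#[inline(always)]
fn opt<I: Clone, O>(
    mut parser: impl FnMut(I) -> Result<(I, O), I>,
) -> impl FnMut(I) -> Result<(I, Option<O>), I> {
    move |input: I| match parser(input.clone()) {
        Ok((rest, o)) => Ok((rest, Some(o))),
        // The child's error is thrown away: opt always succeeds,
        // so generating a rich error value here is pure waste.
        Err(_) => Ok((input, None)),
    }
}

fn main() {
    // Toy child parser: consume a leading 'a'.
    let a = |i: &'static str| i.strip_prefix('a').map(|r| (r, 'a')).ok_or(i);
    let mut maybe_a = opt(a);
    assert_eq!(maybe_a("abc"), Ok(("bc", Some('a'))));
    assert_eq!(maybe_a("xyz"), Ok(("xyz", None)));
    println!("ok");
}
```

With the wrapper inlined away, the optimizer sees one flat parser body instead of a call per combinator layer, which is roughly what nom 7's non-GAT code shape gave it for free.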