i love it when my program's execution conditions are "still active if unsure"
what in the fuck kind of world have we arrived at where, in the optimal case where the "program" works fully as intended, the program is "RUNNING" as long as the execution environment is "NOT SURE" (????????) whether the program is "RUNNING"
this is what counts as benchmarking, with code links because this shit makes literally no sense and boils your brain if you try and read it:
validate_email, is_valid_email, etc. if any of those names is defined, get the function by fucking evaling the name. globals() DICT AND SEE IF THAT IS AN EMAIL VALIDATION FUNCTION
there is an as-yet unmerged PR to "fix the correctness benchmarks" and a "robustness audit" that is wonderful:
https://github.com/DietrichGebert/ponytail/pull/83

Two-part response to #65 ("Impact on model performance?"). Part 1 — fix the correctness gate Two bugs in the correct gate were under-reporting correctness for terse models — the likely so...
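
for anyone who can't picture the name-guessing lookup described above, here's a minimal sketch of the general pattern. this is NOT the repo's actual code: `CANDIDATE_NAMES`, `find_validator`, and the sample `generated_source` are all made up for illustration, the "etc." names are left out because i'm not going to guess them, and it uses a plain dict lookup where the post describes eval:

```python
# Hypothetical sketch of "guess the function name, then fish it out of a namespace".
# None of these names come from the linked repo.

# Only the two names mentioned in the post; the "etc." is deliberately not filled in.
CANDIDATE_NAMES = ["validate_email", "is_valid_email"]


def find_validator(namespace):
    """Return the first callable in `namespace` whose name is a candidate, else None."""
    for name in CANDIDATE_NAMES:
        fn = namespace.get(name)  # plain dict lookup, no eval of the name
        if callable(fn):
            return fn
    return None


# Stand-in for model-generated code that happens to use one of the guessed names.
generated_source = '''
def is_valid_email(s):
    return "@" in s
'''

namespace = {}
exec(generated_source, namespace)  # run the generated code into its own dict

validator = find_validator(namespace)
found_name = validator.__name__ if validator else None
```

the point of the sketch: the "check" only works if the generated code happened to pick one of the guessed names. anything named differently comes back as `None`, no matter how correct it is.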