i'm at a loss for words after reading a paper about reformatting code using an ML model that has a measured statistical quantity A_c which says how often the reformatted code behaves the same as the original

the "ideal" (their choice of words) case is 64.2%

edit: this got popular without me really intending it to, so here's why i'm reading research: i want a semantic style transfer tool that can automatically format a patch "the same as the rest of the file / rest of codebase is formatted" without the rigidity involved in black or rustfmt that i find so hostile to my workflow that i refuse to use them. obviously, i want a tool that generates semantically equivalent code 100.0% of the time (ignoring source locations or reading from __file__)

@whitequark i can imagine a few cases where reformatting code could change behavior (mostly related to language constructs that capture source locations) so I think I would be willing to accept as low as 99.99%
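a minimal C++ sketch of the source-location case, for anyone who wants to see it happen. the function names `compact` and `expanded` are made up for illustration; the only real construct here is the standard `__LINE__` macro. both functions contain the same tokens and differ only in where the line breaks are, which is exactly the kind of change a formatter makes:

```cpp
#include <cassert>

// "Original" layout: both captures sit on the same source line,
// so __LINE__ expands to the same number twice.
bool compact() { int a = __LINE__; int b = __LINE__; return a == b; }

// "Reformatted" layout: identical tokens, one statement per line,
// so the two __LINE__ expansions now differ by one.
bool expanded() {
    int a = __LINE__;
    int b = __LINE__;
    return a == b;
}

// Hypothetical helper: true when the two layouts behave differently.
bool layouts_disagree() {
    return compact() != expanded();
}
```

calling `layouts_disagree()` returns true: `compact()` reports matching line numbers and `expanded()` does not, even though no token changed.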
@porglezomp you'll love Fig. 6
@porglezomp there's explanatory text that says the issue with the identifier "found" is that it's rarely used
@whitequark @porglezomp also cout << "\n"; is not equivalent to << endl; endl flushes the stream. "\n" does not. They'd need to write << "\n" << flush; to get the same behavior. Which is annoying to write, which is precisely why endl exists!
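a small sketch of the flush difference, assuming a custom buffer is an acceptable stand-in for a real terminal or file. `CountingBuf` and the three `flushes_with_*` functions are hypothetical names for this example; `std::endl`, `std::flush`, and `std::stringbuf::sync` are the standard library pieces being exercised. the buffer just counts how often the stream asks it to synchronize:

```cpp
#include <cassert>
#include <ostream>
#include <sstream>

// A string buffer that counts how many times the stream flushes it.
class CountingBuf : public std::stringbuf {
public:
    int flushes = 0;

protected:
    // ostream::flush() ends up here via pubsync().
    int sync() override {
        ++flushes;
        return std::stringbuf::sync();
    }
};

// Writes one line terminated by std::endl and reports the flush count.
int flushes_with_endl() {
    CountingBuf buf;
    std::ostream out(&buf);
    out << "hello" << std::endl;
    return buf.flushes;
}

// Same text, terminated by a plain newline character.
int flushes_with_newline() {
    CountingBuf buf;
    std::ostream out(&buf);
    out << "hello" << "\n";
    return buf.flushes;
}

// Plain newline followed by an explicit flush, the spelled-out form of endl.
int flushes_with_newline_and_flush() {
    CountingBuf buf;
    std::ostream out(&buf);
    out << "hello" << "\n" << std::flush;
    return buf.flushes;
}
```

the three functions write the same characters, but only the `std::endl` and `"\n" << std::flush` versions reach `sync()`. whether that difference is observable in a given program depends on what is on the other end of the stream, which is why a rewrite between the two forms can't be assumed safe.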

@jonathankoren @whitequark @porglezomp I just thought I’d take a moment to say that I thought that i was barely coping with things today, but thanks to figure 6 I clearly am not.

I’m not even going to ask about a use case. Taa for now I’m off to disintegrate.