i'm at a loss for words after reading a paper about reformatting code using an ML model. it reports a measured statistical quantity, A_c, which says how often the reformatted code behaves the same as the original

the "ideal" (their choice of words) case is 64.2%

edit: this got popular without me really intending it to, so here's why i'm reading this research: i want a semantic style transfer tool that can automatically format a patch "the same as the rest of the file / rest of the codebase is formatted", without the rigidity of black or rustfmt, which i find so hostile to my workflow that i refuse to use them. obviously, i want a tool that generates semantically equivalent code 100.0% of the time (ignoring source locations or reading from __file__)

@whitequark i can imagine a few cases where reformatting code could change behavior (mostly related to language constructs that capture source locations), so I think I would be willing to accept as low as 99.99%
@porglezomp you'll love Fig. 6
@porglezomp there's explanatory text that says the issue with the identifier "found" is that it's rarely used
@whitequark I really love “not changing endl to \n” listed as a style issue when that changes buffer flushing behavior.
@porglezomp did you notice that one of the `if` tokens is capitalized in the output?
@whitequark @porglezomp also `cout << "\n";` is not equivalent to `<< endl;`. endl flushes the stream; "\n" does not. They'd need to write `<< "\n" << flush;` to get the same behavior. Which is annoying to write, which is precisely why endl exists!

@jonathankoren @whitequark @porglezomp I just thought I’d take a moment to say that I thought that I was barely coping with things today, but thanks to figure 6 I clearly am not.

I’m not even going to ask about a use case. Ta for now, I’m off to disintegrate.

@whitequark @porglezomp I'm spitting out my drink at j++ → j--. Holy shit.
@xgranade @whitequark @porglezomp
I think the right is the model's output (with the center being the "desired output"). So it's not changing the semantics of the loop, just not changing the loop order to match their desired outcome.

Given that loop order can have behavioral impact (and I would never trust an LLM to be able to tell if it did), that seems like the correct behavior to me, though.
@xgranade @whitequark @porglezomp
I think reversing the `j` for loop is actually what they wanted? It's labelled "ground truth", and it is a potentially valid optimisation
@sabik @xgranade @whitequark @porglezomp but they also changed the boundaries! "Input" checks all values from 2 to i+2 inclusive, but "ground truth" just throws the i+2 iteration out.
@IngaLovinde @xgranade @whitequark @porglezomp
`i` starts from 1 in the "ground truth" version
@whitequark @porglezomp This looks like it could join the current crop of "DLSS5 off/DLSS5 on" memes.