i'm at a loss for words after reading a paper about reformatting code using an ML model that has a measured statistical quantity A_c which says how often the reformatted code behaves the same as the original

the "ideal" (their choice of words) case is 64.2%

edit: this got popular without me really intending it to, so here's why i'm reading research: i want a semantic style transfer tool that can automatically format a patch "the same as the rest of the file / rest of codebase is formatted" without the rigidity involved in black or rustfmt, which i find so hostile to my workflow that i refuse to use them. obviously, i want a tool that generates semantically equivalent code 100.0% of the time (ignoring source locations or reading from __file__)

this isn't satire, this is real research published by IEEE/ACM

@whitequark So let me get this straight, IEEE thinks you should count it as a win if rewriting your code by vibing it has odds of behaving the same that are less than 15 percentage points better than a literal coinflip?

edited for clarity and to fix a typo

@disorderlyf @whitequark IEEE and ACM don't do the research, nor do they tell you what you should do; they are publishers that own the journals and conferences where researchers publish their work
@urixturing @disorderlyf yeah. there are other issues with their models but this isn't one
@urixturing @whitequark I initially thought IEEE was like a standards body specifically for networking, like a hardware W3C. Regardless of who did the research, I thought this was their conclusion. It sounds like I was wrong on both counts
@disorderlyf @whitequark that would be the IETF, who publishes the RFCs (networking standards like email or HTTP)
@disorderlyf @whitequark but honestly I understand why it's very confusing

@whitequark @danlyke so … by "reformatted" I assume you mean aesthetically tidied up, with no change in functionality required?

If I got that right: wtf?

@deborahh @danlyke this is what a reasonable person would understand to be "code style", yes
@whitequark @deborahh @danlyke ie, the sort of thing a linter does?
@nxskok @whitequark @deborahh @danlyke to be fair, according to the paper, replacing for with while loops and vice versa and the like was also the goal
@hennichodernich @danlyke @whitequark @deborahh @nxskok but like wouldn't that be easy to implement?
like

for (init; condition; update) that would turn into
init; while (condition) { /* body, with update copied to the end and in front of every continue */ }
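
a minimal sketch of the for-to-while rewrite described above, in C. the function names (`sum_odd_for`, `sum_odd_while`, `check_rewrite`) are made up for illustration and are not from the paper. the point it tries to show: the third clause of the `for` has to be copied in front of every `continue` as well as to the end of the body, and the loop variable needs its own enclosing block to keep the same scope. a rule-based tool can do this mechanically, but it is more than a textual swap.

```c
#include <assert.h>

/* Original: sums the odd numbers below n, skipping evens with continue. */
static int sum_odd_for(int n) {
    int sum = 0;
    for (int i = 0; i < n; i++) {
        if (i % 2 == 0)
            continue;   /* still runs i++ before re-testing i < n */
        sum += i;
    }
    return sum;
}

/* Hand-written while rewrite of the loop above (illustrative only). */
static int sum_odd_while(int n) {
    int sum = 0;
    {                   /* extra block: keeps i scoped to the loop */
        int i = 0;
        while (i < n) {
            if (i % 2 == 0) {
                i++;    /* update copied in front of the continue;   */
                continue; /* without it this loop would never end    */
            }
            sum += i;
            i++;        /* update copied to the end of the body */
        }
    }
    return sum;
}

/* Spot check on a range of inputs. This is testing, not a proof of
   equivalence. */
static void check_rewrite(int limit) {
    for (int n = 0; n <= limit; n++)
        assert(sum_odd_for(n) == sum_odd_while(n));
}
```

calling `check_rewrite(100)` from a `main` should pass silently. dropping the `i++` in front of the `continue` makes `sum_odd_while` hang on the first even `i`, which is the kind of behavior change a rewrite tool must not introduce.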

@deborahh @whitequark @danlyke

No.

"there is no existing work that performs full stylization on an arbitrary piece of code. The most common methods are rule-based linters, formatters, which are limited to a few pre-defined style rules"

@mrkeen @deborahh @danlyke I do think that stretching the definition of what "code style" could reasonably refer to until it fits the shape of the research product is a part of the problem here. (Consider that the introduction explicitly refers to the gotofail bug as something the research is supposed to help with, whereas it is plainly evident that it would make that problem only worse.)