i'm at a loss for words after reading a paper about reformatting code using an ML model that has a measured statistical quantity A_c which says how often the reformatted code behaves the same as the original

the "ideal" (their choice of words) case is 64.2%

edit: this got popular without me really intending to, so here's why i'm reading research: i want a semantic style transfer tool that can automatically format a patch "the same as the rest of the file / rest of codebase is formatted" without the rigidity involved in black or rustfmt that i find so hostile to my workflow that i refuse to use them. obviously, i want a tool that generates semantically equivalent code 100.0% of the time (ignoring source locations or reading from __file__)
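
One conservative way to check "semantically equivalent, ignoring source locations" for Python is to compare parsed ASTs, since `ast.dump` leaves out line and column attributes by default. This is only a minimal sketch and not the paper's A_c metric: the helper name `same_ast` and the sample snippets are made up for illustration, and identical ASTs are sufficient for equivalence, not necessary, so a behaviour-preserving rewrite can still be rejected.

```python
import ast


def same_ast(original: str, reformatted: str) -> bool:
    """Return True if both sources parse to the same AST, ignoring positions.

    Hypothetical helper: ast.dump() omits lineno/col_offset by default,
    so pure layout changes (spacing, indentation width) compare equal.
    """
    return ast.dump(ast.parse(original)) == ast.dump(ast.parse(reformatted))


# Illustrative snippets only; they are not taken from the paper.
original = "for j in range(2, n):\n    total += j\n"
restyled = "for j in range( 2 , n ) :\n        total += j\n"
reversed_loop = "for j in range(n - 1, 1, -1):\n    total += j\n"

print(same_ast(original, restyled))       # layout-only change
print(same_ast(original, reversed_loop))  # different loop, different AST
```

A check like this accepts only layout changes, so a rewrite such as reversing a loop is flagged even in cases where it happens to behave the same.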

cassie 18h ago

@whitequark i can imagine a few cases where reformatting code could change behavior (mostly related to language constructs that capture source locations) so I think I would be willing to accept as low as 99.99%

✧✦Catherine✦✧ 18h ago

@porglezomp you'll love Fig. 6

Cassandra is only carbon now 18h ago

@whitequark @porglezomp I'm spitting out my drink at j++ → j--. Holy shit.

sabik 16h ago

@xgranade @whitequark @porglezomp
I think reversing the `j` for loop is actually wanted by them? It's labelled "ground truth", and it is a potentially valid optimisation

Inga stands with 🇺🇦 🇵🇸 8h ago

@sabik @xgranade @whitequark @porglezomp but they also changed the boundaries! "Input" checks all values from 2 to i+2 inclusive; but "ground truth" just throws the i+2 iteration out.

sabik

@IngaLovinde @xgranade @whitequark @porglezomp
`i` starts from 1 in the "ground truth" version

Inga stands with 🇺🇦 🇵🇸 8h ago

@sabik @xgranade @whitequark @porglezomp ah I see, so the new i is just the old one + 1
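
The index shift described in the replies above can be checked with a small sketch. Fig. 6 is not reproduced in this thread, so the loop shapes below are a hypothetical reconstruction from the replies only: an "Input" style where `i` starts at 0 and `j` ascends over 2..i+2 inclusive, and a "ground truth" style where `i` starts at 1 and `j` descends from i+1 to 2. The function names are invented for illustration.

```python
def input_style(n: int) -> list[list[int]]:
    """Hypothetical "Input" shape: i from 0, j ascending over 2..i+2 inclusive."""
    visited = []
    for i in range(n):
        visited.append([j for j in range(2, i + 3)])
    return visited


def ground_truth_style(n: int) -> list[list[int]]:
    """Hypothetical "ground truth" shape: i from 1, j descending from i+1 to 2."""
    visited = []
    for i in range(1, n + 1):
        visited.append([j for j in range(i + 1, 1, -1)])
    return visited


# Same set of j values per outer iteration, but in opposite order.
for a, b in zip(input_style(5), ground_truth_style(5)):
    assert sorted(a) == sorted(b)
    assert a == b[::-1]
```

Under these assumptions both versions visit the same `j` values for each outer iteration, just in reverse order, so they only behave the same if the loop body does not depend on the order of `j`.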