i'm at a loss for words after reading a paper about reformatting code using an ML model that has a measured statistical quantity A_c which says how often the reformatted code behaves the same as the original

the "ideal" (their choice of words) case is 64.2%

edit: this got popular without me really intending to, so here's why i'm reading research: i want a semantic style transfer tool that can automatically format a patch "the same as the rest of the file / rest of codebase is formatted" without the rigidity involved in black or rustfmt that i find so hostile to my workflow that i refuse to use them. obviously, i want a tool that generates semantically equivalent code 100.0% of the time (ignoring source locations or reading from __file__)
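
for python, a minimal sketch of what "semantically equivalent, ignoring source locations" could be checked with: parse both versions and compare the ASTs without line/column info. this is only an illustration and not how the paper measures A_c; the name `same_ast` is made up here. equal ASTs are a sufficient condition for identical behaviour (short of code that inspects its own source), not a necessary one.

```python
import ast


def same_ast(original: str, reformatted: str) -> bool:
    """Return True if both sources parse to the same AST.

    ast.dump() leaves out lineno/col_offset by default, so pure layout
    changes (line breaks, redundant parentheses, spacing) compare equal.
    """
    return ast.dump(ast.parse(original)) == ast.dump(ast.parse(reformatted))


# layout-only change: accepted
print(same_ast("x = (1 +\n     2)\n", "x = 1 + 2\n"))
# behaviour change: rejected
print(same_ast("x = 1 + 2\n", "x = 1 - 2\n"))
```

a formatter (ML or not) could run a check like this on every output and fall back to the original text whenever it fails.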

@whitequark And this is how research money is lit on fire, I guess. Why else conduct research into ML for a task that has had obvious, deterministic, efficient and well-tested solutions for decades?
@lu_leipzig @whitequark i would honestly be more interested in a deterministic but very configurable formatter, plus an ml model that writes a config for you from sample code, so you just make minor adjustments to it. generally all code styles fit in just a few hundred switches
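
a rough sketch of the "write a config from sample code" half, for python. it swaps the ml model for plain token counting and only infers two switches; `infer_style` and the keys `indent_width` / `quote` are hypothetical and don't correspond to any real formatter's options.

```python
import io
import tokenize
from collections import Counter


def infer_style(sample: str) -> dict:
    """Guess a couple of style switches from sample Python source."""
    indent_steps = Counter()  # how many columns each new block adds
    quotes = Counter()        # which quote character plain strings use
    stack = [0]               # indentation widths of the enclosing blocks
    for tok in tokenize.generate_tokens(io.StringIO(sample).readline):
        if tok.type == tokenize.INDENT:
            width = len(tok.string.expandtabs(8))
            indent_steps[width - stack[-1]] += 1
            stack.append(width)
        elif tok.type == tokenize.DEDENT:
            stack.pop()
        elif tok.type == tokenize.STRING:
            body = tok.string.lstrip("rRbBuUfF")  # drop any string prefix
            if not body.startswith(('"""', "'''")):  # skip triple-quoted
                quotes[body[0]] += 1
    config = {}
    if indent_steps:
        config["indent_width"] = indent_steps.most_common(1)[0][0]
    if quotes:
        config["quote"] = quotes.most_common(1)[0][0]
    return config


sample = "def f(x):\n  if x:\n    return 'a'\n  return 'b'\n"
print(infer_style(sample))
```

the majority vote is the simplest possible choice; anything a counter can't decide is where a model (or a human making minor adjustments) would come in.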
@SRAZKVT @lu_leipzig this would be ~easy to do but convincing people to implement and maintain "a few hundred switches" has been incredibly difficult; my motivation is exactly that rustfmt maintainers have been consistently unwilling to entertain that
@SRAZKVT @lu_leipzig if every language i cared about (at this point: mainly rust, python, and c++) had highly configurable formatters i would not care to spend as much effort as i'm planning to on ml research

@whitequark @lu_leipzig most tooling devs today seem to believe in one-size-fits-all with no configurability, kind of sad

also i think the problem of "but what if every codebase isn't formatted exactly the same" is way overblown. once you start reading the code it really doesn't take long to adapt to a new style, barely a few minutes in my experience

@SRAZKVT @lu_leipzig there is a more real problem of "some people bounce off contributing if you ask them to fix style"
@whitequark @lu_leipzig yea, such as: the code being shit
@SRAZKVT @whitequark @lu_leipzig in general, I agree, but I almost wish I could have just told the software teams that I worked with a couple years ago “this is the style for this language, just deal with it” instead of having hours of meetings about clang-format settings.
@c0dec0dec0de @SRAZKVT @lu_leipzig I think it's different for corporate. I don't really care about most corporate code I touch (that isn't already OSS I maintain, that is), it's completely whatever. I care a lot about this in projects whose success I'm invested in
@whitequark @SRAZKVT @lu_leipzig I get that. At the end of it, I was just like “pick something, I don’t care.” This will make your code more readable regardless of what you pick and minimize diffs in some cases.