New blog post on the NeurIPS'21 experiment re authors' perceptions of their own papers!

https://blog.ml.cmu.edu/2022/11/22/neurips2021-author-perception-experiment/

Key findings:

1) Authors significantly overestimate their papers' chances of acceptance. By like a LOT. (A toy illustration of what such a gap looks like follows below.)

"How do Authors' Perceptions about their Papers Compare with Co-authors' Perceptions and Peer-review Decisions?"
Alina Beygelzimer, Yann N. Dauphin, Percy Liang, Jennifer Wortman Vaughan (NeurIPS 2021 Program Chairs), Charvi Rastogi, Ivan Stelmakh, Zhenyu Xue, Hal Daumé III, Emma Pierson, and Nihar B. Shah
Machine Learning Blog | ML@CMU | Carnegie Mellon University
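One way to make "overestimate" concrete: compare the mean self-reported acceptance probability with the realized acceptance rate. A toy illustration with entirely invented numbers (the study's actual figures are in the blog post):

```python
# Authors' self-reported P(accept) and actual decisions (1 = accepted).
# All values below are invented for illustration; see the post for real data.
predicted = [0.80, 0.60, 0.75, 0.50, 0.90, 0.40, 0.70, 0.65]
accepted  = [1,    0,    0,    0,    1,    0,    0,    1]

mean_predicted = sum(predicted) / len(predicted)
actual_rate = sum(accepted) / len(accepted)

print(f"mean predicted P(accept): {mean_predicted:.2f}")  # 0.66
print(f"actual acceptance rate:   {actual_rate:.2f}")     # 0.38
print(f"overconfidence gap:       {mean_predicted - actual_rate:+.2f}")
```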
@hal Alternative interpretation: above a relatively low threshold, acceptance is effectively randomized due to lack of space, leading to a loss of predictive power in that regime.

@ted_dunning I may be missing something (correct me!), but I think that in order to get that, most respondents would've had to interpret the question as being about the QUALITY of the paper, rather than its CHANCE of acceptance.

It's entirely possible that what you're saying is true - in which case, if one believed their paper was "good enough", they should have answered ~30% - but that's not what happened, which at least suggests people don't *think* that's the case.

@hal Who knows what lurks in the hearts of authors?

I have never been sure about how people truly interpret questions. My users have confounded me far too many times for me to have illusions that the question asked is the question answered.

@ted_dunning Yup, that's entirely possible. We hoped that giving them the past acceptance rate would help the interpretation, but it's definitely possible they misinterpreted.

If that's the case, there's still a big gap, because if we really believe everything over a threshold is random, then no one should be saying anything over, say, 50% - but clearly a lot of people are.
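A quick toy version of that "random above a threshold" model helps pin down the numbers. This is a sketch with invented parameters (ACCEPT_RATE and FRACTION_ABOVE_BAR are assumptions, not figures from the study): papers below a quality bar are always rejected, and the limited space is spread uniformly over the rest, so even an author who is certain their paper clears the bar should report only ACCEPT_RATE / FRACTION_ABOVE_BAR.

```python
import random

random.seed(0)

N_PAPERS = 10_000
ACCEPT_RATE = 0.26         # hypothetical overall acceptance rate
FRACTION_ABOVE_BAR = 0.50  # hypothetical fraction of submissions clearing the bar

# Under the threshold model, acceptance above the bar is a uniform lottery
# over the available space; below the bar it is impossible.
p_accept_given_above_bar = min(1.0, ACCEPT_RATE / FRACTION_ABOVE_BAR)

accepted = sum(
    1
    for _ in range(N_PAPERS)
    if random.random() < FRACTION_ABOVE_BAR         # clears the quality bar
    and random.random() < p_accept_given_above_bar  # wins the space lottery
)

print(f"P(accept | above bar) = {p_accept_given_above_bar:.2f}")  # 0.52
print(f"Overall acceptance rate ~= {accepted / N_PAPERS:.3f}")    # ~0.26
```

Under these made-up numbers, a perfectly calibrated author who is sure their paper is "good enough" would answer ~52%, and anyone less sure would answer below that - which is roughly the ceiling being discussed above.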

@hal There is definitely a gap, but my first interpretation is that people are complicated, and they assume that any questioner is complicated. And then they estimate what you really meant by your question in some complicated way, based on their estimate of your estimate of their mental state.

I still love y'all's work here, and the graph speaks volumes. It also makes me rethink what I think about publications. That's probably true of others as well.

@ted_dunning "people are complicated" --- something about truer words... :)

But yes, I agree there should be a lot of room for interpretation, given how these darned complicated people interpreted things :)

@hal My coming-of-age moment in this respect was when I was first analyzing people's behavior around music.

I found that if you compared how much of a song people let play before hitting skip against our estimate of how much they liked the song, the behavior was very non-intuitive.

Skipping after less than 15 seconds generally seemed to indicate radical dislike of an entire genre. Country music for a heavy metal fan. Or metal for a classical music listener.

1/2

@hal That made sense. People can determine rough genre in a few hundred milliseconds.

But people frequently skipped their absolute favorite songs after about 30-60 seconds had played. Quizzing users about this indicated that they weren't even quite aware of doing it, but it seemed that they knew the songs well enough that 30-60 seconds was enough to get the high.

This behavior had clear ramifications for building a recommender (see the sketch below).

And none of it much carried over to video watching.

2/2
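One way to read the recommender point concretely: turn each (seconds played, skipped or not) event into a signed implicit-feedback score. This is a hypothetical sketch; the function skip_signal, its thresholds, and its weights are invented to mirror the behavior described in the two posts above, not anyone's production logic.

```python
def skip_signal(seconds_played: float, skipped: bool) -> float:
    """Map one listening event to a preference score in [-1, 1] (toy values)."""
    if not skipped:
        return 0.5    # played to the end: mildly positive
    if seconds_played < 15:
        return -1.0   # near-instant skip: strong, likely genre-level dislike
    if 30 <= seconds_played <= 60:
        return 0.8    # skipping after the familiar opening: often a positive signal
    return -0.2       # other skips: weak negative evidence

# Example events: (seconds played, skipped?)
for secs, skipped in [(5, True), (45, True), (210, False)]:
    print(f"{secs:>3}s skipped={skipped}: {skip_signal(secs, skipped):+.1f}")
```

And per the last point above, a scorer like this would have to be re-fit per medium, since the song thresholds apparently don't transfer to video.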


@ted_dunning that’s an amazing example! it’s both surprising, and yet i can totally see how it’s true