Once again I am heartbroken to remind you that the Dunning-Kruger effect is probably not real:

https://www.mcgill.ca/oss/article/critical-thinking/dunning-kruger-effect-probably-not-real

Like Freudian psychology, Hardin's tragedy of the commons, and any number of other popular pseudoscientific narratives, it caters to our preconceptions and makes for entertaining, easy-to-retell stories, but it's also... not true.

And - again, I am entirely saddened by this - that means that if we keep using these metaphors we're legitimizing the false ideas behind them.

The Dunning-Kruger Effect Is Probably Not Real

I want the Dunning-Kruger effect to be real. First described in a seminal 1999 paper by David Dunning and Justin Kruger, this effect has been the darling of journalists who want to explain why dumb people don’t know they’re dumb. There’s even video of a fantastic pastiche of Turandot’s famous aria, Nessun dorma, explaining the Dunning-Kruger effect. “They don’t know,” the opera singer belts out at the climax, “that they don’t know.” I was planning on writing a very short article about the Dunning-Kruger effect and it felt like shooting fish in a barrel. Here’s the effect, how it was discovered, what it means. End of story.

But as I double-checked the academic literature, doubt started to creep in. While trying to understand the criticism that had been leveled at the original study, I fell down a rabbit hole, spoke to a few statistics-minded people, corresponded with Dr. Dunning himself, and tried to understand if our brain really was biased to overstate our competence in activities at which we suck... or if the celebrated effect was just a mirage brought about by the peculiar way in which we can play with numbers. Have we been overstating our confidence in the Dunning-Kruger effect?

A misunderstood effect

The most important mistake people make about the Dunning-Kruger effect, according to Dr. Dunning, has to do with who falls victim to it. “The effect is about us, not them,” he wrote to me. “The lesson of the effect was always about how we should be humble and cautious about ourselves.” The Dunning-Kruger effect is not about dumb people. It’s mostly about all of us when it comes to things we are not very competent at.

In a nutshell, the Dunning-Kruger effect was originally defined as a bias in our thinking. If I am terrible at English grammar and am told to answer a quiz testing my knowledge of English grammar, this bias in my thinking would lead me, according to the theory, to believe I would get a higher score than I actually would.
And if I excel at English grammar, the effect dictates I would be likely to slightly underestimate how well I would do. I might predict I would get a 70% score while my actual score would be 90%. But if my actual score was 15% (because I’m terrible at grammar), I might think more highly of myself and predict a score of 60%. This discrepancy is the effect, and it is thought to be due to a specific problem with our brain’s ability to assess its skills.

This is what student participants went through for Dunning and Kruger’s research project in the late 1990s. There were assessments of grammar, of humour, and of logical reasoning. Everyone was asked how well they thought they did and everyone was also graded objectively, and the two were compared. Since then, many studies have been done that have reported this effect in other domains of knowledge.

Dr. Dunning tells me he believes the effect “has more to do with being misinformed rather than uninformed.” If I am asked the boiling point of mercury, it is clear my brain does not hold the answer. But if I am asked what is the capital of Scotland, I may think I know enough to say Glasgow, but it turns out it’s Edinburgh. That’s misinformation and it’s pushing down on that confidence button in my brain.

So case closed, right? On the contrary. In 2016 and 2017, two papers were published in a mathematics journal called Numeracy. In them, the authors argued that the Dunning-Kruger effect was a mirage. And I tend to agree.

The effect is in the noise

The two papers, by Dr. Ed Nuhfer and colleagues, argued that the Dunning-Kruger effect could be replicated by using random data. “We all then believed the [1999] paper was valid,” Dr. Nuhfer told me via email. “The reasoning and argument just made so much sense. We never set out to disprove it; we were even fans of that paper.” In Dr. Nuhfer’s own papers, which used both computer-generated data and results from actual people undergoing a science literacy test, his team disproved the claim that most people that are unskilled are unaware of it (“a small number are: we saw about 5-6% that fit that in our data”) and instead showed that both experts and novices underestimate and overestimate their skills with the same frequency. “It’s just that experts do that over a narrower range,” he wrote to me.

Wrapping my brain around all this took weeks. I recruited a husband-and-wife team, Dr. Patrick E. McKnight (from the Department of Psychology at George Mason University, also on the advisory board of Sense About Science and STATS.org) and Dr. Simone C. McKnight (from Global Systems Technologies, Inc.), to help me understand what was going on. Patrick McKnight not only believed in the existence of the Dunning-Kruger effect: he was teaching it to warn his students to be mindful of what they actually knew versus what they thought they knew. But after replicating Dr. Nuhfer’s findings using a different platform (the statistical computing language R instead of Nuhfer’s Microsoft Excel), he became convinced the effect was just an artefact of how the thing that was being measured was indeed measured.

We had long conversations over this as I kept pushing back. As a skeptic, I am easily enticed by stories of the sort “everything you know about this is wrong.” That’s my bias. To overcome it, I kept playing devil’s advocate with the McKnights to make sure we were not forgetting something. Every time I felt my understanding crystallize, doubt would creep in the next day and my discussion with the McKnights would resume. I finally reached a point where I was fairly certain the Dunning-Kruger effect had not been shown to be a bias in our thinking but was just an artefact. Here then is the simplest explanation I have for why the effect appears to be real.
For an effect of human psychology to be real, it cannot be rigorously replicated using random noise. If the human brain was predisposed to choose heads when a coin is flipped, you could compare this to random predictions (heads or tails) made by a computer and see the bias. A human would call more heads than the computer would because the computer is making random bets whereas the human is biased toward heads. With the Dunning-Kruger effect, this is not the case. Random data actually mimics the effect really well.

The effect as originally described in 1999 makes use of a very peculiar type of graph. “This graph, to my knowledge, is quite unusual for most areas of science,” Patrick McKnight told me. In the original experiment, students took a test and were asked to guess their score. Therefore, each student had two data points: the score they thought they got (self-assessment) and the score they actually got (performance). In order to visualize these results, Dunning and Kruger separated everybody into quartiles: those who performed in the bottom 25%, those who scored in the top 25%, and the two quartiles in the middle. For each quartile, the average performance score and the average self-assessed score was plotted. This resulted in the famous Dunning-Kruger graph.

Plotted this way, it looks like those in the bottom 25% thought they did much better than they did, and those in the top 25% underestimated their performance. This observation was thought to be due to the human brain: the unskilled are unaware of it. But if we remove the human brain from the equation, we get this:

The above Dunning-Kruger graph was created by Patrick McKnight using computer-generated results for both self-assessment and performance. The numbers were random. There was no bias in the coding that would lead these fictitious students to guess they had done really well when their actual score was very low.
And yet we can see that the two lines look eerily similar to those of Dunning and Kruger’s seminal experiment. A similar simulation was done by Dr. Phillip Ackerman and colleagues three years after the original Dunning-Kruger paper, and the results were similar.

Measuring someone’s perception of anything, including their own skills, is fraught with difficulties. How well I think I did on my test today could change if the whole thing was done tomorrow, when my mood might differ and my self-confidence may waver. This measurement of self-assessment is thus, to a degree, unreliable. This unreliability, sometimes massive, sometimes not, means that any true psychological effect that does exist will be measured as smaller in the context of an experiment. This is called attenuation due to unreliability. “Scores of books, articles, and chapters highlight the problem with measurement error and attenuated effects,” Patrick McKnight wrote to me. In his simulation with random measurements, the so-called Dunning-Kruger effect actually becomes more visible as the measurement error increases. “We have no instance in the history of scientific discovery,” he continued, “where a finding improves by increasing measurement error. None.”

Breaking the spell

When I plug “Dunning-Kruger effect” into Google News, I get over 8,500 hits from media outlets like The New York Times, New Scientist, and the CBC. So many simply endorse the effect as a real bias of the brain, so it’s no wonder that people are not aware of the academic criticism that has existed since the effect was first published. It’s not just Dr. Nuhfer and his Numeracy papers. Other academic critics have pointed the finger, for example, at regression to the mean. But as Patrick McKnight points out, regression to the mean occurs when the same measure is taken over time and we track its evolution.
If I take my temperature every morning and one day spike a fever, that same measure will (hopefully) go down the next day and return to its mean value as my fever abates. That’s regression to the mean. But in the context of the Dunning-Kruger effect, nothing is measured over time, and self-assessment and performance are different measures entirely, so regression to the mean should not apply. The unreliability of the self-assessment measurement itself, however, is a strong contender to explain a good chunk of what Dunning, Kruger, and other scientists who have since reported this effect in other contexts were actually describing.

This story is not over. There will undoubtedly be more ink spilled in academic journals over this issue, which is a healthy part of scientific research after all. Studying protons and electrons is relatively easy as these particles don’t have a mind of their own; studying human psychology, by comparison, is much harder because the number of variables being juggled is incredibly high. It is thus really easy for findings in psychology to appear real when they are not.

Are there dumb people who do not realize they are dumb? Sure, but that was never what the Dunning-Kruger effect was about. Are there people who are very confident and arrogant in their ignorance? Absolutely, but here too, Dunning and Kruger did not measure confidence or arrogance back in 1999. There are other effects known to psychologists, like the overconfidence bias and the better-than-average bias (where most car drivers believe themselves to be well above average, which makes no mathematical sense), so if the Dunning-Kruger effect is convincingly shown to be nothing but a mirage, it does not mean the human brain is spotless. And if researchers continue to believe in the effect in the face of weighty criticism, this is not a paradoxical example of the Dunning-Kruger effect. In the original classic experiments, students received no feedback when making their self-assessment.
It is fair to say researchers are in a different position now.

The words “Dunning-Kruger effect” have been wielded as an incantation by journalists and skeptics alike for years to explain away stupidity and incompetence. It may be time to break that spell.

Take-home message:
- The Dunning-Kruger effect was originally described in 1999 as the observation that people who are terrible at a particular task think they are much better than they are, while people who are very good at it tend to underestimate their competence
- The Dunning-Kruger effect was never about “dumb people not knowing they are dumb” or about “ignorant people being very arrogant and confident in their lack of knowledge.”
- Because the effect can be seen in random, computer-generated data, it may not be a real flaw in our thinking and thus may not really exist

@CrackedScience
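The random-data simulation the article describes is easy to reproduce. Here is a minimal sketch in Python with NumPy; the sample size, seed, and uniform distributions are illustrative assumptions, not McKnight's actual code:

```python
import numpy as np

rng = np.random.default_rng(42)
n = 1000

# Two INDEPENDENT random variables: no bias links them.
performance = rng.uniform(0, 100, n)      # actual test score
self_assessment = rng.uniform(0, 100, n)  # guessed score

# Bin students into quartiles by actual performance,
# exactly as the famous 1999 graph does.
edges = np.percentile(performance, [25, 50, 75])
quartile = np.digitize(performance, edges)

for q in range(4):
    mask = quartile == q
    print(f"Quartile {q + 1}: "
          f"actual {performance[mask].mean():5.1f}, "
          f"self-assessed {self_assessment[mask].mean():5.1f}")
```

Because self-assessment averages around 50 in every quartile while actual scores average roughly 12, 37, 62, and 87, the bottom quartile appears to wildly overestimate itself and the top quartile to underestimate itself, the familiar shape of the Dunning-Kruger graph, with no psychology in the data at all.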

Office for Science and Society

Having said that, it is always informative to look carefully at who benefits from the attitudes - and policies - that emerge from these false stories.

Garrett Hardin of "The Tragedy Of The Commons" fame, for example, was a hardcore white nationalist, nativist and eugenicist. Stockholm Syndrome was invented out of whole cloth to cover up police brutality and incompetence.

The list of empirically false but fun-to-retell stories that are also powerful as authoritarian propaganda is pretty long.

It's a very hard pill to swallow, especially given the panoptic, reflexive cruelty of this grudgefuck of a zeitgeist we're all presently stewing in and how easy it is to hit that boost or reply button, but one of the awful facts about high-semiotic-density memetic culture is that you might very easily be amplifying - and legitimizing - ideological positions you _don't even realize exist_ through the wonders of near-zero-friction and pushback-free participation.

@mhoye

Memetics is something that the likes of Peter Thiel and the conman culture of Silicon Valley cling to. It has some traction because marketing works. The biggest mistake, though, I think, is that the con only works so long as people trust the source, and when that trust collapses all hell breaks loose

@GhostOnTheHalfShell @mhoye the collapse of trust in those circumstances just means the damage is already done and truth or our reliance on it is dead. They accomplished what they set out to do, for the detriment of society.
@mhoye i would like to amplify this position but am insufficiently sure now what ideological positions it causes me to champion

@mhoye Mike Godwin's recent piece addresses overlapping ideas: https://mikegodwin.substack.com/p/from-a-law-to-an-ethic

(With apologies for the Substack link.)

From a Law to an Ethic

Introducing Godwin's Ethic - a framework for individual digital responsibility

Mike Godwin

@mhoye This sounds of a piece with Film Crit Hulk's famous despairing analysis:

"WHAT HAS ESSENTIALLY HAPPENED IS THAT WE HAVE TAKEN A CULT BEHAVIORAL APPROACH TO DISCUSSION AND PHILOSOPHY - NORMALLY A REALLY DIFFICULT THING TO INSTILL INTO PEOPLE AND REQUIRES ISOLATION, DIRECT PROGRAMMING AND FULL-ON CULTURAL SEPARATION - AND TURNED IT INTO SOMETHING THAT HAS BEEN CASUALLY LEARNED ON THE INTERNET'S PROVERBIAL STREETS"

https://web.archive.org/web/20230603022513/https://birthmoviesdeath.com/2014/10/27/film-crit-hulk-smash-on-despair-gamergate-and-quitting-the-hulk

Film Crit Hulk Smash: ON DESPAIR, GAMERGATE AND QUITTING THE HULK

Hulk attempts to make sense of the maelstrom. 

Birth.Movies.Death.
@mhoye where can I read more about them? Seemingly I've been lagging behind on these concepts.
@mhoye I had to reread this several times before I fully understood it, but the word choice is so delicious that I didn't mind.

@mhoye More than a few of the most popular examples of studies on human behavior from Kahneman's Thinking Fast and Slow have really not aged well either. Quite a cringe read for people who try to keep up with the reproducibility crisis.

Unfortunately they have long since been perpetuated as just-so-stories in business popsci and management nightstand literature, so they are now irrevocably canon.

@mhoye
"Stockholm Syndrome" might not exist but some people definitely internalize and identify with abusive authoritarian structures

it's the basis of liberalism/capitalism

@johnbrowntypeface @mhoye

Yes. And more generally, these ideas become popular because the idea, or more often some mutation of it, describes something that people experience in their lives

Lord of the Flies is another good example. Real life experiences show that people marooned together on an island cooperate and help each other. At least, if they are from a culture that encourages that

But the book says something about a different culture, the culture in which it was written

Stockholm syndrome might be BS regarding a particular bank heist. But the term is now used to describe a very real phenomenon in kidnappings and abuse, a phenomenon that promotes survival. The bonding is two way, and makes it more likely that the assailant will back off at a crucial moment and allow survival or escape. (The bonding is also extremely traumatic in its after-effects, for the survivor)

The tragedy of the commons is something that happens every day in a capitalist individualist culture. But it doesn't have to be that way

And dunning-kruger gives people a name for something they see constantly: people in managerial roles who don't have the awareness or capabilities needed, who have been promoted into those positions for incomprehensible reasons

@mhoye the concepts of the ‘tragedy of the commons’ predate him. He was restating them to push his own agenda.

That said, the idea that people can be motivated to over-deplete a shared resource to the impoverishment of all is replicated throughout the historical record and the present.

I agree that there are severe misunderstandings about ‘the commons’ etc… in hardin’s work and that we should be careful with that shit.

@subterfugue
It seems to feed off Hobbes, who was also full of it. Clumsy rationalization to justify state violence.
@mhoye
@hackersquirrel @mhoye hobbes is an entirely different thing born out of religious wars in Europe
@subterfugue
True. But Leviathan still shows a bias towards the same 'tragedy of the commons' mentality.
@mhoye
@hackersquirrel @mhoye what do you mean specifically?
@subterfugue
His need to impose and justify hierarchy. He specifically calls out anarchy.
Quote: Anarchy, (which signifies want of Government;)
@mhoye

@hackersquirrel @mhoye how is that connected to concerns about depletion of shared resources?

I am lost here

@subterfugue
When I think of 'The Tragedy of the Commons', I seem to recall it justifying the power of a state in order to control ownership. My total understanding may be imperfect.
@mhoye
@hackersquirrel @mhoye you mean that Hardin’s work is influenced by Hobbes then.
@subterfugue
Unfortunately I only know Hardin through second hand opinions. I would need to read his original work before I could make an informed opinion.
@mhoye

@hackersquirrel @mhoye the most damaging part of his work in my opinion is his mischaracterization of the ‘commons’ as ‘virgin’ territory or resources.

That assumption has been hugely destructive.

@subterfugue
I see what you mean. I think about the commons as it relates to things like clean air and water, when industries pollute these resources for free. The illusion of externalities.
@mhoye
@hackersquirrel @mhoye i think that his work contributed to that while providing cover with the very sensible wisdom that people often deplete shared resources. The rationale of his work was used to justify removing people and imposing ‘regulation’ that favored capital
@mhoye If there were a book that featured a deep dive into a bunch of these examples, and explored this thesis generally, I’d read that book

@mhoye

Hardin was a white supremacist, and he overweighted his opinion, but tragedy of the commons is a real thing.

I have a similar opinion about Stockholm Syndrome. Just because things aren't as simple as they are in textbooks doesn't mean that there isn't more than "pseudoscience" on the topic.

@iju Ostrom's argument was that aboriginal communities had successfully managed communal resources for centuries, and her famous retort to Hardin was that if something exists in nature, it should exist in theory.

@mhoye

There are many famous examples of an exhaustion of a commons.

I'm not sure if Hardin said that exhaustion ALWAYS happens: if he did, that was foolish, as examples of commons managed over centuries existed, and still exist.

Generally speaking, I would say "tragedy of the commons" as a phrase refers to a certain situation, and the literature on where and how that situation arises (and how it can be avoided) is clear.

@mhoye I’ve started to wonder if “mob mentality” is just a euphemism for “white people being racist”
@mhoye Ironically, Dunning and Kruger thought they understood statistics much better than they actually did.
@mhoye First it was Stockholm Syndrome, now this. Can't a popular trope catch a break?

@zebulonmysterioso @mhoye

Re: Stockholm Syndrome ...not entirely true.

Just because there aren't formal studies for a psychological phenomenon (esp. a commonly observed one, e.g. as with this in the realm of domestic violence, or algorithmic social media addiction, which was originally pooh-poohed in this very way) doesn't mean that phenomenon doesn't exist.

https://en.wikipedia.org/wiki/Stockholm_syndrome

Stockholm syndrome - Wikipedia

@Mark_Harbinger @zebulonmysterioso @mhoye Ehhh there is reason to think such a phenomenon is *possible* and *could explain something* but it was literally invented for nefarious and fraudulent reasons. Just because some version of it *could be meaningful* doesn’t warrant whitewashing its origin.

@Moss @zebulonmysterioso @mhoye

<insert generic noise of contempt here> Well, ...*originally*... Stockholm Syndrome was attributed to Swedish victims of bank robbers (Pro Tip: sometimes, if you read the link, you learn things).

So, how does one even "whitewash" that? You can't get much whiter than Sweden!
😂

p.s. I sincerely hope you're a bot.

@Mark_Harbinger @Moss @mhoye Dude, just read the Wiki article. The syndrome definition was supposition by an armchair pundit based on zero interaction with those involved, all of whom describe a very different dynamic. So yeah, it's a solution looking for a problem.

@zebulonmysterioso @Moss @mhoye

"Stockholm Syndrome" is best understood as a common parlance description of trauma-bonding—which is both very real and very well-proven, empirically.

Beyond that, I have no earthly idea what the hell you guys are talking about (Whitewashing, armchair pundits, etc.).

< Filing this convo under "Fun with Bots" and moving on... >

@Mark_Harbinger @zebulonmysterioso @Moss Anyone I Can’t Understand Is A Bot is a hell of an epistemic, I have to admit. Posts with big words? Bots. Terminology from unfamiliar domains? Bots. The French? That’s right, all of them.
@Mark_Harbinger You literally did not read that Wikipedia article, did you. Yet you think others should “read the link and learn things.” How embarrassing for you.
@mhoye I feel kinda bad about cargo-culting the Dunning-Kruger effect.
@tedmielczarek I have terrible news.
@mhoye yeah, I know, I just couldn't figure out how to annotate that as a joke. 🤷‍♂️

@mhoye

This reminds me of another article called the LLM mentalist. It goes on to describe how highly educated people are often the most steadfast believers of a con because of their own education.

The education, in fact, renders them so self-confident that they never question what they are seeing or hearing, and they become quite effective at convincing themselves of the con

@mhoye good. so i am not crazy. bc i have heard the most educated of people say the most unimaginably silly old shit on their subject matter with the utmost confidence and it honestly defied any explanation.
@mhoye it is almost as if at times we just don't know and people are talking off their ass no matter their education..
@kali @mhoye and the congregation said AMEN

@mhoye

The comments on that article require Facebook, so it seems asking the author directly isn't a thing, but how does showing that with people the crossover point between over- and underestimating ability sits at about the 75th percentile (the top of the 3rd quartile), while with "random data" it sits at the 50th, disprove anything?
That the random data crosses over at 50% for half of those tested would seem to confirm that when you test real people and the crossover is not at 50%, an effect is being seen.

@mhoye

There’s an eerie echo of this topic in G. Soros’ writing on economics. He is an exquisitely concise and clear writer, so it’s also a pleasure to read this article. It’s some kind of serendipity here. The parallels are quite compelling.

https://www.georgesoros.com/2014/01/13/fallibility-reflexivity-and-the-human-uncertainty-principle-2/

Fallibility, Reflexivity, and the Human Uncertainty Principle

The Journal of Economic Methodology, the leading peer-reviewed journal on the philosophical foundations and methodological practice of economics, has published a special issue devoted to George Soros’s theory of reflexivity. The issue contains a new article by Mr. Soros articulating his most recent thinking on reflexivity and fallibility, the role of those concepts in social science, and their contribution to events such as the 2008 financial crisis and euro crisis. The issue also contains contributions, responses and critiques from 18 leading scholars in economics and the history and philosophy of science.

George Soros

@mhoye But if nobody's actually conceptualizing the effect as it was originally intended, and we all share a common misconception, then what this guy researched wasn't the "Dunning-Kruger" effect, but the Dunning-Kruger effect--which the article explains is something entirely different.

So nothing is proved here, thank you very much. I can continue to use it as it was intended: an ad hominem. If people can't see through that then they almost certainly don't know exactly HOW full of shit I am.

@mhoye oh, the paper has been demonstrated to be false already. It's not real and the correlation was non-existent.
The Dunning-Kruger Effect is Autocorrelation – Economics from the Top Down

Do unskilled people actually underestimate their incompetence?

Economics from the Top Down

@portaloffreedom @mhoye

I found that to be a much better explanation than the mcgill.ca article which vaguely speaks of "computer-generated data" and does not even include the term autocorrelation. Thanks for the link!
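The autocorrelation argument is compact enough to sketch in a few lines. If you correlate a random variable x with the difference (y − x), where x and y are completely independent, a strong negative correlation appears by construction. A minimal Python illustration (the sample size and seed are arbitrary assumptions for the sketch):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

score = rng.uniform(0, 100, n)            # actual test score
self_assessment = rng.uniform(0, 100, n)  # independent of score

# "Overconfidence" is defined in terms of score itself...
overconfidence = self_assessment - score

# ...so correlating it with score partly correlates score with -score.
r = np.corrcoef(score, overconfidence)[0, 1]
print(f"r = {r:.2f}")  # close to -1/sqrt(2), about -0.71
```

Nothing psychological generates that correlation: for independent variables of equal variance the expected value is exactly −1/√2. It falls out of comparing a noisy quantity against itself.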

@eloquence @portaloffreedom @mhoye
I agree, and the author is here on Mastodon!
@blair_fix

@mhoye
My •extremely• limited understanding of the “it’s just noise” argument is that there •may• be room for a gentler conclusion from the data:

Everyone’s self-assessment is inaccurate in •both• directions (over- and under-estimating), but experts may be slightly less inaccurate. We don’t have evidence that ignorance comes with •bias•, but it might come with greater •noise•.

@inthehands Except that if you really are a top notch expert, there is not much room to over-estimate your ability.
The experts get that true self-assessment for free, because it is mathematically impossible to over-estimate by very much. @mhoye

@hakona @mhoye
You’ve half got the argument, half missed it.

Yes, as the experiment is set up, experts don’t have much room to overestimate — and beginners don’t have much room to underestimate. Thus even if there is uniform inaccuracy (“noise”) across the whole ability spectrum, beginners will tend to overestimate and experts will tend to underestimate. This is exactly the “it’s just noise” argument, and the whole point of the article linked in the OP.

What you’re missing is that experts do not in fact “get true self-assessment for free,” because they •could• also underestimate themselves — and they do, but (it seems, maybe) by less than beginners overestimate themselves. That conclusion, if it holds under scrutiny, is still an interesting one and not a statistical given.
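The boundary effect described above can be made concrete with a toy simulation: give everyone the same symmetric, unbiased misjudgement, but clip estimates to the 0–100 scale. The noise level and distributions here are arbitrary assumptions chosen only to illustrate the mechanism:

```python
import numpy as np

rng = np.random.default_rng(7)
n = 10_000

skill = rng.uniform(0, 100, n)   # true ability
noise = rng.normal(0, 20, n)     # unbiased, identical for everyone

# Self-assessments cannot fall outside the 0-100 scale.
self_est = np.clip(skill + noise, 0, 100)
gap = self_est - skill           # positive = overestimation

print(f"beginners (skill < 25): mean gap {gap[skill < 25].mean():+.1f}")
print(f"experts   (skill > 75): mean gap {gap[skill > 75].mean():+.1f}")
```

Even though the noise is identical everywhere, beginners come out overestimating on average and experts underestimating, purely because the scale has edges: low scorers have little room to undershoot, high scorers little room to overshoot.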

@inthehands @mhoye Unless I misunderstand, that’s included in the article:

'instead showed that both experts and novices underestimate and overestimate their skills with the same frequency. “It’s just that experts do that over a narrower range,” he wrote to me.’