New! A Look At Benford's Law
Benford's Law is an interesting heuristic in data analysis. It states that in any large collection of numbers that are created naturally, you should expect to see numbers starting with the number 1 about 30% of the time.
In this article we will look at how to calculate Benford's Law and then apply the law to a series of data sets to see if we can spot any issues.
https://www.hashbangcode.com/article/look-benfords-law
#benfordsLaw #php #hashbangcode
Benford's Law is an interesting heuristic in data analysis. It states that in any large collection of numbers that are created naturally, you should expect to see numbers starting with the number 1 about 30% of the time. The expected frequency distribution has numbers starting with 2 appearing about 17% of the time, down to numbers starting with 9 being seen just 5% of the time.
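The percentages quoted above come from the standard first-digit formula P(d) = log10(1 + 1/d). A minimal Python sketch of that calculation follows; the function name `benford_expected` is made up for illustration and is not taken from the linked article.

```python
import math


def benford_expected(digit: int) -> float:
    """Expected share of values whose first digit is `digit` (1 to 9)."""
    if not 1 <= digit <= 9:
        raise ValueError("leading digit must be between 1 and 9")
    # Benford's Law: P(d) = log10(1 + 1/d)
    return math.log10(1 + 1 / digit)


# Expected share for every possible leading digit.
expected = {d: benford_expected(d) for d in range(1, 10)}

for d, p in expected.items():
    print(f"{d}: {p:.1%}")
```

The nine shares sum to 1, with 1 leading roughly 30% of the time and 9 a little under 5%.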
Hot off the press (well, Substack) - Benford’s Law in Python
https://codedrome.substack.com/p/benfords-law-in-python
Benford's Law describes the distribution of the first digits of most sets of numeric data and in this article I explain the principles and implement a demonstration in Python.
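As a rough illustration of what "the distribution of the first digits" means in practice, here is a small Python sketch that tallies leading digits for a sample. It is not the code from the linked article; the helper names and the choice of powers of 2 as sample data are assumptions made for the example.

```python
from collections import Counter


def leading_digit(value) -> int:
    """Return the first non-zero digit of a non-zero, finite number."""
    # Strip the sign, then any leading zeros and decimal point:
    # 0.0042 -> '42', 123 -> '123', 4.2e-05 -> '4.2e-05'
    text = str(abs(value)).lstrip("0.")
    return int(text[0])


def first_digit_shares(values) -> dict:
    """Observed share of each leading digit (1 to 9), ignoring zeros."""
    digits = [leading_digit(v) for v in values if v != 0]
    counts = Counter(digits)
    total = len(digits)
    return {d: counts.get(d, 0) / total for d in range(1, 10)}


# Assumed sample: powers of 2, which span many orders of magnitude.
sample = [2 ** n for n in range(1, 201)]
shares = first_digit_shares(sample)

for d, share in shares.items():
    print(f"{d}: {share:.1%}")
```

For this sample the share of values starting with 1 comes out close to the 30% the law predicts.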
#statistics #benfordslaw #datascience #programming #python #pythonprogram
New study: "[Impact factor stats for] #OpenAccess journals adhere to #BenfordsLaw more closely than subscribed journals."
https://ieeexplore.ieee.org/abstract/document/10753145
(#paywalled)
Benford's law is weird & fascinating.
https://en.wikipedia.org/wiki/Benford%27s_law
I have no idea what to make of its appearance here.
The authors' take: "The rate of increase in [non-OA] journals is much higher than that of OA journals…[&] violations of Benford's Law are more frequent in [non-OA] journals compared to open [OA] journals."
Can We Mathematically Spot Possible Manipulation of Results in Research Manuscripts Using Benford's Law?
https://arxiv.org/abs/2307.01742
Reproducibility of academic research is a persistent issue ... more concerning is the increasing number of false claims found in academic manuscripts recently
Our analysis predicted a 3% occurrence of result manipulation with a 96% confidence level. We find disturbing inconsistencies in recent studies & offer a semi-automatic method for their detection.
The reproducibility of academic research has long been a persistent issue, contradicting one of the fundamental principles of science. What is even more concerning is the increasing number of false claims found in academic manuscripts recently, casting doubt on the validity of reported results. In this paper, we utilize an adaptive version of Benford's law, a statistical phenomenon that describes the distribution of leading digits in naturally occurring datasets, to identify potential manipulation of results in research manuscripts, solely using the aggregated data presented in those manuscripts. Our methodology applies the principles of Benford's law to commonly employed analyses in academic manuscripts, thus, reducing the need for the raw data itself. To validate our approach, we employed 100 open-source datasets and successfully predicted 79% of them accurately using our rules. Additionally, we analyzed 100 manuscripts published in the last two years across ten prominent economic journals, with ten manuscripts randomly sampled from each journal. Our analysis predicted a 3% occurrence of result manipulation with a 96% confidence level. Our findings uncover disturbing inconsistencies in recent studies and offer a semi-automatic method for their detection.
Great #Excel calcChain #forensics by @datacolada
#Harvard professor who studies honesty accused of #falsifying #data in #studies
https://amp.theguardian.com/education/2023/jun/25/harvard-professor-data-fraud
The original blog post from 2021 http://datacolada.org/98 is followed by a 4-part blog
https://datacolada.org/109
#excel #msexcel #dataquality #fraud #statistics #probability #benfordslaw #inference #patterns #analysis #audit #behavioural #behavioral #science #honesty #research #review #francescagino
#eusprig #spreadsheet #risk
https://en.wikipedia.org/wiki/Benford%27s_law
Still questioning #electionIntegrity around #election2020 or even #election2022? #Antiscience Hooey.
To all at #FoxNews and all the other Trumpian #socialmedia acolytes still using doubt and fear for political or financial advantage, while pushing us toward civil war: show us the #dataforensics #electionforensics #datascience or stfu.
Here's a blurb on #electiondata from Wikipedia on #BenfordsLaw to tell you where you need to look.
#GeorgeSantos: I ate at the Italian restaurant where the #Republican congressman often spends exactly $199.99.
https://slate.com/news-and-politics/2023/01/george-santos-il-bacco-campaign-spending-new-york.html
See also #BenfordsLaw https://en.wikipedia.org/wiki/Benford%27s_law
Benford's Law turns out to be a poor test for election fraud
"Why do Biden's votes not follow Benford's Law?"
https://youtube.com/watch?v=etx0k1nLn78
The distribution of values fails to meet the multiple orders of magnitude requirement for applying Benford's Law.
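One way to sanity-check the orders-of-magnitude requirement before reaching for Benford's Law is to measure how many powers of ten a data set spans. The Python sketch below does that; the function name and both sample lists are hypothetical, invented for illustration rather than taken from real election returns.

```python
import math


def orders_of_magnitude(values) -> float:
    """Number of powers of ten between the smallest and largest non-zero magnitude."""
    magnitudes = [abs(v) for v in values if v != 0]
    if not magnitudes:
        raise ValueError("need at least one non-zero value")
    return math.log10(max(magnitudes) / min(magnitudes))


# Hypothetical counts from similarly sized precincts: a narrow range.
narrow = [420, 515, 610, 780, 905]
# Hypothetical values spread across several powers of ten.
wide = [3, 47, 520, 6800, 91000]

print(f"narrow span: {orders_of_magnitude(narrow):.2f} orders of magnitude")
print(f"wide span:   {orders_of_magnitude(wide):.2f} orders of magnitude")
```

Values that all sit within a single order of magnitude, like the narrow list, give the first digit little room to vary, so a mismatch with Benford's Law there says nothing about manipulation.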
"Benford's Law and the Detection of Election Fraud"
Abstract: The proliferation of elections in even those states that are arguably anything but democratic has given rise to a focused interest on developing methods for detecting fraud in the official statistics of a state's election returns. Among these efforts are those that employ Benford's Law, with the most common application being an attempt to proclaim some election or another fraud free or replete with fraud. This essay, however, argues that, despite its apparent utility in looking at other phenomena, Benford's Law is problematical at best as a forensic tool when applied to elections....