4 Followers
5 Following
39 Posts
statistics
metrology
chemistry
@steven Certainly you need an effect size or interval to check against. On point 2, if your primary purpose is demonstration of equivalence (equivalence of two measurement methods is a typical case where I come from) there’s no reason not to do the equivalence test first.
Sample size: you’re looking at the same distributional assumptions with the same underlying SD, but the critical values differ; I haven’t compared power myself. Worth checking.
@steven Finally, there are Bayesian approaches to this that would give you a Bayes factor or posterior probability for both the null and the alternative, whichever way you set them up (including either a zero-effect or a narrow-interval hypothesis). You’d just need prior probabilities for the (no-effect) null and the alternative, plus a likelihood. If the posterior probability for the zero/small-effect hypothesis is large, you consider it the more likely.
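A minimal sketch of that idea, with made-up numbers throughout: the observed difference, its standard error, the prior SD `tau` for the effect under the alternative, and the equal prior odds are all assumptions you would choose for your own problem.

```python
import numpy as np
from scipy.stats import norm

# Hypothetical setup: an observed mean difference with a known standard error.
# H0: true difference = 0; H1: true difference ~ Normal(0, tau^2).
xbar, se = 0.05, 0.10   # observed difference and its standard error (made-up)
tau = 0.50              # assumed prior SD for the effect under H1
prior_h0 = 0.5          # assumed prior probability of the null

# Marginal likelihood of the data under each hypothesis
like_h0 = norm.pdf(xbar, loc=0.0, scale=se)
like_h1 = norm.pdf(xbar, loc=0.0, scale=np.sqrt(se**2 + tau**2))

# Posterior probability of the null via Bayes' rule, and the Bayes factor
post_h0 = prior_h0 * like_h0 / (prior_h0 * like_h0 + (1 - prior_h0) * like_h1)
bayes_factor_01 = like_h0 / like_h1
print(post_h0, bayes_factor_01)
```

With these numbers the data sit close to zero, so the posterior favours the null; how strongly depends heavily on the choice of `tau`, which is the usual sensitivity worry with this kind of setup.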
@steven There’s a caveat; the interval has to be a reasonably sensible size compared to your result distribution. If not, there’s never a sufficient probability of your result landing inside the interval by chance to give a p-value under 5% (or whatever you want).
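To make that concrete, here’s a quick check (hypothetical numbers) of the best case for a one-sample equivalence test: even if the observed difference lands exactly at zero, the larger one-sided p-value is the upper tail of a t distribution at margin/SE, which can’t get under 5% when the margin is small relative to the standard error.

```python
from math import sqrt
from scipy.stats import t

# Hypothetical one-sample setting: n observations with sample SD s
n, s = 20, 1.0
se = s / sqrt(n)
df = n - 1

# Best case: the observed mean difference is exactly 0.
# The overall p-value is then t.sf(margin / se, df).
for margin in (0.1, 0.5):
    p_best = t.sf(margin / se, df)
    print(f"margin={margin}: smallest achievable p = {p_best:.3f}")
```

With a margin of 0.1 the smallest achievable p-value is well above 0.05, so the test can never declare equivalence no matter what the data do; widening the margin to 0.5 fixes that.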

@steven You set up a null hypothesis that a given difference is greater than some chosen value and test it against the alternative that it isn’t.

This is sometimes called ‘equivalence testing’, and a common implementation is the ‘two one-sided tests’ (TOST) procedure, so called because you’re effectively testing against both ends of an interval.
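A minimal one-sample TOST sketch, with made-up data and a hypothetical equivalence margin of ±0.5; rejecting both one-sided nulls supports equivalence, and the overall p-value is the larger of the two.

```python
import numpy as np
from scipy.stats import ttest_1samp

def tost_one_sample(x, low, high):
    """Two one-sided t tests: is the mean inside (low, high)?

    Rejecting both one-sided nulls (mean <= low and mean >= high)
    supports equivalence; the overall p-value is the larger of the two.
    """
    p_lower = ttest_1samp(x, popmean=low, alternative="greater").pvalue
    p_upper = ttest_1samp(x, popmean=high, alternative="less").pvalue
    return max(p_lower, p_upper)

rng = np.random.default_rng(1)
x = rng.normal(loc=0.0, scale=0.2, size=50)   # made-up measurements near zero
p = tost_one_sample(x, -0.5, 0.5)             # hypothetical margin of +/- 0.5
print(f"TOST p = {p:.4g}")
```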

See https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5502906/

Equivalence Tests: A Practical Primer for t Tests, Correlations, and Meta-Analyses


@cherdt Interestingly, 3 SD doesn’t quite make sense for the Normal either; it’s a bit more than you need if you’re aiming at 99%, since three-sigma limits cover about 99.7%. Shewhart knew that real distributions mostly have heavier tails when he chose 3-sigma action limits. And as a rule of thumb for the unusual it’s surprisingly robust.

But it can go wrong, of course, in quite enough ways to make thinking important.
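A quick check of those coverage figures for the Normal:

```python
from scipy.stats import norm

# Two-sided coverage within k standard deviations for a Normal distribution
for k in (2, 3):
    coverage = 1 - 2 * norm.sf(k)
    print(f"within {k} SD: {coverage:.4%}")
```

Two-sigma covers roughly 95.4% and three-sigma roughly 99.7%, so a 3 SD limit overshoots a 99% target on the Normal, and real heavier-tailed data eat into that slack.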

@dgar … and almost everyone has more than the average number of legs.

@hishamzerriffi I cannot say I have an answer.

It is certainly important to recognise the dark side of figures like Pearson and Galton when looking at their history and the history of statistics. But that is about the men. Mathematically, the methods do not depend on that history, and most were built on older foundations and have been extended by many others.

That suggests teaching the history honestly but perhaps remembering it’s about the men, not the statistics per se.

@hishamzerriffi Yes, the history there is salutary. Galton in particular seems to have bent his research to support his preconceptions.

But there are few academic pursuits that have not been turned to evil at some point. Does that mean we should have a health warning on every academic textbook, listing the evils done by past practitioners? I’m not sure we shouldn’t, of course. But since we don’t for most, it seems reasonable to ask where we should draw the line for statistics.

@hishamzerriffi A problem is that, as far as I can see, _everyone_ in the early 20th century was a neo-Darwinist, a racist, a eugenicist, or some combination of all three. The Empire was still there; we English were out there bravely helping the poor savages learn Christianity, law and trousers. It was all a “Good Thing”.

The challenge, sadly, is not to find and unmask the racists among researchers back then. It’s to find someone who wasn’t.

@pastelbio Spitballing a bit, but if it’s just a trend, that sounds like a case for a GLM ... possibly a GLMM if you want the individuals treated as random effects and there are multiple counts per individual. You could also consider a simpler linear mixed model with a log transform. The snag there is that any data transform makes a fitted trend harder to interpret; I’d rather not do that for continuous predictors.

If you’re looking at seasonal time series, though, I’m officially out of my depth :(