#statstab #463 One-and-a-half sided test
Thoughts: Abelson has a different take on the issues from #461 and #462: human behaviour.
(excerpt from the book)
#pvalue #onesided #NHST #directional #hypothesis #testing #logic
https://www.routledge.com/Statistics-As-Principled-Argument/Abelson/p/book/9780805805284
#statstab #462 The paradox of one-sided vs. two-sided tests of significance
Thoughts: A solution to Royall's paradox from #461. The "null" is not one thing.
#pvalue #Royall #paradox #onesided #nhst #null #hypothesis #logic
https://www.onesided.org/articles/the-paradox-of-one-sided-v-two-sided-tests-of-significance.php
Many people find paradoxical the claim that a one-sided test of significance at a given p-value offers the same type I error guarantees as a two-sided test that produced the same p-value. Here I resolve the paradox in both its informal version and the formal version put forth by Royall.
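A minimal sketch of the arithmetic behind that claim, assuming a standard normal test statistic (a z-test). This is an illustration only, not the article's argument, and the function names are hypothetical. It shows that the same p-value corresponds to different observed z-scores under the two tests, so each test is calibrated against its own null.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # assumption: the test statistic is N(0, 1) under the null

def one_sided_p(z):
    """P(Z >= z): p-value for the directional null 'effect <= 0'."""
    return 1.0 - STD_NORMAL.cdf(z)

def two_sided_p(z):
    """P(|Z| >= |z|): p-value for the point null 'effect == 0'."""
    return 2.0 * (1.0 - STD_NORMAL.cdf(abs(z)))

def critical_z(alpha, two_sided):
    """Smallest z (in the predicted direction) that rejects at level alpha."""
    tail = alpha / 2.0 if two_sided else alpha
    return STD_NORMAL.inv_cdf(1.0 - tail)

if __name__ == "__main__":
    alpha = 0.05
    for two_sided in (False, True):
        z = critical_z(alpha, two_sided)
        p = two_sided_p(z) if two_sided else one_sided_p(z)
        label = "two-sided" if two_sided else "one-sided"
        print(f"{label}: critical z = {z:.3f}, p at that z = {p:.3f}")
```

Under these assumptions, a one-sided test reaches p = 0.05 at a smaller z than a two-sided test does, but each rejects its own null with probability at most 0.05 when that null is true.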
Slicing Is All You Need: Towards a Universal One-Sided Distributed MatMul
https://arxiv.org/abs/2510.08874
#HackerNews #Slicing #MatMul #Distributed #Computing #OneSided #AI #Research
Many important applications across science, data analytics, and AI workloads depend on distributed matrix multiplication. Prior work has developed a large array of algorithms suitable for different problem sizes and partitionings including 1D, 2D, 1.5D, and 2.5D algorithms. A limitation of current work is that existing algorithms are limited to a subset of partitionings. Multiple algorithm implementations are required to support the full space of possible partitionings. If no algorithm implementation is available for a particular set of partitionings, one or more operands must be redistributed, increasing communication costs. This paper presents a universal one-sided algorithm for distributed matrix multiplication that supports all combinations of partitionings and replication factors. Our algorithm uses slicing (index arithmetic) to compute the sets of overlapping tiles that must be multiplied together. This list of local matrix multiplies can then either be executed directly, or reordered and lowered to an optimized IR to maximize overlap. We implement our algorithm using a high-level C++-based PGAS programming framework that performs direct GPU-to-GPU communication using intra-node interconnects. We evaluate performance for a wide variety of partitionings and replication factors, finding that our work is competitive with PyTorch DTensor, a highly optimized distributed tensor library targeting AI models.
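The slicing idea in the abstract can be illustrated with a toy, single-process sketch. This is not the paper's algorithm or its C++ PGAS implementation; it assumes only that A and B are tiled along the shared inner dimension with different tile sizes, and every name here is hypothetical. Interval intersection (index arithmetic) finds which tile pairs overlap, and the product is accumulated from those slices.

```python
def tile_bounds(extent, tile):
    """Half-open [start, stop) intervals that split range(extent) into tiles."""
    return [(s, min(s + tile, extent)) for s in range(0, extent, tile)]

def overlaps(bounds_a, bounds_b):
    """Tile index pairs whose intervals intersect, plus the shared slice."""
    out = []
    for i, (a0, a1) in enumerate(bounds_a):
        for j, (b0, b1) in enumerate(bounds_b):
            lo, hi = max(a0, b0), min(a1, b1)
            if lo < hi:  # non-empty intersection
                out.append((i, j, lo, hi))
    return out

def matmul(a, b):
    """Plain dense matrix multiply on lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def sliced_matmul(a, b, tile_a, tile_b):
    """Compute a @ b as a sum of local multiplies over overlapping slices.

    tile_a splits the columns of a; tile_b splits the rows of b.
    """
    m, k, n = len(a), len(b), len(b[0])
    c = [[0] * n for _ in range(m)]
    for _, _, lo, hi in overlaps(tile_bounds(k, tile_a), tile_bounds(k, tile_b)):
        part = matmul([row[lo:hi] for row in a], b[lo:hi])
        for r in range(m):
            for q in range(n):
                c[r][q] += part[r][q]
    return c

if __name__ == "__main__":
    a = [[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]]
    b = [[1, 0], [0, 1], [2, 2], [3, 1], [1, 4]]
    print(overlaps(tile_bounds(5, 2), tile_bounds(5, 3)))
    print(sliced_matmul(a, b, tile_a=2, tile_b=3) == matmul(a, b))
```

In a distributed setting each overlapping pair would correspond to a local multiply on tiles owned by different processes; the sketch keeps everything in one process to show only the index arithmetic.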
5 Signs That You’re in a #OneSided #Friendship
⬆️ #RupiKaur, listen to this:
"I used to consider myself part of the extreme #liberals, whatever they call themselves.
But when I see demonstrations with cries in support of #Hamas and stuff like that, I doubt that the world understands #complexity … and when they can’t understand complexity, they see this as a #oneSided thing
and their sense of #justice is very #simple.
But it’s not simple…
I think the GOVERNMENTS understand this,
but the PEOPLE… I don’t know.”
https://www.cnn.com/2023/11/07/middleeast/israel-mood-gaza-war-intl-cmd/index.html