🚫🤖 Ah, the elusive "prefix sums" article—where you learn absolutely nothing about ARM NEON but everything about web developers' favorite error code: 403 Forbidden! Who knew gigabytes per second referred to the speed at which you get denied access? 🙄🔒
https://lemire.me/blog/2026/03/08/prefix-sums-at-tens-of-gigabytes-per-second-with-arm-neon/ #prefixsums #403forbidden #webdevelopment #ARMNEON #errors #gigabytespersecond #HackerNews #ngated
Prefix sums at tens of gigabytes per second with ARM NEON

Suppose that you have a record of your sales per day. You might want to get a running record where, for each day, you are told how many sales you have made since the start of the year.

day  sales per day  running sales
1    10$            10 $
2    15$            25 $
3    5$             30 …

Daniel Lemire's blog
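
The running record described in the excerpt above is an inclusive prefix sum: each output is the total of all inputs up to and including that day. Below is a minimal scalar sketch in Python, not the ARM NEON code from the linked article (which is not visible in this preview). The function name `prefix_sums` and the variable names are assumptions chosen for illustration; the input values are the three days shown in the excerpt.

```python
def prefix_sums(values):
    """Return the inclusive running totals of `values`."""
    totals = []
    running = 0
    for v in values:
        running += v            # add today's sales to the total so far
        totals.append(running)  # record the running total for this day
    return totals


# Sales per day for days 1, 2 and 3, taken from the excerpt.
daily_sales = [10, 15, 5]
running_sales = prefix_sums(daily_sales)
print(running_sales)  # [10, 25, 30]
```

Each output depends on the one before it, which is why a plain loop like this runs sequentially and why vectorized versions need a different arrangement of the additions.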
🎉 Behold the thrilling saga of prefix sums! 💥 Marvel at the grandeur of #algorithms for #GPUs that no one but a handful of engineers will ever need. 🚀 It's the blockbuster hit of the software world that nobody asked for, coming to a #GitHub near you! 🍿
https://github.com/b0nes164/GPUPrefixSums #prefixsums #softwaredevelopment #technews #HackerNews #ngated
GitHub - b0nes164/GPUPrefixSums: A nearly complete collection of prefix sum algorithms implemented in CUDA, D3D12, Unity and WGPU. Theoretically portable to all wave/warp/subgroup sizes.

GitHub