Cool post by my ex-colleague and friend Yury Habets about one of the cool tasks we did while we worked at Unity. We ported the SSE version of Intel's masked occlusion culling to ARM Neon and found lots of optimisation opportunities along the way.

https://over17.github.io/2025/10/01/burst-occlusion-culling.html

Converting SSE intrinsics code to Neon. Burst masked occlusion culling