#FluidX3D #CFD v3.7 brings faster Q-criterion isosurface rendering with #OpenCL local memory optimization! 🖖🤠
https://github.com/ProjectPhysX/FluidX3D/releases/tag/v3.7
Instead of each #GPU thread loading 32 velocities from VRAM, an 8x8x8 workgroup now loads 11x11x11 velocities once into L1$ and reuses them, a ~12x VRAM bandwidth reduction.
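Where the 12x comes from, as a quick back-of-the-envelope sketch. The function names are hypothetical and not from FluidX3D. The halo of 3 is an assumption, read off the post's numbers as 11 - 8; one plausible reading is that each cell needs its +1 corner neighbors plus one more cell on either side for central differences.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers for illustration only, not FluidX3D code.
constexpr uint32_t cube(const uint32_t n) { return n * n * n; }

// Velocities fetched from VRAM per workgroup if every thread loads its own set.
constexpr uint32_t naive_loads(const uint32_t wg_edge, const uint32_t per_thread) {
    return cube(wg_edge) * per_thread; // 8^3 threads * 32 velocities
}

// Velocities fetched per workgroup if the tile plus halo is loaded only once.
constexpr uint32_t tiled_loads(const uint32_t wg_edge, const uint32_t halo) {
    return cube(wg_edge + halo); // (8+3)^3 velocities (halo of 3 is assumed)
}

// Ratio of the two schemes, i.e. the VRAM bandwidth reduction factor.
inline double reduction(const uint32_t wg_edge, const uint32_t per_thread, const uint32_t halo) {
    assert(wg_edge + halo > 0u); // avoid division by zero
    return (double)naive_loads(wg_edge, per_thread) / (double)tiled_loads(wg_edge, halo);
}
```

With the post's numbers, `naive_loads(8u, 32u)` is 16384 and `tiled_loads(8u, 3u)` is 1331, so `reduction(8u, 32u, 3u)` comes out at roughly 12.3.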
Fascinating insight: Which thread loads which cell from VRAM to L1$, and which thread renders which grid cell within the workgroup, can be very different!
https://github.com/ProjectPhysX/FluidX3D/blob/master/src/kernel.cpp#L2827-L2956
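A minimal CPU-side sketch of that decoupling, assuming a simple strided cooperative load. This is one common pattern and not necessarily what the linked kernel does, and all names here are hypothetical. In an actual OpenCL kernel, a `barrier(CLK_LOCAL_MEM_FENCE)` would sit between the load phase and the render phase.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

constexpr uint32_t WG = 8u;                    // workgroup edge length (from the post)
constexpr uint32_t TILE = 11u;                 // tile edge length including halo (from the post)
constexpr uint32_t THREADS = WG * WG * WG;     // 512 threads per workgroup
constexpr uint32_t CELLS = TILE * TILE * TILE; // 1331 velocities per tile

struct Cell { uint32_t x, y, z; };

// Unpack a linear index into 3D coordinates in a cube of edge n (x runs fastest).
constexpr Cell unpack(const uint32_t i, const uint32_t n) {
    return Cell{ i % n, (i / n) % n, i / (n * n) };
}

// The cell a thread renders: its own position inside the 8x8x8 workgroup.
constexpr Cell render_cell(const uint32_t lid) { return unpack(lid, WG); }

// The tile cells a thread loads: every 512th entry of the 11x11x11 tile (assumed pattern).
inline std::vector<Cell> load_cells(const uint32_t lid) {
    assert(lid < THREADS); // local thread ID must be inside the workgroup
    std::vector<Cell> cells;
    for (uint32_t i = lid; i < CELLS; i += THREADS) cells.push_back(unpack(i, TILE));
    return cells;
}

// Check that all threads together load each tile cell exactly once.
inline bool covers_tile_exactly_once() {
    std::vector<uint32_t> hits(CELLS, 0u);
    for (uint32_t lid = 0u; lid < THREADS; lid++) {
        for (const Cell& c : load_cells(lid)) hits[c.x + TILE * (c.y + TILE * c.z)]++;
    }
    for (const uint32_t h : hits) if (h != 1u) return false;
    return true;
}
```

In this sketch, thread 9 renders workgroup cell (1,1,0) but its first load is tile cell (9,0,0). Since 1331 is not a multiple of 512, threads load either 2 or 3 tile cells each.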
PS: plugged an X-wing GIF into the #GitHub preview 🖖😜







