Look at the bump in performance that we will see with the next Python-Blosc2 release.

Matrix multiplication has been sped up by processing the operands block by block, using Blosc2 prefilters and Blosc2's own efficient, multithreaded engine.
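For readers unfamiliar with the idea of working in blocks, here is a minimal pure-Python sketch of blocked matrix multiplication. It is only an illustration of the general technique, not the Python-Blosc2 implementation: it uses plain lists, no compression, no prefilters and no threads, and the function name `blocked_matmul` and its `block` parameter are made up for this example.

```python
def blocked_matmul(a, b, block=2):
    """Multiply a (n x k) by b (k x m), both given as lists of rows, block by block.

    Hypothetical helper for illustration only; not part of Python-Blosc2.
    """
    n, k, m = len(a), len(b), len(b[0])
    if any(len(row) != k for row in a):
        raise ValueError("inner dimensions do not match")

    c = [[0] * m for _ in range(n)]
    for i0 in range(0, n, block):          # block rows of the result
        for j0 in range(0, m, block):      # block columns of the result
            for p0 in range(0, k, block):  # blocks along the shared dimension
                # min() clips the last block when a dimension is not
                # a multiple of `block` (the case that needs padding
                # in schemes with fixed-size blocks).
                for i in range(i0, min(i0 + block, n)):
                    for p in range(p0, min(p0 + block, k)):
                        a_ip = a[i][p]
                        for j in range(j0, min(j0 + block, m)):
                            c[i][j] += a_ip * b[p][j]
    return c


a = [[1, 2, 3], [4, 5, 6]]
b = [[7, 8], [9, 10], [11, 12]]
print(blocked_matmul(a, b))  # [[58, 64], [139, 154]]
```

Each small block product touches only a limited slice of each operand at a time, which is what makes this access pattern a natural fit for data stored in chunks and blocks.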

Expect a 5x to 6x speedup for matrices that need no padding.

Huge shoutout to @luke_shaw_ironarray for making this happen!