Just had fun with SIMD. I wonder whether at this point it's faster to simply use a scalar loop...