TIL: Even though #cuBLAS always assumes column-major order, the docs for #cudaMemcpy2D describe the copy in row-major terms (height rows of width bytes each)!
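
For anyone else who trips over this, here is a minimal sketch of how the two conventions line up. It assumes a column-major matrix of floats with leading dimension lda, and it treats each contiguous column as one "row" in cudaMemcpy2D's terms: width is the column length in bytes, height is the number of columns, and the pitch is lda in bytes. To keep it runnable without a GPU, `copy2d` is a hypothetical host-only stand-in that only mirrors cudaMemcpy2D's argument order (without the copy-kind argument); `copy_block_colmajor` is likewise a made-up helper name, not part of CUDA or cuBLAS.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical host-only stand-in for cudaMemcpy2D (assumption: same
 * argument order, minus the cudaMemcpyKind argument). It copies `height`
 * contiguous runs of `width` bytes, advancing by dpitch/spitch bytes
 * between runs. */
static void copy2d(void *dst, size_t dpitch,
                   const void *src, size_t spitch,
                   size_t width, size_t height)
{
    /* Sanity check for this sketch: a run must fit inside its pitch. */
    assert(width <= dpitch && width <= spitch);
    for (size_t r = 0; r < height; ++r) {
        memcpy((char *)dst + r * dpitch,
               (const char *)src + r * spitch,
               width);
    }
}

/* Copy the m x n block whose top-left element is (row0, col0) out of a
 * column-major matrix `src` with leading dimension `lda` (in elements)
 * into a packed column-major buffer `dst` with leading dimension m.
 *
 * Each column is contiguous in memory, so a column plays the role of a
 * "row" in the 2D copy:
 *   width  = m * sizeof(float)    bytes per column of the block
 *   height = n                    number of columns in the block
 *   pitch  = lda * sizeof(float)  bytes between consecutive columns
 */
static void copy_block_colmajor(float *dst, const float *src, size_t lda,
                                size_t row0, size_t col0,
                                size_t m, size_t n)
{
    const float *start = src + col0 * lda + row0;
    copy2d(dst, m * sizeof(float),
           start, lda * sizeof(float),
           m * sizeof(float), n);
}
```

With a real device copy, the same three values (width = m * sizeof(float), height = n, source pitch = lda * sizeof(float)) would go to cudaMemcpy2D, plus the appropriate cudaMemcpyKind.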