TIL: Even though #cuBLAS always assumes column-major order, the docs for #cudaMemcpy2D assume row-major order (width is the number of bytes per row, height is the number of rows)!
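A rough sketch of how the two conventions can be reconciled: in a column-major matrix each column is contiguous, so a column can play the role of a "row" in the pitched copy. The code below is host-only and merely mimics the pitched-copy semantics. `memcpy2d_host` and `copy_block_colmajor` are hypothetical helper names, not part of CUDA or cuBLAS. On a device, the same argument pattern would presumably be passed to `cudaMemcpy2D(dst, dpitch, src, spitch, width, height, kind)`.

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Host-only stand-in (hypothetical helper) for a pitched 2D copy:
// copies `height` rows of `width_bytes` bytes each, where consecutive
// rows start `spitch` / `dpitch` bytes apart in source / destination.
void memcpy2d_host(void* dst, std::size_t dpitch,
                   const void* src, std::size_t spitch,
                   std::size_t width_bytes, std::size_t height) {
    auto* d = static_cast<unsigned char*>(dst);
    const auto* s = static_cast<const unsigned char*>(src);
    for (std::size_t r = 0; r < height; ++r) {
        std::memcpy(d + r * dpitch, s + r * spitch, width_bytes);
    }
}

// Extract an m x n block starting at (row0, col0) from a column-major
// matrix `a` with leading dimension `lda` into a tightly packed
// column-major buffer (leading dimension m).
std::vector<float> copy_block_colmajor(const std::vector<float>& a,
                                       std::size_t lda,
                                       std::size_t row0, std::size_t col0,
                                       std::size_t m, std::size_t n) {
    std::vector<float> out(m * n);
    // Column-major: each column is contiguous, so treat it as a "row":
    //   width  = m elements (bytes in one column of the block)
    //   height = n          (number of columns in the block)
    //   pitch  = leading dimension in bytes
    memcpy2d_host(out.data(), m * sizeof(float),
                  a.data() + col0 * lda + row0, lda * sizeof(float),
                  m * sizeof(float), n);
    return out;
}
```

In other words, for column-major data the "width" argument describes the block's rows and the "height" argument describes its columns, the opposite of how the parameter names read.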
Evaluation of computational and energy performance in matrix multiplication algorithms on CPU and GPU using MKL, cuBLAS and SYCL
#CUDA #SYCL #MKL #CUBLAS #MatrixMultiplication #LinearAlgebra #Performance #Package