Mastodawn

FredPlus10 Oct 15, 2024

TIL: Even though #Cublas always assumes column-major order, the docs of #cudaMemcpy2D assume row-major order!
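
A rough host-side sketch of how the two conventions can be reconciled: cudaMemcpy2D copies `height` runs of `width` bytes, with consecutive runs `spitch`/`dpitch` bytes apart, so for a column-major matrix each column can be passed as one "row". The `memcpy2d` helper below is a hypothetical pure-Python stand-in that only mimics that addressing; it is not the CUDA call, and the matrix sizes are made up for illustration.

```python
import struct

ELEM = struct.calcsize("f")  # bytes per float element


def memcpy2d(dst, dpitch, src, spitch, width, height):
    """Hypothetical stand-in for cudaMemcpy2D addressing (host only).

    Copies `height` runs of `width` bytes; run r starts at byte offset
    r * spitch in src and r * dpitch in dst.
    """
    for r in range(height):
        dst[r * dpitch:r * dpitch + width] = src[r * spitch:r * spitch + width]


# Column-major 3x2 matrix stored with leading dimension 4 (one padding slot
# per column), the layout cuBLAS expects: A[i][j] lives at index i + j * ld.
rows, cols, ld = 3, 2, 4
values = [0.0] * (ld * cols)
for j in range(cols):
    for i in range(rows):
        values[i + j * ld] = float(10 * i + j)  # value encodes (i, j)

src = bytearray(struct.pack(f"{ld * cols}f", *values))
dst = bytearray(rows * cols * ELEM)

# Treat each column as one "row" of the 2D copy:
#   width  = bytes per column (rows * ELEM)
#   height = number of columns
#   pitch  = leading dimension in bytes
memcpy2d(dst, rows * ELEM, src, ld * ELEM, width=rows * ELEM, height=cols)

packed = list(struct.unpack(f"{rows * cols}f", dst))
print(packed)
```

With real CUDA code the same mapping would apply to the `width`, `height`, and pitch arguments, assuming the leading dimension is given in elements and converted to bytes.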

HGPU group Jun 2, 2024

Evaluation of computational and energy performance in matrix multiplication algorithms on CPU and GPU using MKL, cuBLAS and SYCL

#CUDA #SYCL #MKL #CUBLAS #MatrixMultiplication #LinearAlgebra #Performance #Package

https://hgpu.org/?p=29229

Matrix multiplication is fundamental in the backpropagation algorithm used to train deep neural network models. Libraries like Intel’s MKL or NVIDIA’s cuBLAS implemented new and optimiz…

Peter Guhl Nov 30, 2023

Not sure who needs to know this, but if you get a #CUBLAS error 15 with #llama.cpp and the .cu file mentions f16 near the line that fails, starting main with --memory-f32 may be a workaround. I ran into this on the #NVIDIA #Tesla #M40 24GB.
#AI #MachineLearning #CUDA #llama2 #Meta
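
For reading the numeric code: a small lookup sketch, assuming the `cublasStatus_t` values as listed in the cuBLAS headers (worth checking against the `cublas_api.h` of the installed toolkit). The `cublas_status_name` helper is hypothetical and not part of cuBLAS or llama.cpp.

```python
# cublasStatus_t values as an assumption taken from the cuBLAS headers;
# verify against the installed toolkit before relying on them.
CUBLAS_STATUS = {
    0: "CUBLAS_STATUS_SUCCESS",
    1: "CUBLAS_STATUS_NOT_INITIALIZED",
    3: "CUBLAS_STATUS_ALLOC_FAILED",
    7: "CUBLAS_STATUS_INVALID_VALUE",
    8: "CUBLAS_STATUS_ARCH_MISMATCH",
    11: "CUBLAS_STATUS_MAPPING_ERROR",
    13: "CUBLAS_STATUS_EXECUTION_FAILED",
    14: "CUBLAS_STATUS_INTERNAL_ERROR",
    15: "CUBLAS_STATUS_NOT_SUPPORTED",
    16: "CUBLAS_STATUS_LICENSE_ERROR",
}


def cublas_status_name(code):
    """Hypothetical helper: map a numeric cuBLAS status to its enum name."""
    return CUBLAS_STATUS.get(code, f"unknown cuBLAS status {code}")


print(cublas_status_name(15))
```

Under that assumption, error 15 reads as "not supported", which would fit an f16 code path failing on an older card.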