So, do all #BLAS functions scale linearly only up to 4-6 threads? This seems to be the case when multithreaded BLAS is used for glm(m) modeling in #Rstats
#HPC

@ChristosArgyrop It depends on the library's implementation. #BLIS scales well on multicore shared-memory systems. It also does a good job of maintaining numerical stability (minimizing accumulated floating-point error).

https://github.com/flame/blis
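
Besides the library implementation, one common reason speedup flattens out after a handful of threads is that only part of the total runtime is spent in parallel BLAS code. Below is a minimal sketch of Amdahl's law that illustrates the effect. The parallel fractions used (0.80 and 0.95) are hypothetical values chosen for illustration, not measurements of glm(m) fits or of any particular BLAS.

```python
def amdahl_speedup(parallel_fraction: float, threads: int) -> float:
    """Ideal speedup under Amdahl's law.

    parallel_fraction: share of single-thread runtime that parallelizes
                       perfectly (hypothetical input, between 0 and 1).
    threads:           number of threads used (>= 1).
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be in [0, 1]")
    if threads < 1:
        raise ValueError("threads must be >= 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / threads)


if __name__ == "__main__":
    # Hypothetical parallel fractions; replace with values estimated
    # from your own timings.
    for p in (0.80, 0.95):
        for n in (1, 2, 4, 6, 8, 16):
            s = amdahl_speedup(p, n)
            # Efficiency = speedup per thread; 100% would be linear scaling.
            print(f"p={p:.2f} threads={n:2d} "
                  f"speedup={s:5.2f} efficiency={s / n:6.1%}")
```

If measured speedups track a curve like this, the ceiling comes from the serial part of the workload rather than from the BLAS library, and swapping libraries would not be expected to move it much.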

@ctaylor Let me try this. I assume it is a drop-in replacement?
@ChristosArgyrop Yep, it has a CBLAS wrapper, so it should work in that capacity. If you run into problems, send a DM.
@ChristosArgyrop I forgot to mention: data parallelism (SIMD, vector instructions, etc.) is supported across ISAs, depending on the hardware.
@ChristosArgyrop The community is worth joining on Discord.
@ctaylor What is the Discord server name?

@ChristosArgyrop I'm on the road right now. The best place to look is their guide (it's not in the most clearly marked place):

https://github.com/flame/blis/blob/master/docs/Discord.md#joining-the-blis-server
