#AVX512 extension request: IFMA-52, but for lower-precision integers.

IFMA-52 is nice because of its high intermediate precision as well as its great throughput (though latency is high). I suspect 52 is convenient because it lets the hardware reuse the FP64 FMA unit's mantissa multiplier.
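For reference, a scalar sketch of what one 64-bit lane of the existing IFMA-52 pair (vpmadd52luq / vpmadd52huq) computes. The Python function names are mine, for illustration only; the real instructions do this across every 64-bit lane of the vector.

```python
MASK52 = (1 << 52) - 1   # low 52 bits of each source operand are used
MASK64 = (1 << 64) - 1   # accumulator lanes are 64 bits wide and wrap

def madd52lo(acc, a, b):
    """Model of one vpmadd52luq lane: acc + low 52 bits of the 104-bit product."""
    prod = (a & MASK52) * (b & MASK52)
    return (acc + (prod & MASK52)) & MASK64

def madd52hi(acc, a, b):
    """Model of one vpmadd52huq lane: acc + high 52 bits of the 104-bit product."""
    prod = (a & MASK52) * (b & MASK52)
    return (acc + (prod >> 52)) & MASK64
```

Each added term is at most 52 bits wide, so the 64-bit accumulator has 12 bits of headroom before carries need to be propagated.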

Perhaps an FP32-based IFMA-22 could be doable?
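To make the request concrete, here is a scalar sketch of what such an instruction pair might compute per lane. This is purely hypothetical: no IFMA-22 instruction exists, and the 32-bit lane width, the lo/hi split, and the names madd22lo / madd22hi are my assumptions, chosen to mirror how IFMA-52 behaves in 64-bit lanes.

```python
BITS = 22                      # assumed operand width of the hypothetical IFMA-22
MASK22 = (1 << BITS) - 1
MASK32 = (1 << 32) - 1         # assumed 32-bit accumulator lanes that wrap

def madd22lo(acc, a, b):
    """Hypothetical: acc + low 22 bits of the 44-bit product."""
    prod = (a & MASK22) * (b & MASK22)
    return (acc + (prod & MASK22)) & MASK32

def madd22hi(acc, a, b):
    """Hypothetical: acc + high 22 bits of the 44-bit product."""
    prod = (a & MASK22) * (b & MASK22)
    return (acc + (prod >> BITS)) & MASK32

# The two halves together recover the full 44-bit product.
a, b = 0x3FFFFF, 0x2ABCDE
full = (madd22hi(0, a, b) << BITS) + madd22lo(0, a, b)
```

Under these assumptions each lane would keep 10 bits of accumulation headroom (32 minus 22), and a 512-bit register would hold 16 lanes instead of 8.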

#intel #HPC #ai #x86 #YetAnotherISAExtension

As of now, the closest thing to what I want is the #KNC instruction vpmadd231d zmm, zmm, zmm, which can deal with EPI32.

The idea is to lower power requirements while also taking advantage of the FP32 pipes' extremely high performance. The FP unit also happens to have lower latency than the integer #SIMD unit (both have a CPI of 0.5, but the FP unit has a latency of 4 vs. 10 for the integer unit).

I need more than EPI16 natively, but EPI32/64 is a waste of power and precision.

I'd also rather not do horrible things in the FP unit...