@acqrel @ethan https://elixir.bootlin.com/linux/v6.13.7/source/arch/x86/include/asm/nops.h#L41 has a collection of nops by length, from 1 byte up to 11 bytes
and Linux has code for optimizing adjacent nops into bigger nops: https://elixir.bootlin.com/linux/v6.13.7/source/arch/x86/kernel/alternative.c#L223
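The idea of collapsing a run of short nops into fewer, longer ones can be sketched roughly as below. This is not the kernel's implementation: the table here is the 1 to 9 byte set of multi-byte NOP encodings recommended in the Intel SDM rather than the kernel's own table, and `nop_padding` is a hypothetical helper name.

```python
# Multi-byte NOP encodings recommended in the Intel SDM (lengths 1..9).
# Assumption: this is NOT the kernel's table from nops.h.
NOPS = {
    1: bytes([0x90]),
    2: bytes([0x66, 0x90]),
    3: bytes([0x0F, 0x1F, 0x00]),
    4: bytes([0x0F, 0x1F, 0x40, 0x00]),
    5: bytes([0x0F, 0x1F, 0x44, 0x00, 0x00]),
    6: bytes([0x66, 0x0F, 0x1F, 0x44, 0x00, 0x00]),
    7: bytes([0x0F, 0x1F, 0x80, 0x00, 0x00, 0x00, 0x00]),
    8: bytes([0x0F, 0x1F, 0x84, 0x00, 0x00, 0x00, 0x00, 0x00]),
    9: bytes([0x66, 0x0F, 0x1F, 0x84, 0x00, 0x00, 0x00, 0x00, 0x00]),
}


def nop_padding(length):
    """Return `length` bytes of padding, greedily using the longest NOP first.

    Hypothetical helper: because every length from 1 to 9 is available,
    the greedy choice also yields the fewest instructions.
    """
    out = bytearray()
    longest = max(NOPS)
    while length > 0:
        step = min(length, longest)  # biggest NOP that still fits
        out += NOPS[step]
        length -= step
    return bytes(out)


if __name__ == "__main__":
    # 20 one-byte nops become three instructions: 9 + 9 + 2 bytes.
    print(nop_padding(20).hex(" "))
```

So twenty adjacent one-byte nops would shrink to three instructions occupying the same twenty bytes.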
@acqrel that reminds me of the famous IBM mainframe utility IEFBR14. It did nothing, but was useful to e.g. allocate space for files with no other side effects. It famously consisted of a single two-byte instruction (BR 14, meaning: branch to the address in register 14, i.e. return to the caller).
Nevertheless, over the years, several updates to the program have been rolled out 😆
@acqrel just for the fun of it, i made a simple assembly program that executes nop/fnop a hundred thousand times in a row, then loops over that ten thousand times, for a total of one billion nops.
on my system (fedora 41 amd64, i7-13700HX), fnop is a whopping five times slower than nop. additionally, the nop binary comes out to 102 KiB, while the fnop one is 200 KiB. nop is a one-byte instruction and fnop is two bytes, so that makes sense.
under qemu-x86_64-static, the nop binary ran in only 10.8ms (faster than native!), while fnop took over twenty-four seconds! clearly qemu needs to better optimise the extremely important use case of running billions of floating point no-ops.
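For anyone wanting to try something similar, here is a rough sketch of how such a benchmark could be generated. It is not the original program: the GNU assembler syntax, the `.rept` unrolling, the Linux x86-64 exit syscall, and the helper names `make_source` and `unrolled_kib` are all assumptions for illustration.

```python
def make_source(insn, unroll=100_000, loops=10_000):
    """Return GNU as (AT&T syntax) source that runs `insn` unroll * loops times.

    Assumed layout: an unrolled body repeated with .rept, wrapped in a
    counted loop, followed by exit(0) via syscall 60 on x86-64 Linux.
    """
    return "\n".join([
        ".globl _start",
        ".text",
        "_start:",
        f"    mov ${loops}, %ecx",
        "1:",
        f"    .rept {unroll}",
        f"    {insn}",
        "    .endr",
        "    dec %ecx",
        "    jnz 1b",
        "    mov $60, %eax",
        "    xor %edi, %edi",
        "    syscall",
        "",
    ])


def unrolled_kib(insn_len, unroll=100_000):
    """Size in KiB of the unrolled body alone (no ELF headers or loop code)."""
    return insn_len * unroll / 1024


if __name__ == "__main__":
    for name, insn_len in (("nop", 1), ("fnop", 2)):
        with open(f"{name}.s", "w") as f:
            f.write(make_source(name))
        print(name, f"{unrolled_kib(insn_len):.1f} KiB of instructions")
    print("total instructions per run:", 100_000 * 10_000)
```

The generated files should build with something like `as -o nop.o nop.s && ld -o nop nop.o`. The unrolled body alone is about 97.7 KiB for nop and 195.3 KiB for fnop, which is in the same ballpark as the binary sizes quoted above once the rest of the file is added on top.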
fnorp
@acqrel that dates back to the times when the 80x87 family of Math Coprocessors was a thing.
https://www.lo-tech.co.uk/wiki/80x87_Math_Coprocessors
My first PC had an AM386 SX-25 CPU from AMD and I later got an IIT 3C87-25 FPU for it.
Those days ended with the Intel 486DX, which came with the FPU integrated, and it has been in every x86 ever since.
But is there also some kind of negative `fnop`, @acqrel ?
What about nan `fnop`? Negative-NaN-fnop? Next-float-after-fnop-by-bitwise-value-fnop?