I wonder how much of the death of the "Unix way" (lots of tiny binaries that each do one thing) has to do with the ever-rising relative cost of TLB and cache misses, now that CPU pipelines have become so deep and memory so slow relative to the CPU. Every short-lived process starts in a fresh address space with cold caches, so a chain of small tools pays for those misses over and over.
One can imagine CPU makers might have pursued a different path had they not been targeting Windows for so long.