furiously clicking through the histories of various microprocessors on wikipedia so i can figure out who to blame for the fact that you can't push/pop a single byte to the stack on the gameboy in an atomic fashion (you can only push/pop register pairs). a fact that led to a stupid bug that took me two days to figure out
(as far as i can tell, the gameboy's SM83 has it because the Z80 has it, and the Z80 has it because the Intel 8080 has it, but I can't figure out what the Intel engineers were thinking when they decided this is how it would work on the 8080. Stanley Mazor wrote about the design process of the 8080 here https://ieeexplore.ieee.org/document/4287219 but on the topic of push and pop he only says "Push and Pop instructions were needed for each of the three register pairs.")
@aparrish i have been thinking about the timings of memory stores and fetches on the z80. PUSH and POP are quite efficient in space and clocks, largely because the target address is not in the instruction stream (it's SP). There are no other instructions that store or fetch a 16-bit register through a pointer held in a register; the closest is LD HL, (nn), which carries an absolute address in the instruction and is a _lot_ slower. Hmm, so much for the joy of 16-bit PUSH and POP.
Why no 8-bit PUSH and POP? I suppose if having both is not an option, i would rather have only the 16-bit ops. And i wonder if at least one 16-bit PUSH and POP is needed for some systems programming reason involving interrupts (one 16-bit op is atomic, but two 8-bit ops would not be).
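a toy sketch of the atomicity point, in python rather than assembly. it models a descending stack where interrupt entry pushes a 16-bit return address at the current SP, and compares two hand-rolled ways of pushing a single byte. everything here (the `Cpu` class, `unsafe_push8`, `safe_push8`, the 0x1234 return address, the 0xFFFE starting SP) is made up for illustration, and the failure shown is just one assumed way a byte push can go wrong, not necessarily the bug from upthread.

```python
# Toy model of a descending stack, loosely in the style of the Z80/SM83.
# All names and addresses are hypothetical, for illustration only.

class Cpu:
    def __init__(self):
        self.mem = bytearray(0x10000)
        self.sp = 0xFFFE

    def push16(self, value):
        # Models a hardware 16-bit push: both bytes are written and SP moves
        # by 2 with no point in between where an interrupt can be taken.
        self.sp = (self.sp - 1) & 0xFFFF
        self.mem[self.sp] = (value >> 8) & 0xFF
        self.sp = (self.sp - 1) & 0xFFFF
        self.mem[self.sp] = value & 0xFF

    def interrupt(self, return_addr):
        # Interrupt entry pushes the return address at the current SP.
        self.push16(return_addr)
        # (handler body omitted) Returning pops the address again.
        self.sp = (self.sp + 2) & 0xFFFF


def unsafe_push8(cpu, byte, interrupt_between=False):
    # Stores the byte below SP first and only then adjusts SP.
    cpu.mem[(cpu.sp - 1) & 0xFFFF] = byte
    if interrupt_between:
        cpu.interrupt(0x1234)  # overwrites the byte that was just stored
    cpu.sp = (cpu.sp - 1) & 0xFFFF


def safe_push8(cpu, byte, interrupt_between=False):
    # Reserves the slot first, then stores: an interrupt pushes below it.
    cpu.sp = (cpu.sp - 1) & 0xFFFF
    if interrupt_between:
        cpu.interrupt(0x1234)
    cpu.mem[cpu.sp] = byte


def demo():
    a = Cpu()
    unsafe_push8(a, 0xAB, interrupt_between=True)
    b = Cpu()
    safe_push8(b, 0xAB, interrupt_between=True)
    # Byte now at the top of each stack.
    return a.mem[a.sp], b.mem[b.sp]
```

in this model `demo()` returns `(0x12, 0xAB)`: the store-then-adjust version loses its byte to the interrupt's return address, while the adjust-then-store version keeps it. a real single-instruction push sidesteps the question because there is no gap for the interrupt to land in.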