@pervognsen @nick reference on early POWER multi-loads https://bitsavers.org/pdf/ibm/IBM_Journal_of_Research_and_Development/341/ibmrd3401E.pdf pp. 7-10 starting with "The RS/6000 architecture has adopted the following strategy for dealing with misaligned data."
The load-multiple section starts on p. 9: "Another aspect of including string operations..."
@pervognsen @nick I will say that they are IMO bang on the money here on _all_ counts - calling out that
a) mem copies/string copies etc. are important and usually unaligned
b) Alpha-esque "we give you a way to do SWAR loops for this" only gets you so far,
c) for load/store multiple, function prologues/epilogues are the key use case.
other ISAs have struggled to learn that lesson 30 years later...
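To make (b) concrete, here is a rough sketch of the "do it yourself with aligned loads and shifts" style of unaligned access. It is only a model: `load_u64_unaligned` is a made-up helper name, the data is treated as a logical little-endian byte stream stored in aligned 64-bit words, and this is not claimed to match any particular Alpha instruction sequence.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not from the paper): fetch the 8 bytes starting at
 * byte_off from a logical little-endian byte stream held in aligned 64-bit
 * words, using only aligned word reads plus shifts. The caller must make
 * sure words[byte_off / 8 + 1] exists whenever byte_off is not a multiple
 * of 8. */
static uint64_t load_u64_unaligned(const uint64_t *words, size_t byte_off)
{
    size_t idx = byte_off / 8;
    unsigned shift = (unsigned)(byte_off % 8) * 8;
    uint64_t lo = words[idx];

    if (shift == 0)
        return lo;                 /* aligned case; also avoids a shift by 64 */

    uint64_t hi = words[idx + 1];  /* second aligned read */
    return (lo >> shift) | (hi << (64 - shift));
}
```

Every unaligned word costs two aligned reads, two shifts and an OR, plus separate handling for the head and tail of a copy, which is the "only gets you so far" part.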
> The architecture allows for the partial completion of an operation and the generation of an alignment-check interrupt when the data crosses a cache-line boundary. System software can then complete the instruction by fixing up the affected registers or memory locations.
this has EINTR vibes
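For anyone who hasn't run into it, the pattern I mean is the usual POSIX one: `read()` may transfer only part of what you asked for, or fail with `EINTR` before transferring anything, and the caller is expected to pick up where it left off. A minimal sketch follows; `read_full` is a made-up helper name, and this is only an analogy, not a description of what RS/6000 system software actually does.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: keep calling read() until len bytes have arrived,
 * EOF is hit, or a real error occurs. Returns the number of bytes read
 * (possibly short on EOF), or -1 on an error other than EINTR. */
static ssize_t read_full(int fd, void *buf, size_t len)
{
    char *p = buf;
    size_t done = 0;

    while (done < len) {
        ssize_t n = read(fd, p + done, len - done);
        if (n > 0)
            done += (size_t)n;     /* partial completion: resume after it */
        else if (n == 0)
            break;                 /* EOF */
        else if (errno != EINTR)
            return -1;             /* genuine error */
        /* EINTR: interrupted before any data moved, so just retry */
    }
    return (ssize_t)done;
}
```

Same shape as the quote: the operation may stop partway, and it falls to software to finish the job.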