For evaluation strategies, I see:
1. TLB-MPKI: Talluri-Hill (ASPLOS '94), Navarro-Iyer-Druschel-Cox (OSDI '02), Barr-Cox-Rixner (ISCA '10)
2. Cache pollution, maybe LLC-MPKI; originators unclear
3. LLC/TLB-MPKI vs. RAM scaling curves to determine "memory walls" per Wulf-McKee '95
4. Attribution / stack-distance modelling per Mattson '70 (sketched below)
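As a concrete version of item 4, here's a minimal sketch of Mattson-style LRU stack-distance profiling; the trace format (a flat iterable of page numbers) is my own assumption, and the linear scan makes it O(n) per access — fine for a sketch, not for a real profiler:

```python
from collections import OrderedDict

def stack_distances(trace):
    """Yield the LRU stack distance of each access; None on cold misses."""
    stack = OrderedDict()              # least-recently-used entry first
    for page in trace:
        if page in stack:
            # Depth from the top of the LRU stack = number of distinct
            # pages touched since the last access to this page.
            depth = len(stack) - list(stack).index(page) - 1
            stack.move_to_end(page)
            yield depth
        else:
            stack[page] = True
            yield None                 # compulsory miss

# The hit ratio of *every* LRU capacity falls out of a single pass:
dists = list(stack_distances([1, 2, 3, 1, 2, 4, 1]))
# dists == [None, None, None, 2, 2, None, 2]
```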
My own list of things I'd like to measure:
1. Queueing: inter-arrival times and service times have more predictive power (sketch after this list)
2. Distribution fairness: the grand total matters less than how the contiguity is distributed; fairness is a central concern here, as is priority inversion (candidate metric sketched below)
3. Distribution utilisation: are sizes across the size spectrum being used effectively?
4. Verification of allocation guarantees for small superpage sizes
5. Fragmentation metrics (one candidate sketched below)
6. Verification of increased allocation success likelihoods for larger page sizes in the face of external fragmentation
7. Defragmentation overhead
8. Distribution to shared mappings: sharing offers the highest impact, so it's also a question of whether the contiguity has been distributed to the shared memory objects of greatest impact.
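For item 1, a sketch of pulling queueing quantities out of a timestamped allocation trace; the (arrival_ns, start_ns, done_ns) record layout is assumed, not any existing tool's format:

```python
from statistics import mean

def queueing_stats(events):
    """events: (arrival_ns, start_ns, done_ns) tuples, sorted by arrival."""
    inter_arrivals = [b[0] - a[0] for a, b in zip(events, events[1:])]
    service_times = [done - start for _, start, done in events]
    waits = [start - arrival for arrival, start, _ in events]
    # Offered load rho: how busy the allocator is, in queueing terms.
    rho = mean(service_times) / mean(inter_arrivals)
    return mean(inter_arrivals), mean(service_times), mean(waits), rho

events = [(0, 0, 40), (100, 100, 130), (150, 160, 200)]
print(queueing_stats(events))   # (75, 36.67, 3.33, 0.489)
```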
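For item 2, one candidate is Jain's fairness index over, say, the superpage-backed bytes each process ends up with; the process names and byte counts below are made up:

```python
def jains_index(xs):
    """1.0 = perfectly even; 1/len(xs) = one party got everything."""
    return sum(xs) ** 2 / (len(xs) * sum(x * x for x in xs))

# Hypothetical per-process superpage-backed bytes:
contig_bytes = {"nginx": 512 << 20, "postgres": 768 << 20, "cron": 2 << 20}
print(jains_index(list(contig_bytes.values())))   # ~0.64: the skew shows
```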
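For item 5, a sketch along the lines of Gorman's unusable free space index: the fraction of free memory that cannot back an allocation of a given order. The free_blocks map (buddy order → count of free blocks) is an assumed input, e.g. something parsed from /proc/buddyinfo on Linux:

```python
def unusable_free_index(free_blocks, order):
    """Fraction of free memory that cannot back an allocation of `order`."""
    total = sum((1 << o) * n for o, n in free_blocks.items())
    usable = sum((1 << o) * n for o, n in free_blocks.items() if o >= order)
    return (total - usable) / total if total else 0.0

# Hypothetical counts of free buddy blocks per order:
free_blocks = {0: 600, 1: 100, 2: 20, 9: 1}
print(unusable_free_index(free_blocks, 9))   # ~0.63: roughly two thirds of
                                             # free memory can't back order 9
```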
MPKI is, on the one hand, essentially a reciprocal of inter-arrival times; on the other, it is neither naturally comparable nor extensible to service times (see the sketch below). Wulf-McKee recast in queueing-theoretic terms still doesn't cover the full range of issues I'm working on, though. Metrics for 2. distribution fairness, 7. defragmentation overhead, 8. distribution to shared mappings, and maybe more need to be devised.
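To spell out that reciprocity: converting an MPKI figure into a miss arrival rate takes IPC and clock frequency, and the result still has to be paired with an independently measured walk latency (the service time) before utilisation or queueing delay can be discussed at all. A sketch with made-up numbers:

```python
mpki = 1.0          # TLB misses per 1000 instructions (made up)
ipc = 1.5           # retired instructions per cycle (made up)
freq_hz = 3.0e9     # core clock (made up)

arrival_rate = mpki / 1000 * ipc * freq_hz   # misses per second
mean_walk_s = 80e-9                          # measured separately: the service time
rho = arrival_rate * mean_walk_s             # utilisation of the page-walk "server"
print(arrival_rate, rho)                     # 4.5e6 misses/s, rho = 0.36
```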