Any ideas why a perf_event_open() counter would often read 0 CPU instructions retired when I pin a thread to a particular core?
Apparently if you restrict the event to _both_ a thread ID (the pid argument) _and_ a CPU ID (the cpu argument), then perf_event_open() will happily give you results.
UPDATE: Nope, still broken. I was being fooled by my fallback logic, which switches to time-based measures.
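
For anyone trying to reproduce this, here is a minimal sketch of the setup in question: pin the calling thread to a core, open an instructions-retired counter limited to a pid and a CPU, and read it back. The helper names (pin_to_cpu, init_instructions_attr, open_instructions_counter, count_instructions) are made up for illustration and are not from my actual code. It is not a fix for the zero counts. The one deliberate choice is that it reports failure separately from the count, so a fallback path cannot quietly mask an error.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <linux/perf_event.h>
#include <sched.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: pin the calling thread to one core.
 * Returns 0 on success, -1 on failure (errno set). */
static int pin_to_cpu(int cpu)
{
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    return sched_setaffinity(0, sizeof(set), &set); /* pid 0 = calling thread */
}

/* Hypothetical helper: describe a user-space "instructions retired" counter
 * that starts disabled. */
static void init_instructions_attr(struct perf_event_attr *attr)
{
    memset(attr, 0, sizeof(*attr));
    attr->type = PERF_TYPE_HARDWARE;
    attr->size = sizeof(*attr);
    attr->config = PERF_COUNT_HW_INSTRUCTIONS;
    attr->disabled = 1;       /* enabled explicitly via ioctl below */
    attr->exclude_kernel = 1; /* count user-space instructions only */
    attr->exclude_hv = 1;
}

/* glibc has no wrapper for perf_event_open, so go through syscall(2).
 * pid 0 means the calling thread; cpu -1 means "any CPU".
 * Returns a file descriptor, or -1 on failure (errno set). */
static int open_instructions_counter(pid_t pid, int cpu)
{
    struct perf_event_attr attr;
    init_instructions_attr(&attr);
    return (int)syscall(SYS_perf_event_open, &attr, pid, cpu, -1, 0UL);
}

/* Count instructions retired while work(arg) runs.
 * Returns 0 and stores the count on success, -1 on any failure, so a
 * failed measurement cannot be confused with a genuine count of 0. */
static int count_instructions(pid_t pid, int cpu,
                              void (*work)(void *), void *arg,
                              uint64_t *count)
{
    assert(work != NULL && count != NULL);

    int fd = open_instructions_counter(pid, cpu);
    if (fd < 0)
        return -1;

    if (ioctl(fd, PERF_EVENT_IOC_RESET, 0) < 0 ||
        ioctl(fd, PERF_EVENT_IOC_ENABLE, 0) < 0) {
        close(fd);
        return -1;
    }

    work(arg);

    ioctl(fd, PERF_EVENT_IOC_DISABLE, 0);

    /* With read_format left at 0, read() yields a single 64-bit count. */
    ssize_t n = read(fd, count, sizeof(*count));
    close(fd);
    return n == (ssize_t)sizeof(*count) ? 0 : -1;
}
```

Usage would be something like pin_to_cpu(3) followed by count_instructions(0, 3, work, NULL, &n), checking both return values and errno before trusting n or switching to a time-based fallback.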