I guess sometimes the kernel can't fit all the requested counters into the limited number the hardware provides, and then it either multiplexes them (giving reduced data) or just... returns 0?
And it kinda seems like whatever else is using the counters is some other process, since I still get intermittent issues when I use a lock to limit my code to one counter at a time.
And you can actually see the time enabled vs. time running for a counter, and yeah, this is a problem.
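For reference, a minimal sketch of how the time enabled vs. time running comparison can be turned into a check, assuming the counter was opened with `read_format` set to `PERF_FORMAT_TOTAL_TIME_ENABLED | PERF_FORMAT_TOTAL_TIME_RUNNING`, so a `read()` on the fd returns three `u64` values. The `CounterReading` type and its method names are made up for illustration, not anything from the crate:

```rust
/// Hypothetical helper type: one reading from a perf_event_open fd, assuming
/// read_format = PERF_FORMAT_TOTAL_TIME_ENABLED | PERF_FORMAT_TOTAL_TIME_RUNNING.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct CounterReading {
    pub value: u64,
    pub time_enabled: u64,
    pub time_running: u64,
}

impl CounterReading {
    /// Decode the 24-byte read() buffer: value, time_enabled, time_running,
    /// each a native-endian u64.
    pub fn from_bytes(buf: &[u8; 24]) -> Self {
        let field = |i: usize| {
            let mut bytes = [0u8; 8];
            bytes.copy_from_slice(&buf[i * 8..(i + 1) * 8]);
            u64::from_ne_bytes(bytes)
        };
        CounterReading {
            value: field(0),
            time_enabled: field(1),
            time_running: field(2),
        }
    }

    /// True if the counter was on the hardware for only part of the time it
    /// was enabled, i.e. something else was competing for counter slots.
    pub fn was_multiplexed(&self) -> bool {
        self.time_running < self.time_enabled
    }

    /// Estimate the full count by scaling with time_enabled / time_running.
    /// Returns None if the counter never ran, since then there is nothing
    /// to scale (this is one way to end up with a raw value of 0).
    pub fn scaled(&self) -> Option<u64> {
        if self.time_running == 0 {
            return None;
        }
        // Use u128 so the multiplication cannot overflow.
        let scaled =
            (self.value as u128) * (self.time_enabled as u128) / (self.time_running as u128);
        u64::try_from(scaled).ok()
    }
}
```

The scaled value is only an estimate, so for something like counting retired instructions it is probably better to treat `was_multiplexed()` as a reason to retry or flag the result than to trust the scaled number.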
@pervognsen I eventually discovered that setting _both_ a CPU core restriction _and_ a process ID restriction made schedaffinity no longer break the results from perf_event_open. So now I'm fiddling with APIs, then docs, then a release: https://codeberg.org/itamarst/bigo
(I guess also figure out if I want to fight a name squatter or pick another crate name.)
UPDATE: Nope, still broken. I was being fooled by my fallback logic, which switches to time-based measures.
@pkhuong I'm only counting retired CPU instructions. As mentioned, the reason I need to specify the CPU core is that otherwise using schedaffinity to limit to that core gives me 0 instead of the real numbers... At some point I should verify whether that also happens with a different high-level wrapper, I guess.
UPDATE: Nope, still broken when using both. I was being fooled by my fallback logic, which switches to time-based measures.
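One way to avoid getting fooled like this is to have the result say which kind of measurement it is, so a silent switch to timing shows up in tests and logs. A rough sketch, where `Measurement`, `choose_measurement`, and `is_fallback` are hypothetical names for illustration and not the crate's actual API:

```rust
use std::time::Duration;

/// Hypothetical result type: records which kind of measurement was used.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Measurement {
    /// Retired instruction count from the hardware counter.
    Instructions(u64),
    /// Time-based fallback, used when the counter gave nothing usable.
    WallTime(Duration),
}

impl Measurement {
    /// True if this is the time-based fallback and not a counter reading.
    pub fn is_fallback(&self) -> bool {
        matches!(self, Measurement::WallTime(_))
    }
}

/// Pick the counter reading when it looks usable, otherwise fall back to
/// elapsed time. Assumption: a missing or zero count means the counter
/// failed, which matches the "results of 0" symptom described above.
pub fn choose_measurement(instructions: Option<u64>, elapsed: Duration) -> Measurement {
    match instructions {
        Some(n) if n > 0 => Measurement::Instructions(n),
        _ => Measurement::WallTime(elapsed),
    }
}
```

A test for the counter path can then assert `!result.is_fallback()`, so a broken counter fails the test instead of quietly passing on the timing path.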