@dysfun right but couldn't you do the cheap thing when _capturing_ the backtrace and keep the expensive stuff for later? I guess not?
@fasterthanlime Symbol resolution should be the most expensive part, but that is already done lazily. Capturing a backtrace still means that, for each frame, you iterate through the .eh_frame_hdr of every dynamic library to find the right unwind info, then interpret the unwinding bytecode from the start of the function up to the current instruction pointer to find where each register is saved. There is a paper about compiling this bytecode to executable code, which is much faster.
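To see the capture/resolve split in practice, here is a minimal sketch that times the two steps separately using `std::backtrace::Backtrace`. The helper name `capture_then_render` is made up for illustration, and the timings will vary a lot by platform and build; the only point is that capturing and rendering (which is when symbols get resolved) are distinct steps.

```rust
use std::backtrace::{Backtrace, BacktraceStatus};
use std::time::{Duration, Instant};

/// Hypothetical helper: captures a backtrace, then renders it,
/// timing the two steps separately.
fn capture_then_render() -> (BacktraceStatus, Duration, Duration, String) {
    // Step 1: unwind the stack. `force_capture` ignores RUST_BACKTRACE.
    let t0 = Instant::now();
    let bt = Backtrace::force_capture();
    let capture = t0.elapsed();

    // Step 2: formatting the backtrace is what needs symbol names.
    let t1 = Instant::now();
    let rendered = bt.to_string();
    let render = t1.elapsed();

    (bt.status(), capture, render, rendered)
}

fn main() {
    let (status, capture, render, rendered) = capture_then_render();
    println!("status: {status:?}");
    println!("capture: {capture:?}, render: {render:?}");
    println!("rendered backtrace is {} bytes", rendered.len());
}
```

If the capture time alone is already too high for your use case, deferring symbol resolution will not help, which is the point being made above.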
@bjorn3 Is there a fast way to do this without debuginfo? What does `perf` do when you tell it not to use DWARF?
@fasterthanlime @bjorn3 frame pointers, which are disabled by default because, you know, back in the old days of 32-bit x86 you didn't have many registers, so you'd abuse the frame pointer register to get one more. That made your program a bit faster but also way harder to profile efficiently, making it effectively slower again. Nowadays the tradeoff doesn't really make sense, but the default still sticks around.
`-Cforce-frame-pointers=true` enables them
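For anyone wondering why frame pointers make unwinding cheap: each frame stores the caller's frame pointer right next to the return address, so walking the stack is just following a linked list. The sketch below is a toy model, not a real unwinder. It assumes a fake stack held in a slice where `stack[fp]` is the saved caller frame pointer (as an index) and `stack[fp + 1]` is the return address, with `0` marking the outermost frame. The function name `walk_frame_pointers` and this layout are made up for illustration.

```rust
/// Toy frame-pointer walk over a simulated stack (not real memory).
/// Assumed layout: stack[fp] = caller's fp, stack[fp + 1] = return address.
/// A saved fp of 0 marks the outermost frame.
fn walk_frame_pointers(stack: &[usize], mut fp: usize, max_frames: usize) -> Vec<usize> {
    let mut return_addrs = Vec::new();
    while fp != 0 && return_addrs.len() < max_frames {
        // The bounds check stands in for pointer validation in a real unwinder.
        let Some(frame) = stack.get(fp..).filter(|s| s.len() >= 2) else {
            break;
        };
        let (saved_fp, ret) = (frame[0], frame[1]);
        return_addrs.push(ret);
        // Caller frames sit at higher addresses; stop on a corrupt chain
        // that points backwards or at itself, to avoid looping forever.
        if saved_fp != 0 && saved_fp <= fp {
            break;
        }
        fp = saved_fp;
    }
    return_addrs
}

fn main() {
    // Index 0 is an unused sentinel slot; 0xAA and 0xBB stand in for locals.
    let stack = [0, 4, 0x1000, 0xAA, 7, 0x2000, 0xBB, 0, 0x3000];
    let addrs = walk_frame_pointers(&stack, 1, 16);
    println!("{addrs:x?}");
}
```

There is no unwind table lookup and no bytecode interpretation here, just two loads per frame, which is roughly why profilers like this mode.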
@nilstrieb @fasterthanlime @bjorn3 damn, I was 40 seconds too late