Last Week on My Mac: Why E cores make Apple silicon fast

Apple silicon architecture is designed to get background processes out of the way of our apps running in the foreground, by using the E cores.

The Eclectic Light Company

> The fact that an idle Mac has over 2,000 threads running in over 600 processes is good news

Not when one of those decides to wreak havoc - Spotlight indexing issues slowly eating away your disk space, iCloud sync spinning over and over and hanging any app that tries to read your Documents folder, Photos sync pegging all cores at 100%… it feels like things might be getting a little out of hand. How can anyone model or predict system behaviour with so many moving parts?

And if it paid off, that would almost be acceptable! But no. After Spotlight has indexed my /Applications folder, when I hit Command-Space and type "preview.app", it takes ~4 seconds on my M4 laptop to search the SQLite database and return that entry.

grumble

On pre-Tahoe macOS there is the “Applications” view (accessible e.g. from the Dock). Since the only thing I would use Spotlight for is searching for applications to launch, I changed the Cmd+Space keybind to open the Applications view instead. The search is instant.

Spotlight, aside from failing to find applications, also pollutes the search results with random files it found on the filesystem, shortcuts to search the web, and whatnot. Also, when I first started using a Mac, it repeatedly got into a state of not displaying any results whatsoever. Fixing that each time required running arcane commands in the terminal. That's something people associate with Linux, but ironically I think Linux now requires less of it than macOS.

But in Tahoe they removed the Applications view, so my solution is gone now.

All in all, with Apple degrading macOS in each release (crippling DTrace with SIP, Liquid Glass, performance monitoring that looks poor next to tools like perf on Linux or Intel VTune on Windows, Metal slowly becoming the only GPU programming option), I think I'm going to switch back to Linux.

macOS profiling tools completely blow Linux’s perf out of the water. It’s not even close.
You're saying that as if perf is the only profiling tool on Linux.
That's what they compared it to

https://blog.bugsiki.dev/posts/apple-pmu/

> I quickly found out that Apple Instruments doesn’t support fetching more than 10 counters, sometimes 8, and sometimes less. I was constantly getting errors like '<SOME_COUNTER>' conflicts with a previously added event. The maximum that I could get is 10 counters. So, the first takeaway was that there is a limit to how many counters I can fetch, and another is that counters are, in some way, incompatible with each other. Why and how they’re incompatible is a good question.

Also: https://hmijailblog.blogspot.com/2015/09/using-intels-perfor...

PMU Counters on Apple Silicon

Last time I wrote about profiling in Zig on Apple Silicon, I touched on PMU counter profiling. This time I decided to go further and create my own tool to fetch all available counters for Apple Silicon processors (M1, M2, and later).

Brief explanation of PMU counters: PMU (Performance Monitoring Unit) counters are hardware counters that track microarchitectural events inside the CPU, e.g. executed instructions, retired operations, branches, cache misses, and more.

It’s a condemnation of how bad perf is if it loses to that.
So far I have provided you with examples of how Instruments.app loses to perf. Perf does not have these limitations. You have not provided any examples in the reverse direction.

Both of your examples are actually very good at explaining my point. Both Instruments and perf largely expose the same information, since they use trace features in the hardware together with kernel support to profile code. Where they differ is the UI they provide. perf provides almost nothing; Instruments provides almost everything. This is because perf is basically a library and Instruments is a tool that you use to find performance problems.

Why do I like Instruments and think it is better? Because the people who designed it optimized it for solving real performance problems. There are a bunch of "templates" focused on issues ranging from "why is my thing so slow, what is it doing" to "why am I using too much memory" to "what network traffic is coming out of this app". These are real, specific problems, while perf will tell you things like "oh this instruction has a 12% cache miss rate because it got scheduled off the core 2ms ago". That's something Instruments can also tell you, but the point is that it's totally the wrong interface to present for performance work, since just handing people raw data is barely useful.

What people do instead with perf is keep around like 17 scripts, 12 of which were written by Brendan Gregg, to load the info into something that can be half useful to them. This is to save you time if you don't know how the Linux kernel works. Part of the reason flamegraphs and Perfetto are so popular is that everyone is so desperate to pull out the info and get something, anything, that's not the perf UI that they settle for what they can get. Instruments has exceptionally good UI for its tools, clearly designed by people who solve real performance problems. perf is a raw data dump from the kernel with some lipstick on it.
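For reference, the script pipeline being described usually looks something like this. This is a sketch, not a recipe: `./myapp` is a placeholder workload, and `stackcollapse-perf.pl` / `flamegraph.pl` are assumed to be on PATH from Brendan Gregg's FlameGraph repo:

```shell
# Hedged sketch of the classic perf + FlameGraph pipeline.
# Assumes Linux with perf installed and the FlameGraph scripts on PATH.
if ! command -v perf >/dev/null 2>&1; then
    echo "perf not installed; skipping"
else
    perf record -F 99 -g -- ./myapp      # sample call stacks at 99 Hz
    perf script > out.perf               # dump raw samples as text
    stackcollapse-perf.pl out.perf > out.folded  # one line per stack
    flamegraph.pl out.folded > flame.svg # render interactive SVG
fi
```

Three format conversions and two external Perl scripts just to get a readable picture of where time went, which is exactly the kind of glue work Instruments' templates do for you out of the box.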

Mind you, I trust the data that perf is dumping because the tool is rock-solid. Instruments is not like that: it's buggy, sometimes undocumented (to be fair, perf's docs aren't great either, but at least it's open source), slow, and crashes a lot. This majorly sucks. But even so, I solve a lot more problems clicking around the Instruments UI and cursing at it than I do with perf. And while Apple is slow to fix things, they are directionally moving towards cleaning up bugs and allowing data export, so the problems you brought up (which are very valid) are solved or on their way to being solved.