iai-callgrind gives you deterministic, instruction-count benchmarks. The catch: it needs Valgrind, and Valgrind has zero Apple Silicon support.

Here's how I run them locally on an M-series Mac in a native arm64 container - seccomp trap and all.

#rust #rustlang #performance #benchmarking #applesilicon #valgrind

https://martinhicks.dev/articles/running-iai-callgrind-on-apple-silicon

Running iai-callgrind on Apple Silicon

iai-callgrind needs Valgrind, and Valgrind doesn't run on Apple Silicon. Here's how to run aarch64 Callgrind benchmarks in a native arm64 Docker container.

Martin Hicks

For those following the valgrind stable releases: there is now valgrind 3.27.1, with some (regression) bug fixes.

https://valgrind.org

#valgrind

Valgrind Home

Official Home Page for valgrind, a suite of tools for debugging and profiling. Automatically detect memory management and threading bugs, and perform detailed profiling. The current stable version is valgrind-3.27.1.

#valgrind 3.27.0 released!

New options for helgrind (--show-events and --track-destroy), a new SSE4.1 instruction for x86, support for s390x z/Architecture features from the 15th edition, integrated binutils objdump for s390x disassembly, new Linux syscalls, address space manager tracking of Linux kernel lightweight guard pages, FreeBSD 16.0-CURRENT, macOS up to version 13 Ventura (Intel), and new client requests (VALGRIND_REPLACES_MALLOC, VALGRIND_GET_TOOLNAME).

https://gnu.wildebeest.org/blog/mjw/2026/04/20/anticipating-valgrind-3-27-0/

Mark J. Wielaard » Blog Archive » Anticipating Valgrind 3.27.0

Anticipating #valgrind 3.27.0.

There are two new client requests (macros defined in valgrind.h): https://valgrind.org/docs/manual/manual-core-adv.html#manual-core-adv.clientreq

- VALGRIND_REPLACES_MALLOC: Returns 1 if the tool replaces malloc (e.g., memcheck). Returns 0 if the tool does not replace malloc (e.g., cachegrind and callgrind) or if the executable is not running under Valgrind.

- VALGRIND_GET_TOOLNAME: Gets the running tool name as a string. Takes two arguments, an input buffer pointer and the length of that buffer.

A lot of the code in #valgrind 3.27.0 that supports macOS was previously maintained out of tree by Louis Brunner: https://github.com/LouisBrunner/valgrind-macos

GitHub - LouisBrunner/valgrind-macos: A valgrind mirror with latest macOS support

Anticipating #valgrind 3.27.0.

Paul Floyd @paulf had more commits than all others combined for this release. Paul takes care of the alternative toolchains and the Solaris/illumos, FreeBSD and Darwin/macOS ports.

Tested on Oracle Solaris 11.4, OpenIndiana Hipster and OmniOS.

FreeBSD works on both amd64 and arm64; support for 16.0-CURRENT has been added.

Supported macOS versions: 10.13 (bug fixes), 10.14, 10.15, 11.0 (Intel only), 12.0 (Intel only), 13.0 (Intel only, preliminary). No arm64 support yet.

Anticipating #valgrind 3.27.0.

Martin Cermak also added Valgrind address space manager support for tracking Linux kernel lightweight guard pages, created through madvise (MADV_GUARD_INSTALL).

These guard pages are very low overhead for the kernel because they aren't tracked as separate VMAs and don't show up in the process proc maps. But Valgrind does still need to know whether the addresses are accessible. A new --max-guard-pages option controls the memory Valgrind reserves for tracking these pages.
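
For anyone who wants to poke at these guard pages from user space, here is a minimal sketch in Python. It assumes that MADV_GUARD_INSTALL has the value 102 (as in the Linux uapi headers for kernels that support it) and that Python's mmap module may not export the constant by name. The function name try_install_guard_page is made up for this illustration. The page is only marked and never touched, because touching a guard page would crash the interpreter.

```python
import mmap

# Assumption: 102 is MADV_GUARD_INSTALL in the Linux uapi headers; fall back
# to that value when the mmap module does not export the name.
MADV_GUARD_INSTALL = getattr(mmap, "MADV_GUARD_INSTALL", 102)


def try_install_guard_page():
    """Try to mark one page of a private anonymous mapping as a guard page.

    Returns True if the kernel accepted the request, and False if this
    platform, Python version, or kernel does not support it.
    """
    if not hasattr(mmap.mmap, "madvise") or not hasattr(mmap, "MAP_PRIVATE"):
        return False  # no madvise() wrapper or no POSIX-style mmap flags here
    page = mmap.PAGESIZE
    # Four pages, private and anonymous; the second page becomes the guard.
    region = mmap.mmap(-1, 4 * page, flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS)
    try:
        region.madvise(MADV_GUARD_INSTALL, page, page)
        return True
    except OSError:
        # Typically EINVAL on kernels that predate lightweight guard pages.
        return False
    finally:
        region.close()


if __name__ == "__main__":
    print("guard page installed:", try_install_guard_page())
```

On a kernel without the feature this should simply report False, so it doubles as a quick check of whether the machine can exercise the new Valgrind tracking at all.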

Anticipating #valgrind 3.27.0.

Martin Cermak maintains the Linux Test Project (LTP) valgrind integration, which checks that our syscall wrappers work correctly. He also makes sure newer Linux syscalls are wrapped. Valgrind 3.27.0 adds support for file_getattr, file_setattr, lsm_get_self_attr, lsm_set_self_attr and lsm_list_modules, and corrects various syscall and ioctl corner cases.

Anticipating #valgrind 3.27.0.

Andreas also showed there are still meaningful optimizations to be made in how memcheck tracks undefinedness bits, as outlined in the original paper "Using Valgrind to detect undefined value errors with bit-precision": https://valgrind.org/docs/memcheck2005.pdf

His optimization of memcheck instrumenting a bitwise AND/OR with a constant is clever and simplifies the generated code: https://sourceware.org/cgit/valgrind/commit/?id=8514763a215f863bc52400991eba332df115eaea
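
To see why a constant operand simplifies things, here is a small Python model of the shadow-bit rule for bitwise AND/OR. It is my reading of the general rule described in the 2005 paper (a result bit is defined if both inputs are defined, or if one input alone already forces the result), not a transcription of the linked commit. The function names and the 8-bit width are made up for this illustration. A V bit of 1 means "undefined".

```python
MASK = 0xFF  # model 8-bit values; the width is arbitrary for this sketch


def vbits_and(a, va, b, vb):
    """V bits of (a & b): undefined unless both inputs are defined,
    or either input is a defined 0 at that bit position."""
    return (va | vb) & (a | va) & (b | vb) & MASK


def vbits_or(a, va, b, vb):
    """V bits of (a | b): undefined unless both inputs are defined,
    or either input is a defined 1 at that bit position."""
    return (va | vb) & ((~a & MASK) | va) & ((~b & MASK) | vb) & MASK


def vbits_and_const(va, c):
    """AND with a fully defined constant c: only bits where c is 1
    can carry undefinedness through."""
    return va & c & MASK


def vbits_or_const(va, c):
    """OR with a fully defined constant c: only bits where c is 0
    can carry undefinedness through."""
    return va & ~c & MASK


if __name__ == "__main__":
    # All four low bits of the variable operand are undefined; AND with 0b0101
    # leaves only bits 0 and 2 undefined.
    print(bin(vbits_and_const(0b1111, 0b0101)))
```

With the second operand's V bits fixed at 0, the general expressions collapse to a single AND of the first operand's V bits with the constant (or its complement), and the data value of the first operand drops out entirely.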