Zixian Cai

@caizixian
Programming languages, computer architecture, and performance analysis/optimization.
Homepage: https://www.zcai.org
GitHub: https://github.com/caizixian
@irene @thinkingfish thanks for the pointer.
@irene @thinkingfish is there an externally available recording?
@dan do you use any inventory management software?
@TaliaRinger nice! What roaster do you use? I have a Kaffelogic, but haven’t roasted for a while due to dissertation writing.
@ah what tool did you use to test the blocking efficacy? Is it the stats report from Pi-hole?
@ah Multi PRO. I think it’s the recommended one in the middle of the tradeoff spectrum.
@ah yep, I used it with dnsmasq and dnscrypt-proxy. I have a systemd timer that downloads new rule files every day. dnscrypt-proxy can even automatically load config file changes without a service restart. Overall it has been working great. The few false positives (well, to some extent not) are websites that are legitimate but run software with vulnerabilities, and were used in some phishing campaigns.

Lots of fun details in Huang's thesis: static vs dynamic prefetching (static is fine), computation of how much one could gain if cache-miss latency were eliminated, what the GC time would be if hardware prefetchers were disabled (20-80% slower; see attached figure); mutator time without prefetchers (sometimes it's better??!?); how to use "perf mem"; all good stuff!

I don't know what's in the water at ANU but they have been doing lots of great work at all levels recently

Claire Huang wrote an undergraduate honors thesis, supervised by @steveblackburn and @caizixian https://www.steveblackburn.org/pubs/theses/huang-2025.pdf

She uses sampling PEBS counters and data linear addressing (DLA) on Intel chips to attempt to understand the structure and attribution of load latencies in MMTk.

After identifying L1 misses in the trace loop as a significant overhead, she adds prefetching and reduces GC time by 10% or so across a range of benchmarks, and more on Zen4.

Well, I wasn't going to write this post until Friday, but I spent a bunch of time beating my head against Python versioning issues, and while waiting for things to happen I decided to finish this.

So: Allocation profiling with bpftrace!

https://www.mgaudet.ca/technical/2025/5/28/finding-hot-allocation-sites-with-bpftrace
