Mastodawn

P-Y Sep 20, 2024

Nothing like debugging heap graph traversal code at the airport with a cocktail & steak

#AndroidDev #dcnyc2024

P-Y Sep 20, 2024

I'm playing with rendering a heap dump as a treemap based on the heap's dominator tree. The issue is, most nodes end up having root as a dominator, making this not a very useful visualization. So I'm trying to figure out what's leading to most nodes being dominated by the forest root.

P-Y Sep 20, 2024

Boarding, jfk to sfo. I might figure this out and build a very cool thing . Most likely I'll fall asleep though. Wish me luck

P-Y Sep 21, 2024

Uuuugh so this was a bad idea.. because now I thought deeply about the existing code and realized I probably made a dumb mistake, and I think I found an edge case that shows I never really computed dominators right.

So, let's go back to basics, why do we care about dominators in a jvm heap anyway?

P-Y Sep 21, 2024

I learned about dominators from reading the Android Studio code, and how it computes retained size.

An easy def is that a dominator D of a node A is a node that if removed would make A unreachable.

Hence why if you sum up the size of all the nodes that a dominator dominates, you get the retained size.

The immediate dominator is the closest dominator to the node.

P-Y Sep 21, 2024

I had this algorithm in my head for progressively building up dominators: do a graph traversal, first time you find a node set it's parent as dominator, then every time you find that node again you go up the dominator chain you've already built, on both the previous parent side and the new parent side, until you find a common node, that's the new dominator.

I need to re read the AS code to see if it's doing something similar or not.

P-Y Sep 21, 2024

Anyway, the problem is, this doesn't account for running into the original parent node from yet another spot after you've done that. At which point you're not updating the dominated node because you're not exploring it, just its parent.

I wrote a failing unit test against the dominator data structure and confirmed it's broken.

So now I need to craft a fake heap dump and confirm I compute retained size wrong.

Then check Android Studio & Yourkit, see if they got different results.

P-Y Sep 21, 2024

For years now I've been meaning to go compare retained size between LeakCanary, Android Studio and Yourkit, and investigate any differences. Never got around to it, I had an intuition this could be a rabbit hole.

P-Y Sep 21, 2024

P-Y Sep 21, 2024

Yep, as I suspected, this is clearly not working when putting it all together.

P-Y Sep 22, 2024

So, I made a real life example, ran the code then dumped the heap. The results are interesting.

Note "AppWatcher.objectWatcher.watch(guide)" which tells LeakCanary that HitchHiker.guide is pretend leaking and to dump the heap.

P-Y Sep 22, 2024

Much to my surprise, LeakCanary is computing the correct result this time: "Everything" only retains but not "Universe", nor the answer "42". That's not what I was getting with my unit test, so I need to figure out what's different.

References are still 4 bytes here, but on an Android VM every instance has an additional shadow$_klass_ ref field and a shadow$_monitor_ int field.

Android Studio and LeakCanary seem in sync on retained size.

YourKit also has the same dominator logic (retained size for Everything is shallow self + shallow size of Universe), but interestingly it computes comes up with different totals. I suspect it's account for padding for byte alignment, where a 12 byte shallow size becomes 16 bytes? Not certain.

P-Y Sep 22, 2024

I double checked, and it turns out I picked a bad example. The string "42" was unfortunately already references by a GC root, because it's a static constant in... RoomMasterTable.defaultId 🤣

P-Y Sep 22, 2024

Ok, trying again with a dedicated Answer class this time, no more string interning

P-Y Sep 22, 2024

LeakCanary now does incorrectly show 3 objects leaks / 40 bytes, when it should be 2 objects (Everything + Universe) and 28 bytes.

YourKit is also correct in its dominator computation.

So, at this point, I've confirmed again & again that LeakCanary's approach is broken, and now I need to figure out a fix.. Unfortunately, a quick search shows standard dominators is expensive to compute (simple algo is quadratic!).

P-Y Sep 22, 2024

Here's where I first documented what I intended to do: https://github.com/square/leakcanary/issues/1344 and the PR: https://github.com/square/leakcanary/pull/1348

Before that, I was using the perflib impl: https://cs.android.com/android-studio/platform/tools/base/+/mirror-goog-studio-main:perflib/src/main/java/com/android/tools/perflib/heap/analysis/LinkEvalDominators.kt;l=36;drc=499fa43009666c0f0a686d8e21722dbea8b2ecf0

Looking through cs.android.com, I've also found a ahat impl: https://cs.android.com/android/platform/superproject/main/+/main:art/tools/ahat/src/main/com/android/ahat/dominators/Dominators.java;l=281;drc=f9ccccd1dd3e1a3e24f1f2238281555658090619

And some CPP impls: https://cs.android.com/android/platform/superproject/main/+/main:external/deqp-deps/SPIRV-Tools/source/opt/dominator_tree.cpp;l=23-38;drc=1a7f71afb42983b8b21bef656260964eb3852942

or

https://cs.android.com/android/platform/superproject/main/+/main:external/llvm/include/llvm/Support/GenericDomTree.h;l=438;drc=f9ccccd1dd3e1a3e24f1f2238281555658090619

Compute retained size · Issue #1344 · square/leakcanary

See http://www.cs.au.dk/~gerth/advising/thesis/henrik-knakkegaard-christensen.pdf

GitHub