Inspired by some recent work on build times, I was looking at Unity's "MegaCity Metro" sample in Unity 6. I got curious about its build times, in particular for incremental builds (fancy speak for "I pressed the build button repeatedly but only made minimal changes in between").

In my case, the "minimal changes" are actually "no changes at all." I just pressed the button repeatedly, because much of what shows up when making minimal changes is also present without any changes.

The total time for this process comes in at roughly 32s. There are two parts: part 1 is what DOTS (Unity's new tech stack) is doing, part 2 is what classic Unity does to build a player. The first part takes 25.5s, the second 6.5s.

I got curious about the first part, since I also see it in builds of another project. It spends almost the entirety of its time hashing some files, apparently. Hashing is not surprising for a build system; it is completely natural behavior. The time spent here, however, is!

My first thoughts were: can we make this faster? Run it in parallel? There are two operations here; are they maybe hashing the same files, so we could save half the work? -- But those are the wrong questions to ask.

The right question to ask is this: this is a warm build, there are no I/O stalls, 25s is an eternity, and this sample is still a comparatively small game. What tens of gigabytes of data are we even hashing here?

So I put a cache into the hashing so that every file we see is hashed exactly once. The cache is cleared at the very beginning of the build and then used for all build steps, and it keeps track of what's requested. It turns out we are hashing the same file 19,922 times, and not much else.
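The idea can be sketched like this (Python for brevity; the actual change lives in the C# Scriptable Build Pipeline, and all names here are made up for illustration): a per-build cache that memoizes file hashes and counts how often each hash is requested, which is exactly how the 19,922 duplicate requests showed up.

```python
import hashlib

class BuildHashCache:
    """Memoizes per-file content hashes for the duration of one build.

    Illustrative sketch only: the real pipeline hashes "object
    identifiers", not plain paths, but the caching idea is the same.
    """

    def __init__(self):
        self._hashes = {}
        self.requests = 0  # total hash requests, cached or not

    def clear(self):
        # Called once at the very start of a build, so that changes
        # made between builds are always picked up.
        self._hashes.clear()
        self.requests = 0

    def hash_file(self, path):
        self.requests += 1
        if path not in self._hashes:
            # Only read and hash the file on the first request.
            with open(path, "rb") as f:
                self._hashes[path] = hashlib.sha256(f.read()).hexdigest()
        return self._hashes[path]
```

Clearing at build start (rather than caching across builds) keeps the change safe: within one build the file contents cannot change under us, so memoizing is correct, and no invalidation logic is needed.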

Lo and behold, with the cache in place I had to zoom in to even find these steps: the time dropped from 25.5s to 800ms. The hashing itself is about 2ms; the rest is other overhead.

The lesson here is that you should always look at the data and ask "is this reasonable?"

Another insight is the by-now old news that @superluminal is awesome because it has mixed callstack support for Unity.

(I can also tell you how to make many other parts of Unity faster. This change here only requires adjustments to the Scriptable BuildPipeline package, but other changes require Unity source access. Get in touch if this is something you want to explore.)
@sschoener @superluminal why in the world is it hashing the same file 20k times :D
@dotstdy Because the interface it is using doesn't hash files but "object identifiers", and an object identifier contains a path. Multiple objects can live in the same file, but the code always hashes the entire file and then salts in additional per-object data. So there is no step that explicitly goes over all files and hashes each one once; instead it goes over all objects it is interested in and hashes each of them. Wrong level of abstraction, apparently.
com.unity.scriptablebuildpipeline/Editor/Shared/PrefabPackedIdentifiers.cs at fc1b2d2a0c897287604f8aa591f8d20786274041 · needle-mirror/com.unity.scriptablebuildpipeline

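A toy illustration of that wrong abstraction level (Python, with made-up names; the real code is the C# `PrefabPackedIdentifiers` linked above): hashing happens per *object*, and each object hash re-reads and re-hashes the whole file before mixing in per-object data, so a file with a thousand objects gets hashed a thousand times.

```python
import hashlib

reads = 0  # count how often the file is actually read

def read_file(path):
    # Stand-in for reading asset bytes from disk.
    global reads
    reads += 1
    return b"asset bytes for " + path.encode()

def hash_object(path, local_id):
    # Identifier-level API: hash the entire file, then salt in the
    # per-object id. Nothing deduplicates the file hash across objects.
    h = hashlib.sha256(read_file(path))
    h.update(local_id.to_bytes(8, "little"))
    return h.hexdigest()

# 1000 objects living in one file -> 1000 full-file reads and hashes.
hashes = [hash_object("Assets/City.prefab", i) for i in range(1000)]
print(reads)  # 1000
```

Each object still gets a distinct hash (the salt differs), which is why the duplication is easy to miss: the *outputs* all look unique even though the expensive input work is repeated.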
@sschoener Ugh, so much wrong here. We spent a lot of time making sure that an empty build does *nothing*, but it is so easy for packages etc. to break and regress that. We should have added tests for this to big projects (like MegaCity). If you are interested in digging deeper, opening Library/Bee/fullprofile.json in Chrome's tracing viewer should give you a lot of detail about what is going on.
@jonas Thank you! Yes, I think that incremental builds, *when set up correctly*, are incredibly fast in Unity. It's just so easy to break that in practice people don't get to experience it. In this case the culprit is DOTS, but the problem could just as well have been a user script.
@sschoener Yeah, it's too fragile in practice. In retrospect, my biggest regret is that we never touched the "data build" part (and some package or custom project code _always_ changes some data between builds on reasonably non-trivial projects). "Data builds" were always a case of "Don't worry about that, someone else already has plans for it" (a frequent reason things went unfixed at Unity for a very long time).
@jonas yeah, everybody has a build script, and not everybody who has a build script understands the responsibility they just signed up for, haha. Luckily a lot of the problems are easy to fix.
@sschoener yes, these issues are individually easy to fix. The problem is making sure they don't resurface. IIRC, I tried to add a "repeat build does nothing test" to the package validation suite, but it was complicated for reasons I don't remember :)
@jonas I get that people want to automate these things (if Factorio has taught us one thing, then it is that "Yes, you do want to automate that"), but generally speaking having someone just profile things once before release is not a bad solution until the automation exists :)
@sschoener but relying on "someone should test this once before release" is not realistic if you have hundreds of packages and you want to make sure that none of them ever regress and break the incremental build (especially since the regression is often not noticeable in small test-project builds).
@jonas right now neither automation nor manual testing exists, apparently. This is not a random package going rogue; it's DOTS. Sure, it doesn't scale, but it doesn't have to in order to fix a pretty large chunk of issues. Letting the perfect be the enemy of the good is a pretty common mistake at Unity.