Here is my best attempt to articulate why I believe all dependencies, including compiler toolchains, belong in version control.

https://www.forrestthewoods.com/blog/dependencies-belong-in-version-control/

Dependencies Belong in Version Control

Why dependencies should be checked into version control.

@forrestthewoods what about using SVN instead of git? If everyone is constantly syncing against the same server, what's the advantage of being distributed?
@morten_skaaning i don't think the vcs actually matters that much. the main reason not to ship the toolchain is that they tend to be enormous and a huge pita to untangle from their installers.
@dotstdy then put the installers in svn.
@morten_skaaning putting the installers in helps a bit, but then you still have a really poor experience, and people have to manually run the installer and you need to correctly use the installed version over whatever else is on the workstation. it's a struggle. :/
@dotstdy @morten_skaaning maybe we need to start asking toolchain vendors to package their toolchain such that it's trivial to commit to a VCS
@JamesWidman @morten_skaaning yeah it's something I think you'll find is often begged for from the game dev community at least. With mixed success. :')
@morten_skaaning TBH I haven’t used SVN in 20 years. I’m not sure what its modern feature set looks like. It’s never really impressed me though.
@forrestthewoods the feature is that you can download exactly the snapshot you need of the global state and it handles large binary files...
@morten_skaaning is there any reason to use SVN over Perforce? Other than cost?

@forrestthewoods @morten_skaaning SVN was designed as a direct replacement for CVS, and included such novelties as tracking the state of the entire repository tree, rather than per-file checkouts, or branches that you could actually understand how to use.

Anyone who has used CVS is not laughing at the circus.

@wolfpld @forrestthewoods I can't tell whether this is irony or sarcasm?
@morten_skaaning @forrestthewoods No sarcasm this time. The majority of open source projects were using CVS at the time, and switching to SVN was a huge quality-of-life improvement.
@morten_skaaning @forrestthewoods Well, I guess my sarcastic tone that you read was directed at how shitty CVS was.

@wolfpld @morten_skaaning @forrestthewoods SVN was an incredible breath of modern fresh air over CVS.

I still prefer SVN over git in a lot of ways even though git obviously has some upsides as well. It just feels like the world collectively gave up on it and moved on. (At least the public world, I know a lot of closed source stuff still runs it successfully.)

@forrestthewoods There’s probably an interesting build on this essay about standardization, autoconf, and versioned package dependencies as a continuum of this solution (all the way out to Docker as you note but also starting with Linux package managers many years ago).
Related, object lesson I ran into recently: https://bugzilla.mozilla.org/show_bug.cgi?id=1866602
1866602 - Attempting to update version control tools results in KeyError: 'remote_hidden'


@mtothevizzah Python callstacks like that trigger my PTSD. Urghhh.
@forrestthewoods Yeah, pretty opaque. Nice to have a very responsive dev, though.
@forrestthewoods so I guess it is Perforce then, as it excels at this particular use-case.

@forrestthewoods @mtothevizzah as you pointed out, your dream VCS exists (more than once?). Google monorepo + VFS + VCS is exactly like that. Everything checked in from LLVM through libraries to your project, everything passing tests, everything building, everything updated to the same version (so you don't end up with 10 libraries using 10 versions of Eigen or whatever).
I absolutely loved it. :)

Oh, and I obviously also strongly agree with all your post's points. This is the only sane way.

@BartWronski @mtothevizzah yeah Meta is the same way. But with Buck and a custom Mercurial. It really is The Way.

If only the tooling were more accessible to the outside world!

@forrestthewoods @mtothevizzah Google open sources bits and pieces (Bazel). I think they would like to open source more, but a huge chunk of "magic" of how well it works is in tight integration with their infrastructure, build servers, IT, DevOps - a huge machinery with no magic single component but requiring all to run in tandem. Any single component would be just "ok" and nerfed for a standalone open source release...

@BartWronski @mtothevizzah there’s definitely a “more than the sum of its parts” aspect. Bazel/Buck both work in the wild. But the monorepo is missing which is effectively what I’m arguing should exist in the public sphere.

Build cache, build farm, CI systems etc are another layer. But I think there’d be value in “Perforce but sucks less and is broadly available”

@forrestthewoods FWIW the "sparse" stuff in git is I think workable these days. Here's an example: https://gist.github.com/jsimmons/65acdcab3a91217717a93f66fa848c06
putting all the llvm versions into a git repo and using sparse magic to avoid copying it all down
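As a rough sketch of the approach the gist describes (the repo layout, version names, and file contents below are invented for the demo, not taken from the gist), a blobless clone plus cone-mode sparse checkout materializes only the toolchain version you ask for:

```shell
#!/bin/sh
set -e
tmp=$(mktemp -d)
cd "$tmp"

# A toy "toolchains" repo with two LLVM versions checked in.
git init -q -b main toolchains
cd toolchains
mkdir -p llvm-16/bin llvm-17/bin
echo fake-clang-16 > llvm-16/bin/clang
echo fake-clang-17 > llvm-17/bin/clang
git add .
git -c user.name=demo -c user.email=demo@example.com commit -qm "add toolchains"
cd ..

# Blobless, sparse clone: only llvm-17's files get checked out;
# llvm-16's blobs are never even downloaded.
git clone -q --no-local --no-checkout --filter=blob:none toolchains work
cd work
git sparse-checkout init --cone
git sparse-checkout set llvm-17
git checkout -q main
ls
```

On a real server-hosted repo the `--filter=blob:none` part matters most: workstations skip downloading blobs for every toolchain version they never check out.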

@dotstdy yeah with the right sparse, shallow, and maybe LFS(?) you can make it work. I’m so sad Microsoft abandoned GitVFS in favor of sparse. Not sure why.

Sparse is still inferior to VFS imho. VFS “just works” (modulo bugs). Sparse requires maintenance and can be fragile. But it can be done! FWIW at work we used to have sparse lists and VFS is soooo much better.

@forrestthewoods They abandoned it because they wanted to support mac apparently.

@dotstdy makes sense. But seems like a bad reason!

The world clearly needs a decent cross-platform VFS library. Unfortunately it needs to wrap FUSE for Linux, ProjFS for Windows, and macFUSE for Mac.

And possibly something else in the future when Apple inevitably breaks things with more lockdown.

@forrestthewoods Apple has already removed those APIs, that's why they gave up. The blog post is like "we had the basics up on mac and then they removed the APIs we need" :') best platform ever

@dotstdy I couldn’t tell if they were actually gone or not.

FWIW we have a VFS at work that works across the board. Public GitHub leads me to believe macOS uses NFSv3. Three parallel implementations for three platforms, woo!

@forrestthewoods I think it's deprecated and there's no workable replacement. So I guess they read that as a nightmare they didn't want to deal with whenever a random macos update completely breaks everything.
@dotstdy we have something that definitely works. I assume it’s using NFSv3 but don’t know. Also not sure why, if macOS can use that, we don’t use it on Linux and Windows as well? Way outside my expertise…

@forrestthewoods @dotstdy Isn't Scalar supposed to be their replacement for GitVFS:

https://github.com/microsoft/scalar

GitHub - microsoft/scalar: Scalar: A set of tools and extensions for Git to allow very large monorepos to run on Git without a virtualization layer

@idbrii @dotstdy yeah. They abandoned VFS for sparse clones. Having dealt with monorepo sparse clones I think virtual is much better

@forrestthewoods Agreed! We haven’t taken the step to also include the compiler, but otherwise every single dependency is in P4. A build on a clean machine is just 1) sync 2) install VS 3) build.

Regarding a virtual filesystem based VCS, it’s something that comes up now and then, but it feels to me like something that would only work in an environment where *everything* is aware of the VFS-ness. E.g. stuff like a “find in files” in your editor would end up pulling the whole source tree.

@forrestthewoods yes! And, packaging up things like “visual studio” etc is not even that massive. IIRC at Unity I had that in like 200MB without even trying hard. This is for an old version, but the idea is the same https://gist.github.com/aras-p/e5df1b7a3374b99ae31f053b14403d92
Packaging up Visual Studio & Windows 10 SDK for in-repository usage

@aras @forrestthewoods I have been asking VS/Xbox for the ability to do clean vendorized installs of the debuggers for a while. It's one of the really messy bits with no good solution.
@forrestthewoods @aras hmm .. that works? I would've expected visual studio to set all kinds of archaic registry stuff that would make it not function if not "properly" installed
interesting
@logicalerror @forrestthewoods that’s what we did at Unity, so worked at least there :)

@aras @logicalerror it requires setting up your build tool to point at the committed toolchains. And C++ has a million different build systems.

At the end of the day some process calls cl.exe and link.exe. There is surprisingly little magic.

I’m increasingly a fan of nuking environment variables to prevent random PATH bullshit from leaking in. :)
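A minimal sketch of that environment-nuking idea (the tool name and paths below are invented for the demo): launch the checked-in tool through `env -i` so only an explicitly constructed environment survives. The fake "compiler" here just reports the PATH it actually sees:

```shell
#!/bin/sh
set -e
tmp=$(mktemp -d)
mkdir -p "$tmp/toolchain/bin"

# A stand-in for a committed cl.exe/clang: it echoes its environment's PATH.
cat > "$tmp/toolchain/bin/cc-stub" <<'EOF'
#!/bin/sh
echo "PATH=$PATH"
EOF
chmod +x "$tmp/toolchain/bin/cc-stub"

# env -i starts from an empty environment; only what we pass survives,
# so the committed toolchain is the only thing on PATH and nothing from
# the workstation leaks in.
out=$(env -i PATH="$tmp/toolchain/bin" "$tmp/toolchain/bin/cc-stub")
echo "$out"
```

The same trick works from any build system that lets you control the child process environment (most do); the point is to construct it from scratch rather than inherit.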

@forrestthewoods @aras you mean on the server side? I never noticed visual studio in the unity repo? (certainly didn't look for it though)
do you mean it would've compiled even if I hadn't installed visual studio? 🤔
@logicalerror @forrestthewoods yes (since 2018 or so). That’s about the first thing “how to setup build environment” docs used to say. And even then, I had to explain to people over and over again that no, changing your local VS install version will not affect the build one iota.
@logicalerror @forrestthewoods …and yes VS toolchain and Win10 SDK were in the repo itself, under PlatformDependent somewhere
@forrestthewoods @aras huh how about that.. left unity 2 years ago and still learning new things about its build system 😂
@aras @forrestthewoods but pretty cool that's possible .. but I'm guessing due to licensing this can't be done in an open source project
was xcode part of the build system too?
@logicalerror @forrestthewoods yes. Not Xcode the IDE, but the compiler + platform SDK bits for Mac and iOS, yes. Same for Android etc.
@forrestthewoods @aras nice, android I expected since it's open source, didn't expect the commercial compilers though
I guess this partially explains unity never open sourcing its source 😂
@logicalerror @forrestthewoods that’s NOT why. All these were already properly stripped for source code customers, and there the build system does fallback to system-installed toolchains. Open source (if it ever happened) would have used exact same approach.
@aras @forrestthewoods I didn't know about the stripping, but I did say "partially", didn't claim this is THE reason. Obviously it would've been possible to work around this, but it would take effort & time. Didn't know that was already done
@aras @logicalerror @forrestthewoods a similar thing was done for Wolfenstein development. I always thought it was a great idea. Especially made updating the project and toolchains trivial to roll out to the entire team.

@aras @forrestthewoods I miss that sooooo much. Stevedore, the thing downloading all the toolchains, was one of the best parts of our build system at Unity!

Getting a lil bit closer now again by using https://prefix.dev/ at work to define all the tooling dependencies. But obviously it can't redistribute stuff like Visual Studio (but pins the version for that one at least :))

@wumpf @aras @forrestthewoods the whole build system at unity was amazing. we were spoiled there
@logicalerror @wumpf @aras @forrestthewoods it really was the smoothest build system I've ever used for that size of projects. I still have the scoop bucket with the right version of perl to make building in a git bash shell on windows seamless.
@shana @wumpf @aras @forrestthewoods somebody should really do a public talk about the unity build system
@forrestthewoods I agree with you. From a gamedev perspective, there is room for a new, better VCS with all the pros of Git, P4, and more. Committing the compiler and standard library may be too much IMHO, but 3rd party libraries definitely yes. I'm mostly thinking about game assets. Having to download hundreds of GB of them to every dev machine and keeping them in sync with code is a big failure and room for improvement that no one has time to fix because everyone is crunching to ship their game.

@forrestthewoods
I largely agree. I go so far as to check in generated code, because it both saves other people time, and it reduces opportunities for skew (and unexpected diffs show where there's a problem!)

You'll need the system image too with all the system libraries. At some point, it's better to archive the entire disk. Use a well defined VM (or container) for building, so it's reproducible forever!

@StompyRobot I’m also a fan of committing generated code. Makes things easier to search. Can cause merge pain though.

I think it’s potentially worth having a full system image. But if you commit full toolchains it is largely unnecessary!

@forrestthewoods
You'll want the runtime libraries the toolchain uses.
Libc versions, .net runtime versions, msvc dll versions, ...
Sadly, most of this isn't statically linked.

Btw, if you get a conflict in the generated code, just blow away upstream, because you have the current source version and generation is deterministic. (Right?)

(If this doesn't work, you have some determinism problems to fix!)

@StompyRobot correct. When I say commit the full toolchain I really do mean everything in the Visual Studio/LLVM/etc directory! All of it!

Yeah re-generating code should definitely work. But it can make VCS cranky when it fails to cleanly merge/rebase and it wants to run merge tools or inject conflict markers.

Ideally VCS knows what files are generated and how to regenerate them. But that’s a separate story!
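The regenerate-and-diff idea from this exchange can be sketched like this (the generator below is a trivial stand-in; any deterministic codegen step works the same way):

```shell
#!/bin/sh
set -e
tmp=$(mktemp -d)
cd "$tmp"

# Stand-in for a deterministic code generator.
cat > gen.sh <<'EOF'
#!/bin/sh
printf 'int table[3] = {1, 2, 3};\n'
EOF
chmod +x gen.sh

./gen.sh > generated.c       # the committed copy
./gen.sh > generated.c.new   # regenerated, e.g. after a merge conflict

# diff exits nonzero (failing the build or pre-commit hook) if generation
# isn't deterministic -- exactly the property that makes "blow away the
# conflicted file and regenerate" a safe conflict resolution.
diff -q generated.c generated.c.new && echo deterministic
```

Running this as a CI or pre-commit check catches non-determinism (timestamps, map iteration order, absolute paths) before it turns into merge pain.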

@StompyRobot @forrestthewoods
re:
> You'll need the system image too with all the system libraries.

note, this will not work on macOS on Big Sur or later, because:

> New in macOS Big Sur 11 beta, the system ships with a built-in dynamic linker cache of all system-provided libraries. As part of this change, copies of dynamic libraries are no longer present on the filesystem.
[...]

@StompyRobot @forrestthewoods
[cont'd]
> Code that attempts to check for dynamic library presence by looking for a file at a path or enumerating a directory will fail. Instead, check for library presence by attempting to dlopen() the path, which will correctly check for the library in the cache. (62986286)

source:
https://web.archive.org/web/20200902041407/https://developer.apple.com/documentation/macos-release-notes/macos-big-sur-11-beta-release-notes/
