software is amazing... my otherwise-idle 128 GB machine just started swapping and then invoked the OOM killer to nuke a Debug + code coverage LLVM build
@regehr how much parallelism?
@whitequark @regehr build system which gracefully handles memory pressure any day now!
@dotstdy @regehr is this even a build system's job? I can see both answers being appropriate, but if "yes" then you'll have to deal with a lot of pain making a cross-platform one
@whitequark @dotstdy @regehr ninja has a concept of "pools" for exactly this sort of thing: limiting the number of concurrent linker invocations: https://ninja-build.org/manual.html#ref_pool
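For reference, a pool is declared directly in `build.ninja`; a minimal sketch (the pool name and `depth = 2` are arbitrary examples, not defaults):

```ninja
# Cap concurrent link jobs at 2, independent of the global -j level.
pool link_pool
  depth = 2

rule link
  command = clang++ -o $out $in
  pool = link_pool
```

Any rule assigned to the pool competes for its slots, so memory-hungry steps can be throttled while cheap compiles still run at full parallelism.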

@tedmielczarek @whitequark @regehr yeah but that's very much the "problem is the user's own" approach, since the value for every concurrency limit needs to change depending on the available hardware resources.
@dotstdy @tedmielczarek @regehr but we don't have a way to predict how much memory a linker invocation will consume, do we? in which case that's sort of what we're stuck with, since any solution must necessarily be reactive
@whitequark @tedmielczarek @regehr I think tuning it the other way (i.e. this invocation requires 2gb) and having the job server track a budget, delaying launch (without introducing deadlocks) would allow you to scale to arbitrary hardware so long as the annotations are somewhat accurate. But yeah it's fraught with complexity. See also some kind of back-off recovery for extreme pressure since the work is ideally possible to re-launch.
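A minimal sketch of that budget-tracking idea (all names here are mine, not from any real jobserver): each job declares an estimated peak, launch is delayed until the estimate fits, and to avoid the deadlock case an oversized job is still admitted when nothing else is running.

```python
import threading

class MemoryBudget:
    """Hypothetical jobserver-side memory budget: jobs declare an
    estimated peak (e.g. "this link needs 2 GB") and block in
    acquire() until the estimate fits the remaining budget."""

    def __init__(self, total_bytes):
        self.total = total_bytes
        self.used = 0
        self.running = 0
        self.cond = threading.Condition()

    def acquire(self, estimate):
        with self.cond:
            # Wait until the estimate fits -- but if the machine is
            # idle (running == 0), admit the job anyway, so a single
            # job bigger than the whole budget can't wait forever.
            while self.used + estimate > self.total and self.running > 0:
                self.cond.wait()
            self.used += estimate
            self.running += 1

    def release(self, estimate):
        with self.cond:
            self.used -= estimate
            self.running -= 1
            self.cond.notify_all()
```

The scheme degrades exactly as the thread predicts: it's only as good as the annotations, and it does nothing reactive once a job blows past its estimate.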
@dotstdy @tedmielczarek @regehr I don't think there's any way to make the annotations somewhat accurate

@dotstdy @tedmielczarek @regehr I think what would really help is if a build system had the knobs to suspend a process if it consumes too much memory. ld eats more than 2 GB? pause it, let other processes finish, then let it restart

I don't know if that's feasible
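The pause-and-resume part is feasible with plain signals, at least on Linux; a minimal sketch (function names are mine, thresholds are placeholders): poll the job's RSS via `/proc`, SIGSTOP it past the limit, SIGCONT once `MemAvailable` recovers.

```python
import os
import signal
import time

def rss_bytes(pid):
    # /proc/<pid>/statm: second field is resident pages (Linux-specific)
    with open(f"/proc/{pid}/statm") as f:
        resident_pages = int(f.read().split()[1])
    return resident_pages * os.sysconf("SC_PAGE_SIZE")

def mem_available_bytes():
    # MemAvailable: the kernel's estimate of allocatable memory, in kB
    with open("/proc/meminfo") as f:
        for line in f:
            if line.startswith("MemAvailable:"):
                return int(line.split()[1]) * 1024
    return 0

def supervise(pid, rss_limit, resume_threshold, poll_secs=0.5):
    """Pause `pid` with SIGSTOP while its RSS exceeds rss_limit, and
    resume it with SIGCONT once MemAvailable climbs back above
    resume_threshold (i.e. other jobs have finished and freed memory)."""
    paused = False
    while True:
        try:
            over = rss_bytes(pid) > rss_limit
        except (FileNotFoundError, ProcessLookupError):
            return  # the job exited
        if not paused and over:
            os.kill(pid, signal.SIGSTOP)
            paused = True
        elif paused and mem_available_bytes() > resume_threshold:
            os.kill(pid, signal.SIGCONT)
            paused = False
        time.sleep(poll_secs)
```

The catch is the one raised below: SIGSTOP only stops the process from allocating more, its pages stay resident until the kernel reclaims them under pressure, so pausing alone doesn't give the memory back.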

@whitequark @dotstdy @tedmielczarek it doesn't seem hard (deadlocks notwithstanding)
@regehr @dotstdy @tedmielczarek how would you do it? SIGSTOP?
@whitequark @dotstdy @tedmielczarek any convenient mechanism would be fine for pausing the process, and then I guess a new syscall to push the entire process out of RAM? I mean, most unixes have been able to do that, but I don't know that Linux currently can...
@regehr @whitequark @dotstdy @tedmielczarek now this description is starting to feel like using CRIU to checkpoint a gcc process to a file in userspace
@mokomull @regehr @dotstdy @tedmielczarek I did think of maybe using CRIU here, but it feels like it'll cause hard-to-debug issues in slightly unusual configurations (what if a compiler plugin uses a network to talk to a database? this exists), not to mention being completely non-portable
@whitequark @regehr @dotstdy @tedmielczarek I wrote off portability when working-around-the-OOM-killer's-behavior was mentioned ... and every time I've had this same frustration it's been on shockingly modern Linux kernels at $last_job.
@mokomull @regehr @dotstdy @tedmielczarek Windows and Darwin definitely have the APIs to account for a process' memory use and to kill it! seems like it could be portable to those platforms at least (and frankly almost any I can think of)