software is amazing... my otherwise-idle 128 GB machine just started swapping and then invoked the OOM killer to nuke a Debug + code coverage LLVM build
@regehr how much parallelism?
@whitequark @regehr build system which gracefully handles memory pressure any day now!
@dotstdy @regehr is this even a build system's job? I can see both answers being appropriate, but if "yes" then you'll have to deal with a lot of pain making a cross-platform one
@whitequark @dotstdy @regehr ninja has a concept of "pools" for exactly this sort of thing: limiting the number of concurrent linker invocations: https://ninja-build.org/manual.html#ref_pool
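For reference, a pool is declared directly in `build.ninja`; a minimal sketch (the pool name and `depth = 2` are arbitrary examples, not defaults):

```ninja
# Cap concurrent link jobs at 2, independent of the global -j level.
pool link_pool
  depth = 2

rule link
  command = clang++ -o $out $in
  pool = link_pool
```

Any rule assigned to the pool competes for its slots, so memory-hungry steps can be throttled while cheap compiles still run at full parallelism.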

@tedmielczarek @whitequark @regehr yeah but that's very much the "problem is the user's own" approach, since the value for every concurrency limit needs to change depending on the available hardware resources.
@dotstdy @tedmielczarek @regehr but we don't have a way to predict how much memory a linker invocation will consume, do we? in which case that's sort of what we're stuck with, since any solution must necessarily be reactive
@whitequark @tedmielczarek @regehr I think tuning it the other way (i.e. this invocation requires 2gb) and having the job server track a budget, delaying launch (without introducing deadlocks) would allow you to scale to arbitrary hardware so long as the annotations are somewhat accurate. But yeah it's fraught with complexity. See also some kind of back-off recovery for extreme pressure since the work is ideally possible to re-launch.
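A minimal sketch of that budget-tracking idea (all names here are mine, not from any real jobserver): each job declares an estimated peak, launch is delayed until the estimate fits, and to avoid the deadlock case an oversized job is still admitted when nothing else is running.

```python
import threading

class MemoryBudget:
    """Hypothetical jobserver-side memory budget: jobs declare an
    estimated peak (e.g. "this link needs 2 GB") and block in
    acquire() until the estimate fits the remaining budget."""

    def __init__(self, total_bytes):
        self.total = total_bytes
        self.used = 0
        self.running = 0
        self.cond = threading.Condition()

    def acquire(self, estimate):
        with self.cond:
            # Wait until the estimate fits -- but if the machine is
            # idle (running == 0), admit the job anyway, so a single
            # job bigger than the whole budget can't wait forever.
            while self.used + estimate > self.total and self.running > 0:
                self.cond.wait()
            self.used += estimate
            self.running += 1

    def release(self, estimate):
        with self.cond:
            self.used -= estimate
            self.running -= 1
            self.cond.notify_all()
```

The scheme degrades exactly as the thread predicts: it's only as good as the annotations, and it does nothing reactive once a job blows past its estimate.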
@dotstdy @tedmielczarek @regehr I don't think there's any way to make the annotations somewhat accurate

@dotstdy @tedmielczarek @regehr I think what would really help is if a build system had the knobs to suspend a process if it consumes too much memory. ld eats more than 2 GB? pause it, let other processes finish, then let it restart

I don't know if that's feasible
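The pause-and-resume part is feasible with plain signals, at least on Linux; a minimal sketch (function names are mine, thresholds are placeholders): poll the job's RSS via `/proc`, SIGSTOP it past the limit, SIGCONT once `MemAvailable` recovers.

```python
import os
import signal
import time

def rss_bytes(pid):
    # /proc/<pid>/statm: second field is resident pages (Linux-specific)
    with open(f"/proc/{pid}/statm") as f:
        resident_pages = int(f.read().split()[1])
    return resident_pages * os.sysconf("SC_PAGE_SIZE")

def mem_available_bytes():
    # MemAvailable: the kernel's estimate of allocatable memory, in kB
    with open("/proc/meminfo") as f:
        for line in f:
            if line.startswith("MemAvailable:"):
                return int(line.split()[1]) * 1024
    return 0

def supervise(pid, rss_limit, resume_threshold, poll_secs=0.5):
    """Pause `pid` with SIGSTOP while its RSS exceeds rss_limit, and
    resume it with SIGCONT once MemAvailable climbs back above
    resume_threshold (i.e. other jobs have finished and freed memory)."""
    paused = False
    while True:
        try:
            over = rss_bytes(pid) > rss_limit
        except (FileNotFoundError, ProcessLookupError):
            return  # the job exited
        if not paused and over:
            os.kill(pid, signal.SIGSTOP)
            paused = True
        elif paused and mem_available_bytes() > resume_threshold:
            os.kill(pid, signal.SIGCONT)
            paused = False
        time.sleep(poll_secs)
```

The catch is the one raised below: SIGSTOP only stops the process from allocating more, its pages stay resident until the kernel reclaims them under pressure, so pausing alone doesn't give the memory back.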

@whitequark @dotstdy @tedmielczarek it doesn't seem hard (deadlocks notwithstanding)
@regehr @dotstdy @tedmielczarek how would you do it? SIGSTOP?
@whitequark @dotstdy @tedmielczarek any convenient mechanism would be fine for pausing the process, and then I guess a new syscall to push the entire process out of RAM? I mean, most unixes have been able to do that, but I don't know that Linux currently can...
@regehr @whitequark @dotstdy @tedmielczarek now this description is starting to feel like using CRIU to checkpoint a gcc process to a file in userspace
@mokomull @regehr @dotstdy @tedmielczarek I did think of maybe using CRIU here, but it feels like it'll cause hard-to-debug issues in slightly unusual configurations (what if a compiler plugin uses a network to talk to a database? this exists), not to mention being completely non-portable
@whitequark @regehr @dotstdy @tedmielczarek I wrote off portability when working-around-the-OOM-killer's-behavior was mentioned ... and every time I've had this same frustration it's been on shockingly modern Linux kernels at $last_job.
@mokomull @regehr @dotstdy @tedmielczarek Windows and Darwin definitely have the APIs to account for a process' memory use and to kill it! seems like it could be portable to those platforms at least (and frankly almost any I can think of)