I'm building webkit-gtk right now. It's one of these messy packages where a few source files need a lot of memory to compile, and ninja can randomly order jobs so that all of them suddenly start compiling simultaneously. So to keep things going smoothly without OOM-ing, I've been dynamically adjusting the available job count via steve the #jobserver.
While doing that, I noticed that ninja wasn't taking new jobs immediately after I increased the job count. So I started debugging steve, and couldn't find anything wrong with it. Finally, I looked into ninja and realized how lazy their code is.
So, there are two main approaches to acquiring job tokens. Either you do blocking reads, and therefore wait for a token to become available, or you use polling to get notified when one becomes available. Ninja instead does non-blocking reads, and if there are no more tokens available… it waits until one of its own jobs finishes.
This roughly means that as other processes release tokens, ninja won't take them until one of its own jobs finishes. And if ninja didn't manage to acquire any job tokens to begin with, it is just running a single process via the implicit slot, and that process finishing provides it with the only chance to acquire additional tokens. So realistically speaking, as long as there are other build jobs running in parallel, ninja is going to need to be incredibly lucky to ever get a job token, since all other processes will grab the available tokens immediately.
This isn't something that steve can fix.
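To make the difference concrete, here's a rough Python sketch of the non-blocking and polling styles. It's not ninja's or steve's code: a plain pipe stands in for the jobserver, and make_pool, try_acquire and wait_for_token are names I made up for illustration.

```python
import os
import select


def make_pool(tokens):
    """Create a pipe preloaded with `tokens` one-byte job tokens (stand-in jobserver)."""
    rfd, wfd = os.pipe()
    os.set_blocking(rfd, False)  # reads never hang; waiting is done via poll()
    os.write(wfd, b"+" * tokens)
    return rfd, wfd


def try_acquire(rfd):
    """Non-blocking read: return a token, or None if the pool is empty right now."""
    try:
        return os.read(rfd, 1) or None
    except BlockingIOError:
        return None


def wait_for_token(rfd, timeout):
    """Polling client: sleep until a token shows up, then try to read it.

    The timeout (in seconds) applies to each wait, not to the whole call.
    """
    poller = select.poll()
    poller.register(rfd, select.POLLIN)
    while True:
        events = poller.poll(int(timeout * 1000))
        if not any(event & select.POLLIN for _, event in events):
            return None  # timed out, or the write end was closed
        token = try_acquire(rfd)
        if token is not None:
            return token
        # Another client won the race for the token; go back to waiting.
```

A client that only ever calls try_acquire() when one of its own jobs finishes never notices tokens released by others in the meantime. A client sitting in wait_for_token() is woken up as soon as one appears.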
0 days since it turned out that GNU #make does not respect the #jobserver protocol it designed itself in yet another way. Or to put it otherwise, I've wasted my whole morning implementing something that cannot work because it didn't occur to me that GNU make people only set rules without caring to actually follow them.
#Gentoo #jobserver revealed another problem with steve in particular, and (I believe) the jobserver protocol in general: blocking clients are prioritized over polling clients.
The problem is simple: when handling blocking reads, steve can issue a job token immediately. When handling a poll, it merely indicates that a token is available, and the client must issue another read request to get it. So if tokens are scarce and there are both blocking and polling clients running, the former are likely to be taking all the incoming tokens.
My idea for working around this is to implement temporary reservations. If a client polls for a token, we reserve one for it. The reserved token can afterwards only be read by the same client. This way, both blocking and polling clients get a token: the former get it immediately, the latter get it reserved for them. And if there are no tokens available, both get into a single FIFO queue, for a poor man's round-robin (steve also throttles all reads to one token at a time).
However, polls technically don't guarantee that the client will eventually read the token, so we need to handle reservation expirations as well.
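Here's a toy in-memory model of that idea in Python. To be clear, this is only a sketch of the logic described above, not steve's implementation: the TokenPool class, its method names and the explicit `now` timestamps are all made up for illustration.

```python
from collections import deque


class TokenPool:
    """Toy model of a jobserver with per-client poll reservations."""

    def __init__(self, tokens, ttl):
        self.free = tokens        # tokens nobody holds or has reserved
        self.ttl = ttl            # how long a reservation stays valid
        self.reserved = {}        # client -> reservation expiry time
        self.waiting = deque()    # single FIFO of (client, kind) requests

    def read(self, client):
        """Return True if the client gets a token, False if it must wait."""
        if self.reserved.pop(client, None) is not None:
            return True           # consume the client's own reservation
        if self.free and not self.waiting:
            self.free -= 1
            return True
        self._enqueue(client, "read")
        return False

    def poll(self, client, now):
        """Return True if a token is ready (and reserved) for the client."""
        if client in self.reserved:
            return True
        if self.free and not self.waiting:
            self.free -= 1
            self.reserved[client] = now + self.ttl
            return True
        self._enqueue(client, "poll")
        return False

    def release(self, now):
        """A client returned a token; hand it to the next waiter, if any."""
        self.free += 1
        return self._dispatch(now)

    def expire(self, now):
        """Reclaim reservations whose owners never came back to read."""
        for client, expiry in list(self.reserved.items()):
            if expiry <= now:
                del self.reserved[client]
                self.free += 1
        return self._dispatch(now)

    def _enqueue(self, client, kind):
        if all(waiter != client for waiter, _ in self.waiting):
            self.waiting.append((client, kind))

    def _dispatch(self, now):
        """Serve queued requests in FIFO order; return clients issued a token."""
        issued = []
        while self.free and self.waiting:
            client, kind = self.waiting.popleft()
            self.free -= 1
            if kind == "read":
                issued.append(client)                   # blocking read completes
            else:
                self.reserved[client] = now + self.ttl  # poller gets a reservation
        return issued
```

With a single token, a poll() from one client reserves it, a read() from another client has to queue, and the queued reader only gets the token once it is released or the reservation expires.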
Worked on some more #Gentoo global #jobserver goodies today.
Firstly, Portage jobserver support patch: https://github.com/gentoo/portage/pull/1528. It's not yet perfect, but it seems to work. If enabled via FEATURES=jobserver-token, it causes emerge to acquire a job token for every job it starts, therefore:
1. emerge won't start more jobs than the jobserver specifies. For example, `steve -j12` will cap emerge at 12 jobs, even if you specify --jobs=16. However, a lower cap for emerge will be respected, so you can leave a bunch of job tokens free for other processes to use.
2. emerge will share the job pool with other jobserver-enabled software such as GNU make or Ninja. When emerge is using up all 12 job tokens, all builders inside it will run serially. When it's running 6 jobs, all builders will be able to start up to 6 extra jobs, etc. Ideally, no more than 12 active jobs will be running at a time across the system.
3. Jobserver-enabled software will now correctly respect the jobserver's job count. When running with an externally provided jobserver, these tools assumed that the jobserver already acquired one token for them, and therefore ran one extra job without a token; say, 13 jobs for a single emerge process. Now the token is actually acquired, so the counts will match.
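The arithmetic from points 1 and 2, as a tiny Python sketch. The function names are made up for illustration and this is just the bookkeeping, not Portage code.

```python
def effective_emerge_jobs(pool_size, emerge_jobs):
    """emerge never runs more jobs than the pool has tokens (point 1)."""
    return min(pool_size, emerge_jobs)


def extra_builder_jobs(pool_size, running_emerge_jobs):
    """Tokens left over for make/ninja running inside emerge jobs (point 2)."""
    return max(pool_size - running_emerge_jobs, 0)


# With `steve -j12`:
print(effective_emerge_jobs(12, 16))  # --jobs=16 is capped by the pool
print(effective_emerge_jobs(12, 4))   # a lower cap is respected
print(extra_builder_jobs(12, 12))     # all tokens used: builders run serially
print(extra_builder_jobs(12, 6))      # 6 tokens left for builders to share
```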
Secondly, I've packaged dev-python/pytest-jobserver with a bunch of fixes to make it behave correctly and support Gentoo jobserver. We're not integrating it with the eclasses yet, but you can install it and force-enable it via EPYTEST_FLAGS="-p jobserver". With it, pytest-xdist will acquire job tokens while running tests, and therefore parallel #PyTest jobs will also be counted towards total job count.
Again, it's not a perfect solution, but it works reasonably. The plugin still starts -n jobs as specified by the arguments, but it acquires job tokens prior to executing every test, therefore delaying actual testing until tokens are available. It doesn't seem to cause noticeable overhead either.
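The "hold a token for the duration of each test" pattern looks roughly like this in Python. This is a minimal sketch with a pipe standing in for the jobserver, not the plugin's actual code; job_token and tokens_available are hypothetical names.

```python
import os
import select
from contextlib import contextmanager


@contextmanager
def job_token(rfd, wfd):
    """Hold one job token for the duration of the `with` block."""
    token = os.read(rfd, 1)   # blocks until a token is available
    try:
        yield token
    finally:
        os.write(wfd, token)  # always hand the token back, even on failure


def tokens_available(rfd):
    """Check, without waiting, whether at least one token can be read."""
    readable, _, _ = select.select([rfd], [], [], 0)
    return bool(readable)
```

Each worker would wrap a single test run in `with job_token(rfd, wfd):`, so a worker that can't get a token just sits idle until one is returned.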
New on my #Gentoo blog: One #jobserver to rule them all
"""
A common problem with running Gentoo builds is concurrency. Many packages include extensive build steps that are either fully serial, or cannot fully utilize the available CPU threads throughout. This problem becomes less pronounced when building multiple packages in parallel, but then we are risking overscheduling for packages that do take advantage of parallel builds.
Fortunately, there are a few tools at our disposal that can improve the situation. Most recently, they were joined by two experimental system-wide jobservers: #guildmaster and #steve. In this post, I’d like to provide the background on them, and discuss the problems they are facing.
"""
https://blogs.gentoo.org/mgorny/2025/11/30/one-jobserver-to-rule-them-all/
#Steve the #Jobserver has undergone a major rewrite over the last week. It's now implemented using CUSE, the #FUSE API for character devices. It is using pidfd to track processes acquiring job tokens, and automatically reclaims them if processes die without returning them, preventing dead processes from effectively locking the system jobserver.
The code's still a bit ugly — it's a C-changed-midway-to-C++, with libevent for event loops and (still) FUSE's ugly argument parsing.
If someone wants to play with it, the live ebuild is available in #Gentoo as dev-build/steve.