I finally took some time to hack up CLONE_AUTOREAP with pidfds. It's a much-requested feature (e.g., see [1] and [2]).
Current state of the branch is in [3].

[1]: https://github.com/uapi-group/kernel-features/issues/45
[2]: https://github.com/kata-containers/kata-containers/issues/12489
[3]: https://github.com/brauner/linux/commits/work.pidfs.autoreap/

#kernel #linux #pidfd

And btw, #pidfd file handles can be generated and opened unprivileged. The only restriction that's enforced is that the task the #pidfd refers to must be resolvable in the caller's pid namespace.

https://web.git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs.git/log/?h=work.pidfs.fhandle

That's pretty neat because you can open up a #pidfd solely based on the inode number (they're unique and race-free).


I have a series ready that allows generating and opening autonomous #pidfd file handles.

With regular file handles, open_by_handle_at() requires the caller to pass a file descriptor identifying the filesystem.

For in-kernel filesystems that are singletons (a single superblock for the lifetime of the system) that's just not necessary at all.

Capable filesystems can opt in to generating autonomous/self-sufficient/self-descriptive file handles.

@mhoye thanks for the update @bluca!

So this has apparently been implemented since #DBus 1.15.8 ( https://gitlab.freedesktop.org/dbus/dbus/-/merge_requests/398 ) and should theoretically make this functionality possible in #systemd

@bluca do I understand correctly that this doesn't need any changes to the request-initiating application and only requires a kernel with #PIDFD support?

@pid_eins @sjuvonen
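For context on the question above: SO_PEERPIDFD is queried by the receiving end of a Unix socket, so the lookup itself happens on the receiver's side. A rough sketch, assuming a kernel that supports the option (as far as I know, 6.5 or newer). The numeric value 77 is an assumption taken from the generic Linux socket header and differs on some architectures; `peer_pidfd` is a made-up helper name, and this is not how dbus-daemon itself is written.

```python
import os
import socket

# Assumption: 77 is SO_PEERPIDFD in asm-generic/socket.h; some
# architectures use a different number. Prefer the stdlib constant if present.
SO_PEERPIDFD = getattr(socket, "SO_PEERPIDFD", 77)


def peer_pidfd(sock):
    """Return a pidfd for the peer of an AF_UNIX socket, or None if unsupported."""
    try:
        return sock.getsockopt(socket.SOL_SOCKET, SO_PEERPIDFD)
    except OSError:
        return None


# A socketpair stands in for a client/daemon connection here.
left, right = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
fd = peer_pidfd(left)
if fd is None:
    print("SO_PEERPIDFD not available on this system")
else:
    print("got a pidfd for the peer:", fd)
    os.close(fd)
left.close()
right.close()
```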


@mhoye the problem with garbage collection is that it has to be deterministic, not heuristic, to work properly. Otherwise random processes might get killed, e.g. due to PID reuse or fuzzy process identification.

The only proper way I see right now would be based on a #PIDFD that the initiating process would have to pass to #DBus, which could then be used to track the process state.

@pid_eins @sjuvonen

@mhoye ah, but I see... If anything in your session activates a system service via #DBus, that service will still be around after your user session is gone...
In "systemctl show ..." I can't identify any attribute indicating that #systemd stores the original activator of the service, and even if it did, this might be quite hard to do properly without some kind of #PIDFD-like mechanism.

Any ideas @pid_eins?

@sjuvonen

"Adhemerval Zanella (5):
linux: Add posix_spawnattr_{get,set}cgroup_np (BZ 26731)
posix: Add pidfd_spawn and pidfd_spawnp (BZ 30349)
posix: Add pidfd_fork (BZ 26371)
posix: Add PIDFDFORK_NOSIGCHLD for pidfd_fork
linux: Add pidfd_getpid"

YES YES YES, TO ALL OF IT.
https://sourceware.org/pipermail/libc-alpha/2023-July/149741.html

#glibc #linux #kernel #pidfd #cgroups
