PSA: So you want to be a good kid, and understand that UNIX file system paths are kinda wonky, and not stable references to inodes. So you drink the Linux Kool-Aid, and become a heavy O_PATH user: you pin all inodes via fds, validate them before you open them, use openat() heavily to get from one inode to its descendants, and are extra careful everywhere. And you think you saw the light.

But then one day, you realize, you *actually* have been doing it all wrong.

Because here's the thing: if you go from one dir for which you have an O_PATH fd to another dir contained in it, via openat(fd, name, O_PATH|O_CLOEXEC), then this actually doesn't trigger autofs mounts. So if "name" actually is an autofs mount, you end up pinning an inode on the autofs, and not one on the file system it's supposed to be overmounted with. So if you then use openat(…, O_PATH|O_CLOEXEC) to go further down, it will *always* fail, because the autofs doesn't contain any further…

…inodes this could possibly open for you.

Yikes.
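
To make the failure mode concrete, here is a minimal sketch of the kind of component-by-component walk described above. The helper name `descend_naive()` and its signature are made up for illustration, and it assumes Linux with O_PATH support; it is not how any particular project implements this.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* for O_PATH */
#endif
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: walk from start_fd through names[0..n_names-1], one
 * component at a time, pinning each inode with O_PATH. Returns a new O_PATH
 * fd, or -1 with errno set. start_fd stays open and owned by the caller.
 *
 * The catch: if names[i] is an untriggered autofs mount point, openat() with
 * O_PATH pins the autofs inode itself, so the next step has nothing to find. */
int descend_naive(int start_fd, const char *const names[], size_t n_names) {
    assert(names != NULL || n_names == 0);

    /* Work on a private duplicate so we can close it as we go. */
    int fd = fcntl(start_fd, F_DUPFD_CLOEXEC, 3);
    if (fd < 0)
        return -1;

    for (size_t i = 0; i < n_names; i++) {
        int next = openat(fd, names[i], O_PATH | O_NOFOLLOW | O_CLOEXEC);
        int saved_errno = errno; /* close() may clobber errno */
        close(fd);
        if (next < 0) {
            errno = saved_errno;
            return -1;
        }
        fd = next;
    }
    return fd;
}
```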

But here I am, to help you, if you find yourself in this situation. As it turns out there actually *is* a system call you can use here that does what is needed, and (almost) no one knows about it: open_tree().

(See: https://github.com/brauner/man-pages-md/blob/main/open_tree.md)

If used without the OPEN_TREE_CLONE flag (and that part is crucial) it is equivalent to openat() with O_PATH, except in one regard: it will trigger automounts unless you tell it not to (via AT_NO_AUTOMOUNT), and it will thus get…

…you an O_PATH fd to the overmounting fs, not the autofs one.

Yay!
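
Roughly, a single "go one step down" operation could then look like the sketch below. The helper name `pin_child()` is made up, the call goes through raw syscall() because older libcs ship no open_tree() wrapper, and the fallback to plain openat() on ENOSYS/EPERM (old kernels, restrictive seccomp sandboxes) is my own assumption about what you'd want, not something mandated anywhere.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* for O_PATH and syscall() */
#endif
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Older libc headers lack this; the kernel UAPI defines it as O_CLOEXEC. */
#ifndef OPEN_TREE_CLOEXEC
#define OPEN_TREE_CLOEXEC O_CLOEXEC
#endif

/* Hypothetical helper: pin `name` below dir_fd, triggering automounts.
 * Returns an O_PATH-style fd, or -1 with errno set. */
int pin_child(int dir_fd, const char *name) {
    assert(name != NULL);

#ifdef SYS_open_tree
    /* Deliberately no OPEN_TREE_CLONE (we want a plain O_PATH-like fd, not a
     * detached mount) and no AT_NO_AUTOMOUNT (we want autofs to trigger). */
    int fd = (int) syscall(SYS_open_tree, dir_fd, name,
                           AT_SYMLINK_NOFOLLOW | OPEN_TREE_CLOEXEC);
    if (fd >= 0 || (errno != ENOSYS && errno != EPERM))
        return fd;
    /* Assumption: syscall unavailable or filtered, degrade to openat(). */
#endif

    /* Fallback: pins the inode, but does NOT trigger automounts. */
    return openat(dir_fd, name, O_PATH | O_NOFOLLOW | O_CLOEXEC);
}
```

Usage would be something like `int fd = pin_child(parent_fd, "name");`, dropped in wherever the openat(fd, name, O_PATH|O_CLOEXEC) call used to be.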

TLDR: there's a good chance many (most?) of the openat(O_PATH) calls in the wild are kinda wrong, and everyone should have used open_tree() instead...

Lesson learned:

https://github.com/systemd/systemd/pull/38048

(And of course, @brauner thanks for enlightening me about this fix)

@pid_eins @brauner
this could be an extremely wrong / nonsensical question because idk how much of this works internally, but does it also apply to using openat() *without* O_PATH?
@refi64 @brauner no, if you do not use O_PATH nothing of the above matters. But you kinda have to use O_PATH if you want to operate securely, because it allows you to pin an inode first, figure out its details and then act on it. If you open it directly you might end up talking to a driver because you opened a device node accidentally, or all kinds of other weird stuff.
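
For illustration, the "pin first, figure out its details, then act" dance could be sketched like this. The helper name `open_regular_checked()` and the choice of ENOTSUP for rejected inode types are made up for this example, and the reopen step assumes /proc is mounted.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* for O_PATH */
#endif
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: open `name` below dir_fd for reading, but only if it
 * really is a regular file. Returns a readable fd, or -1 with errno set. */
int open_regular_checked(int dir_fd, const char *name) {
    assert(name != NULL);

    /* 1. Pin the inode. O_PATH never talks to a device driver. */
    int pin = openat(dir_fd, name, O_PATH | O_NOFOLLOW | O_CLOEXEC);
    if (pin < 0)
        return -1;

    /* 2. Figure out what we actually pinned. */
    struct stat st;
    if (fstat(pin, &st) < 0) {
        int saved_errno = errno;
        close(pin);
        errno = saved_errno;
        return -1;
    }
    if (!S_ISREG(st.st_mode)) { /* device node, dir, symlink, fifo, ... */
        close(pin);
        errno = ENOTSUP; /* arbitrary choice for this sketch */
        return -1;
    }

    /* 3. Act on exactly that inode by reopening the pinned fd via /proc. */
    char proc_path[64];
    snprintf(proc_path, sizeof proc_path, "/proc/self/fd/%d", pin);
    int fd = open(proc_path, O_RDONLY | O_CLOEXEC);
    int saved_errno = errno;
    close(pin);
    errno = saved_errno;
    return fd;
}
```

Because the check and the real open both go through the same pinned fd, nobody can swap the file for a device node in between.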