Trying to answer a question no sane person ever had to ask: How Hard Is It To Open a File?

This one is about the great POSIX idea of a filesystem, and why you could not play your games or open Chrome for a few days.

https://blog.sebastianwick.net/posts/how-hard-is-it-to-open-a-file/

How Hard Is It To Open a File?

It’s a question I had to ask myself multiple times over the last few months. Depending on the context the answer can be:

- very simple, just call the standard library function
- extremely hard, don’t trust anything

If you are an app developer, you’re lucky and it’s almost always the first answer. If you develop something with a security boundary which involves files in any way, the correct answer is very likely the second one.
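To make the "don't trust anything" case concrete, here is a minimal Python sketch (Linux/POSIX only) of opening a file below a trusted directory fd one component at a time, refusing symlinks and `..`. The function name `open_beneath` is made up for this illustration and is not taken from the blog post. On Linux 5.6 and later, `openat2()` with `RESOLVE_BENEATH` can enforce a similar rule in the kernel.

```python
import os

def open_beneath(root_fd: int, relpath: str) -> int:
    """Open relpath below root_fd, one component at a time.

    Hypothetical helper: rejects absolute paths and '..', and refuses to
    follow a symlink in any component.
    """
    parts = [p for p in relpath.split("/") if p not in ("", ".")]
    if relpath.startswith("/") or ".." in parts or not parts:
        raise ValueError(f"refusing path: {relpath!r}")

    dir_fd = os.dup(root_fd)  # private copy, so the caller's fd is never closed here
    try:
        for name in parts[:-1]:
            # O_NOFOLLOW: the open fails if this component is a symlink.
            # O_DIRECTORY: the open fails if it is not a directory.
            next_fd = os.open(
                name,
                os.O_RDONLY | os.O_DIRECTORY | os.O_NOFOLLOW,
                dir_fd=dir_fd,
            )
            os.close(dir_fd)
            dir_fd = next_fd
        # Final component: opened relative to the last directory fd, no symlinks.
        return os.open(parts[-1], os.O_RDONLY | os.O_NOFOLLOW, dir_fd=dir_fd)
    finally:
        os.close(dir_fd)
```

Because every step is resolved relative to an fd that is already held, renaming or replacing a parent directory mid-walk cannot redirect the lookup to somewhere outside the directories already opened.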

swick's blog
@swick great post! The sad part is that even after all those years chaseat() in systemd still gets relevant changes every few months that non-trivially rearrange my PoV on file system interfacing. I.e. right now we are working on reinventing chaseat() around a new InodeRef structure that combines an fd *and* a path into one (together with some other fields) so that we don't lose the ability to write useful log messages (you really want a path for that) but can do the actual ops via fds...
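As a rough illustration of the fd-plus-path idea (not systemd's actual InodeRef, whose fields are not described in this thread), a Python sketch could look like this. All field and method names here are invented for the example.

```python
import os
from dataclasses import dataclass

@dataclass
class InodeRef:
    """Hypothetical pairing of an open fd (for operations) with a path (for logs)."""
    fd: int
    path: str

    @classmethod
    def open(cls, path: str) -> "InodeRef":
        return cls(fd=os.open(path, os.O_RDONLY), path=path)

    def size(self) -> int:
        # The operation goes through the fd, so it hits the inode that was
        # opened, even if the name was renamed or replaced since.
        return os.fstat(self.fd).st_size

    def describe(self) -> str:
        # The path is only used for human-readable output and may be stale.
        return f"{self.path} (fd {self.fd})"

    def close(self) -> None:
        os.close(self.fd)
```

The point of the split is that the path is never used to reach the file again after the initial open; it only exists so that log messages stay readable.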

@swick it's amazing how broken and insecure posix fs apis have been from day one and still are (there is no posix way to convert an O_PATH fd to a real one for regular files, for example)...
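For context, the usual non-POSIX workaround on Linux is to reopen the O_PATH fd through procfs. A minimal Python sketch, assuming Linux with `/proc` mounted; the helper name `reopen_readable` is made up for this example:

```python
import os

def reopen_readable(path_fd: int) -> int:
    """Turn an O_PATH fd into a readable fd by reopening it via /proc (Linux-only).

    No O_NOFOLLOW here: the /proc/self/fd entry is itself a (magic) symlink
    that has to be followed for the reopen to work.
    """
    return os.open(f"/proc/self/fd/{path_fd}", os.O_RDONLY)
```

A caller would first pin the file with `os.open(path, os.O_PATH)`, run its checks against that fd, and only then call `reopen_readable()` to get an fd it can actually read from.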

And really sad that even modern programming language standard libraries always focus on the posix fs api, mostly ignoring the new stuff, rather than focussing on the newer stuff and then trying to retrofit the old stuff to work like the new stuff wherever possible.

@swick and it's really shameful that supposedly security minded programming language communities (rust...) don't grok that, and happily work with the guaranteed insecure traditional posix stuff instead of doing things better. I am pretty sure posix fs shenanigans are a bigger attack surface these days to gain privs than frickin memory unsafety, and focussing solely on memory stuff ignoring the fs stuff is just bad security engineering.

@pid_eins @swick This!!!
It's not just essential for security, but also dramatically increases robustness of the resulting application - I ran into the latter just last week (debugging data loss for a non-security-relevant app).

Even though I know about all of this, I still use the POSIX-like interfaces a lot because they're default in many languages and readily available and "it's not security relevant anyway". Until it is. Better defaults would be so nice!

I wouldn't say "happily".

Rust standard library folks are very well aware of the ideal of doing fd-based operations whenever possible, and we'd love to. Linux is doing great work on adding ways to do everything one might want to do using fds. However, we can't force people to run on exclusively modern Linux, as opposed to old Linux or other OSes. And it's much more challenging to design *portable* interfaces around fds without accepting capability limitations or lowest-common-denominator.

We could probably make an extremely capable interface, if we stuck most of it in `std::os::linux`, and had some of it fail if run on older Linux.

@josh @swick @pid_eins As far as I know, the cap-std crate (https://github.com/bytecodealliance/cap-std) does what is explained in the blog post, using an API that is close to the standard library. We use it a lot in bootc and related projects.

@josh @swick yeah, i don't buy into that race to the bottom thinking. Designing stuff with the shittiest model in mind instead of the best is just awful engineering. Always figure out where you want to be, i.e. go for the summit, and then fill in the gaps/degrade gracefully where you have to on worse systems. But that's really not what rust is doing there. It's letting itself be held hostage by the worst system, and lets that heavily leak into its APIs...
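One way to picture "aim for the best model, degrade where you must" is runtime capability detection. Python's `os.supports_dir_fd` reports which functions accept a `dir_fd` on the current platform, so a sketch (with a made-up helper name, `open_in_dir`) could prefer the fd-relative open and fall back to a joined path only where that is unavailable:

```python
import os

def open_in_dir(dir_path: str, name: str) -> int:
    """Prefer an fd-relative open; fall back to a joined path if unsupported."""
    if os.open in os.supports_dir_fd:
        # Preferred: pin the directory first, then open relative to it
        # without following a symlink in the final component.
        dir_fd = os.open(dir_path, os.O_RDONLY | os.O_DIRECTORY)
        try:
            return os.open(name, os.O_RDONLY | os.O_NOFOLLOW, dir_fd=dir_fd)
        finally:
            os.close(dir_fd)
    # Weaker fallback: one path-based open, with no protection against the
    # directory being swapped while the path is resolved.
    return os.open(os.path.join(dir_path, name), os.O_RDONLY)
```

The caller sees one interface either way; only the strength of the guarantee differs by platform, which is something the documentation of such an API would have to spell out.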

@pid_eins Rust's std::fs and std::io were an MVP for the 1.0 release, and unfortunately mostly stayed like that.

Back then Rust still had to prove that the memory safety and data-race guarantees could even work, and that it could be a stable and backwards-compatible language, while having even bigger unfinished gaps in the language and libstd.

Now std::fs sticks out as the weak point, but Rust couldn't take moonshots in every aspect all at once. It would never ship.

@kornel @pid_eins I wonder whether it's still possible to fix that in rust's standard library. One quick and likely silly idea is to change std::path::Path and std::path::PathBuf to also optionally include a file descriptor and prepopulate that one on the first file system related syscall. That might resolve most of those TOCTOU issues. It likely will break other stuff horribly, so for now that's just a silly idea without much research behind it.
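The TOCTOU problem mentioned here comes from checking a path and then using the same path in a second, separate lookup. A minimal Python sketch of the racy pattern next to the fd-based one (function names are invented for the example; O_NOFOLLOW and O_NONBLOCK assume a POSIX system):

```python
import os
import stat

def read_regular_file_racy(path: str) -> bytes:
    # Check and use are two separate path lookups: the name can be swapped
    # for something else between os.stat() and open().
    if not stat.S_ISREG(os.stat(path).st_mode):
        raise ValueError("not a regular file")
    with open(path, "rb") as f:
        return f.read()

def read_regular_file(path: str) -> bytes:
    # One path lookup; the check and the read both go through the same fd.
    # O_NONBLOCK keeps the open from hanging if the path is a FIFO.
    fd = os.open(path, os.O_RDONLY | os.O_NOFOLLOW | os.O_NONBLOCK)
    try:
        if not stat.S_ISREG(os.fstat(fd).st_mode):
            raise ValueError("not a regular file")
        with os.fdopen(fd, "rb", closefd=False) as f:
            return f.read()
    finally:
        os.close(fd)
```

Both functions behave the same when nothing interferes; the difference only shows when another process replaces the file between the check and the read, which the second version is not exposed to.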

@weiznich Unfortunately the `Path` API has a no-alloc conversion from `&str`, so there's no room for a new field. Adding fd there would require stuffing it into the path, which seems hacky and could backfire.

The Path isn't good anyway. Can't even store \0 for C APIs nor UCS-2 for Windows. Useless for browser FileSystem APIs (WASM).

@kornel I think there are plans to change the various Range types in a similarly incompatible way with the next edition. It might be possible to do something similar here as well; at least that would give "us" the ability to change the internal layout in an incompatible way. Yes, this would cause a lot of churn, as old migrated code would then use something like std::edition_2024::path::Path instead of std::path::Path and likely get a deprecation warning for the old items at some point, but it doesn't seem to be impossible to change it. Especially given that the various std::fs functions all take generic arguments that require AsRef<Path> (not sure if we would get away with using the new path there, although I expect it should be fine as long as all currently existing variants are accepted there).
Those are obviously all unfinished quick ideas that could be tried. Any of that would need a bunch of research first: whether it is wanted at all, and then what exactly is feasible and what is not.

@pid_eins FWIW, I have a change for glnx_chase which adds a strategic callback for every path segment that gets resolved, so we can, for example, build the path without adding more complexity to glnx_chase itself. I also hinted at a new cross-platform API in GLib/Gio where we would want to have an opaque handle, which for posix would contain the fd, but it could also contain the path as well. So yeah, I agree that it's the right design, but I think it's something we should do on top of glnx_chase.