Wrote an implementation of thread-local storage in #no_std #rust, for Linux x86_64, with options to a) use libc's thread-local storage, b) leverage libc's thread-local storage that something in the same process has set up, or c) work without native thread-local storage at all.
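To make option c) a bit more concrete, here's a rough sketch of one way storage could be keyed on the kernel thread ID instead of native TLS. This is not the actual implementation; `TidKeyedStorage`, `Slot`, and `MAX_THREADS` are names made up for illustration, and the thread ID is passed in by the caller (in a real `no_std` setting it would presumably come from a direct `gettid` syscall). It never frees slots, so it only works for a bounded number of threads.

```rust
use core::sync::atomic::{AtomicI32, AtomicUsize, Ordering};

/// Hypothetical fixed capacity; no allocator is needed.
const MAX_THREADS: usize = 64;
/// Marker for an unclaimed slot. Linux thread IDs are always positive.
const FREE: i32 = 0;

/// One per-thread cell: the owning thread ID plus that thread's value.
pub struct Slot {
    owner: AtomicI32,
    value: AtomicUsize,
}

/// A lock-free, fixed-size table of per-thread values keyed by thread ID.
pub struct TidKeyedStorage {
    slots: [Slot; MAX_THREADS],
}

impl TidKeyedStorage {
    pub const fn new() -> Self {
        const EMPTY: Slot = Slot {
            owner: AtomicI32::new(FREE),
            value: AtomicUsize::new(0),
        };
        Self { slots: [EMPTY; MAX_THREADS] }
    }

    /// Returns the calling thread's cell, claiming a free slot on first use.
    /// Assumes `tid > 0`. Returns `None` if every slot belongs to another thread.
    pub fn get(&self, tid: i32) -> Option<&AtomicUsize> {
        debug_assert!(tid > 0);
        let start = tid as usize % MAX_THREADS;
        // Linear probing, starting from a position derived from the thread ID.
        for i in 0..MAX_THREADS {
            let slot = &self.slots[(start + i) % MAX_THREADS];
            match slot
                .owner
                .compare_exchange(FREE, tid, Ordering::AcqRel, Ordering::Acquire)
            {
                // We just claimed a free slot.
                Ok(_) => return Some(&slot.value),
                // We already own this slot from an earlier call.
                Err(current) if current == tid => return Some(&slot.value),
                // Owned by some other thread; keep probing.
                Err(_) => continue,
            }
        }
        None
    }
}
```

Since there are no locks and no allocation, a lookup like this can't deadlock against itself if a signal handler interrupts another lookup on the same thread, which is the property that matters here.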

Toying with the idea of publishing it as a crate but, uh, I guess these requirements are pretty niche.

The context is that this is part of the Shadow simulator. We have an LD_PRELOAD'd shim library that needs a little bit of thread-local storage. So far we've usually gotten away with just using "regular" libc thread-local storage (`static __thread`), but once in a while it bites us in the posterior. It doesn't help that most of this code runs in the context of a seccomp signal handler, so it ought to be async-signal-safe.

So we're gradually rewriting it in #no_std #rust. We can make direct syscalls, which our seccomp filter allows so that handling a trapped syscall doesn't trap again and recurse, and that's about it.
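For anyone who hasn't done this before, a direct syscall with no libc involved looks roughly like the sketch below. It's only an illustration for Linux x86_64, not code from the shim; the function name is made up, and 186 is the `gettid` syscall number on that architecture.

```rust
use core::arch::asm;

/// Syscall number for `gettid` on Linux x86_64.
const SYS_GETTID: usize = 186;

/// Returns the calling thread's kernel thread ID via a raw `syscall`
/// instruction, without going through libc. Linux x86_64 only.
pub fn gettid() -> i32 {
    let ret: isize;
    // SAFETY: `gettid` takes no arguments, touches no user memory,
    // and cannot fail.
    unsafe {
        asm!(
            "syscall",
            // Syscall number goes in via rax; the result comes back in rax.
            inlateout("rax") SYS_GETTID => ret,
            // The `syscall` instruction clobbers rcx and r11.
            out("rcx") _,
            out("r11") _,
            options(nostack),
        );
    }
    ret as i32
}
```

Nothing here allocates, takes a lock, or touches `errno`, which is why this style is usable from a signal handler.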

Luckily, most of what this library does is set up fast IPC to our main process and delegate the bulk of the syscall handling to it, so there's not that much we need to implement in this constrained environment. It's certainly making for some interesting exercises, though.