This bug is a masterpiece and you owe it to yourself to read this. So much effort for such a situational bug, it's heartbreakingly beautiful.
https://www.qualys.com/2023/07/19/cve-2023-38408/rce-openssh-forwarded-ssh-agent.txt
The key to it: OpenSSH will open and close dylibs in response to the agent protocol, because that's how it implements smart cards.
Then: ELF lets you mark functions in your dylib that get invoked on open (`__attribute__((constructor))`) and close (`__attribute__((destructor))`).
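For anyone who hasn't used these attributes, here's a minimal sketch of what that looks like with GCC/Clang. The names `on_load`, `on_unload`, `ctor_ran` and `constructor_has_run` are made up for illustration. Built as a shared library, the two hooks fire on dlopen() and dlclose(); linked into a plain executable, they fire before main() and at normal exit.

```c
#include <stdio.h>

/* Hypothetical flag, only here so the effect is observable. */
static int ctor_ran = 0;

/* Runs when the object is loaded: on dlopen() for a shared library,
 * or before main() when linked into an executable. */
__attribute__((constructor))
static void on_load(void)
{
    ctor_ran = 1;
    fputs("constructor: loaded\n", stderr);
}

/* Runs when the object is unloaded: on dlclose() (if the library is
 * really unmapped) or at normal process exit. */
__attribute__((destructor))
static void on_unload(void)
{
    fputs("destructor: unloading\n", stderr);
}

/* Accessor so a caller can check that the constructor already ran. */
int constructor_has_run(void)
{
    return ctor_ran;
}
```

The point is that the caller never asks for any of this: merely loading the object runs `on_load`, and whatever side effects it has happen before the loader returns.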
Finally: lots of libraries in /usr/lib have constructors with side effects (like registering signal handlers that don't get unregistered because the libraries are never expected to be unloaded, or invoking signal handlers by crashing when you randomly load them). So: UAF primitive.
There's like several thousand words of exposition about how they found the right sequence of opens and closes to set up a signal handler, groom the address space, set the stack executable (another dlopen side effect!), and trigger the signal.
But what makes it :art: is this bit:
"As a last and extreme example of a remote attack against ssh-agent forwarding, we noticed that one shared library's constructor function (which can be invoked by a remote attacker via an ssh-agent forwarding) starts a server thread that listens on a TCP port, and we discovered a remotely exploitable vulnerability (a heap-based buffer overflow) in this server's implementation."
@tqbf On SCW y'all joke about stunt cryptography exploitation; this was truly a beautiful example of stunt vulnerability work by Qualys. :cook: :kiss:
The WTF for me was the whole "the PKCS#11 API consists of dlopen() and pray, with no 'here's how you recognize a legit and safe provider' mechanism". Where did this 90's level trust leak through from? Was it NSS/NSPR? Or a Solaris thing?
@guenther It sort of does figure out real PKCS#11 providers --- it dlsym's the PKCS#11 entrypoint function after it opens, and closes the library if it's not there.
Not only that, but they found actual PKCS#11 handler libraries that had these side effects, so even parsing ELF and doing the symbol lookup outside of dlsym wouldn't have completely killed this bug class! It's wild!
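A rough sketch of that open, look up, close pattern, to make the problem concrete. This is not OpenSSH's actual code: `looks_like_pkcs11_provider` is a hypothetical helper, and the only assumption taken from the PKCS#11 standard is that `C_GetFunctionList` is the entry-point symbol.

```c
#include <dlfcn.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if `path` exports the PKCS#11 entry
 * point C_GetFunctionList, 0 if it does not, -1 if it cannot be loaded. */
int looks_like_pkcs11_provider(const char *path)
{
    /* Any constructors in the library run here, inside dlopen(),
     * before we have had a chance to inspect anything. */
    void *handle = dlopen(path, RTLD_NOW);
    if (handle == NULL)
        return -1;

    int found = dlsym(handle, "C_GetFunctionList") != NULL;

    /* Destructors run here if the library is actually unloaded. */
    dlclose(handle);
    return found;
}
```

So the check does reject non-providers, but only after their constructors have already executed, which is exactly the window the advisory exploits.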
@tqbf Sorry, I meant no "side-effect free" recognition mechanism. I think ELF had DT_INIT from the start but maybe they weren't used enough back then to appear a risk.
The "your legit PKCS#11 handler library contains an exploitable vulnerability" problem feels more like the generic "all software sucks and we must continue to fix it" and not something that, well, OpenSSH or other users of PKCS#11 can really do anything about. I mean, the scope of a PKCS#11 handler is not something you can easily bound by limiting privileges, is it?
$PREFIX/etc/pkcs.whatev/*.json registry like there is for Vulkan layers) or at least "libraries placed under $PREFIX/lib/pkcs.thingy". Just an allow-list consisting of /usr/lib and /usr/local/lib seems ridiculously permissive.
dlclose()?
@Shamar @tqbf What did plan9 do instead which provided similar reduction in resource usage?
I am indeed assuming it did something because I'd just be disappointed if the answer was just "lol suck it up".
Its lack would also make dynamic FFI profoundly annoying to implement, since it'd require the generation of (throwaway?) stub programs and IPC (with predictable performance implications).
I suspect that the real problem is in glibc.
@tqbf
>*While browsing through ssh-agent's source code, we noticed that a remote attacker, who has access to the remote server where Alice's ssh-agent is forwarded to, can load (dlopen()) and immediately unload (dlclose()) any shared library in /usr/lib* on Alice's workstation (via her forwarded ssh-agent, if it is compiled with ENABLE_PKCS11, which is the default).*
Oh no, this will not end well