I find stack overflow security bugs fascinating, and on Linux, compilers still don't protect against stack overflows by default when stack frames are bigger than the stack guard pages.
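
To make the frame-versus-guard-page problem concrete, here is a minimal sketch in C that models it with plain address arithmetic rather than real stack memory. The 4 KiB page size, the single guard page, and all function names are illustrative assumptions. The probing loop mirrors the general idea behind compiler options such as `-fstack-clash-protection`, not their actual code generation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed values, for illustration only. */
#define PAGE_SIZE  ((uintptr_t)4096)
#define GUARD_SIZE PAGE_SIZE          /* one guard page below the stack */

/* True if addr lies inside the guard region [guard_low, guard_low + GUARD_SIZE). */
static bool hits_guard(uintptr_t addr, uintptr_t guard_low)
{
    return addr >= guard_low && addr < guard_low + GUARD_SIZE;
}

/*
 * Unprobed frame: the stack pointer drops by frame_size in one step and
 * the function first writes to the lowest address of its new frame.
 * Returns true if that write would fault on the guard page.
 */
static bool unprobed_frame_faults(uintptr_t sp, size_t frame_size,
                                  uintptr_t guard_low)
{
    assert(frame_size <= sp);
    return hits_guard(sp - frame_size, guard_low);
}

/*
 * Probed frame: touch one address per page while moving the stack
 * pointer down, so no single step is larger than the guard region.
 * Returns true if one of the probes would fault on the guard page.
 */
static bool probed_frame_faults(uintptr_t sp, size_t frame_size,
                                uintptr_t guard_low)
{
    assert(frame_size <= sp);
    uintptr_t target = sp - frame_size;
    while (sp > target) {
        uintptr_t step = sp - target < PAGE_SIZE ? sp - target : PAGE_SIZE;
        sp -= step;
        if (hits_guard(sp, guard_low))
            return true;   /* the probe lands on the guard page */
    }
    return false;
}
```

In this model, a 12 KiB frame allocated 256 bytes above the guard page makes `unprobed_frame_faults` return false, because the first write lands below the guard region instead of inside it, while `probed_frame_faults` returns true for the same inputs.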

So I went looking around in Android, and thanks to how Android's RPC mechanism allows recursive synchronous callbacks in some cases, I managed to find a way to jump a thread guard page in system_server from shell context and (with very low success rate) get instruction pointer control:
https://project-zero.issues.chromium.org/issues/465827985

@jann Lovely research! Underscores why one shouldn't underestimate compiler hardening flags!
@ljrk Yeah, stack overflows in particular feel to me like the programmer isn't really making a particular mistake that can be called a security bug, it just randomly happens in legitimate code... and the only thing that can reliably stop it is the compiler. So it kinda feels wrong to me to call it a hardening flag, it feels more like a... correctness flag?
@jann @ljrk I wonder why it is still necessary to employ imperfect mechanisms like guard pages to avoid stack overflows. Shouldn't it be possible to have a stack pointer limit checked by hardware? Like https://interrupt.memfault.com/blog/using-psp-msp-limit-registers-for-stack-overflow on some ARM chips?
@jannic that MSPLIM thing you linked to seems to be specific to Cortex-M chips, probably for when you don't have an MMU?
When you have an MMU, I imagine explicit stack pointer limit checks probably cause unnecessary hardware overhead compared to relying on implicit bounding by guard pages?
@jann These chips do have an MPU (memory protection unit, like an MMU but without the capability to remap memory addresses). So MSPLIM is not strictly necessary to implement some kind of stack overflow protection, but much easier to set up.
Additional hardware overhead could be a reason it's not used on bigger CPUs, yes.
And with many cores, each core would need a separate stack limit.
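
For comparison with guard pages, an explicit stack pointer limit can be sketched as a toy model in C. This is only an illustration of the idea, loosely based on the stack limit registers described in the linked article. The struct, its fields, and the function name are hypothetical, and a real check of this kind would happen in hardware rather than in software.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-stack state; all names are illustrative. */
struct toy_stack {
    uintptr_t sp;     /* current stack pointer */
    uintptr_t limit;  /* lowest value the stack pointer may take */
};

/*
 * Explicit limit check: every downward move of the stack pointer is
 * compared against the limit, no matter how large the move is.
 * Returns false (a fault, in this model) if the move would cross the
 * limit; the stack pointer is left unchanged in that case.
 */
static bool toy_stack_alloc(struct toy_stack *s, uintptr_t size)
{
    assert(s->sp >= s->limit);
    if (size > s->sp - s->limit)
        return false;
    s->sp -= size;
    return true;
}
```

Unlike a guard page, this check cannot be jumped over by a large frame, because it looks at the stack pointer itself rather than at which addresses get accessed. The limit is part of each stack's state, so in this model a scheduler would have to save and restore it together with the stack pointer.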