Inspired by @pervognsen, I too now microblog on gists sometimes:

https://gist.github.com/RealNeGate/904c5bdf95190f271bb0de38f58848dc

This is a proof-of-concept safepoint polling system in C for Windows x64. It's the normal page-poking stuff, but the idea was to showcase how easy it actually is to get consistent pausing in userland, given you instrument your code like a managed lang might (although there's no need to track reg+stack state, since we're not deopting or anything). Idk, use it how you will; I made green threads with it :)

@negate Using guard page arming/disarming for this is neat. For the SEH part of it, do you have an idea for how to make something equivalently composable on non-Windows systems? You can definitely catch/resume with signal handlers (e.g. see how Dolphin implements memory mapped IO, https://github.com/dolphin-emu/dolphin/blob/620fbcdfb70e3da0f6784c66b9d247231a28fd36/Source/Core/Core/HW/Memmap.cpp#L167). But I'm not 100% sure of how to make it composable with arbitrary library code (given threading-unfriendly signal semantics).
@negate Although as a more portable fallback, a relaxed atomic load + almost-never-taken safepoint branch per CFG backedge isn't the end of the world either. It would hurt for tight loops, though. An option with the safepoint branch but without the load is something like @pkhuong's dynamic flags although that's going to have an even higher "trigger" cost than what you're doing, more akin to traditional JIT safepoints: https://pvk.ca/Blog/2021/12/19/bounded-dynamicism-with-cross-modifying-code/
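
A minimal sketch of that portable fallback, assuming C11 atomics: one relaxed load and a rarely taken branch on each loop backedge. The names `safepoint_requested`, `safepoint_slow_path`, and `sum_with_polls` are hypothetical, and the slow path here only counts and clears the request where a real runtime would park the thread.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag: set by whichever thread wants the others to pause. */
static atomic_bool safepoint_requested;
static size_t safepoints_taken;

/* Slow path. A real runtime would block here until resumed;
 * this sketch only records the visit and clears the request. */
static void safepoint_slow_path(void) {
    safepoints_taken += 1;
    atomic_store_explicit(&safepoint_requested, false, memory_order_relaxed);
}

/* The poll itself: relaxed load + almost-never-taken branch. */
static inline void poll(void) {
    if (atomic_load_explicit(&safepoint_requested, memory_order_relaxed)) {
        safepoint_slow_path();
    }
}

/* Example instrumented loop: one poll per backedge. */
static long sum_with_polls(const int *xs, size_t n) {
    assert(xs != NULL || n == 0);
    long total = 0;
    for (size_t i = 0; i < n; i++) {
        total += xs[i];
        poll();
    }
    return total;
}
```

The relaxed ordering only guarantees the loop eventually sees the flag; any data handed over at the safepoint would need its own synchronization in the slow path.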

@pervognsen Doing the safepoint as cmp+branch comes with its own tradeoffs. For instance, it'll have far lower latency: with the guard page I believe it was about 5 µs before I could even get control back from the exception. The issue is the hot-loop case, although if you're controlling the optimizer then strip mining might solve that problem:

do {
    work();
    poll();
    i += 1;
} while (i < n);

VVV

do {
    next = min(i + 1024, n);
    do {
        work();
        i += 1;
    } while (i < next);
    poll();
} while (i < n);
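
As a rough runnable version of the strip-mined form, here is a sketch in C where `work` and `poll` are stand-ins that only count calls, and `run_strip_mined` and its `chunk` parameter are hypothetical names. It uses a `while` loop rather than `do`/`while`, so unlike the pseudocode it does nothing when `n` is 0.

```c
#include <assert.h>
#include <stddef.h>

static size_t work_count;
static size_t poll_count;

/* Stand-ins for the real loop body and safepoint poll. */
static void work(size_t i) { (void)i; work_count += 1; }
static void poll(void) { poll_count += 1; }

/* Runs work() n times but polls only once per chunk of iterations. */
static void run_strip_mined(size_t n, size_t chunk) {
    assert(chunk > 0);
    size_t i = 0;
    while (i < n) {
        /* End of this strip: min(i + chunk, n), written to avoid overflow. */
        size_t next = (n - i < chunk) ? n : i + chunk;
        for (; i < next; i++) {
            work(i);
        }
        poll();
    }
}
```

With `n = 2500` and `chunk = 1024` the body runs 2500 times and polls 3 times, so the chunk size bounds how long a pause request can go unnoticed.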

@pervognsen It's mostly built for closed environments, so it wouldn't play nice with uninstrumented code. The trick is used in JITs all the time for this reason; I'm just showing what it looks like ripped out of a massive JIT compiler. If I had to manage things without guard pages I'd just use `mprotect(poll_site, 4096, PROT_NONE)`.
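
A rough POSIX sketch of that `mprotect` variant, not the gist's code: the poll is a load from a dedicated page, arming makes the page `PROT_NONE`, and a `SIGSEGV`/`SIGBUS` handler treats a fault on that page as a safepoint. All names here (`poll_page`, `safepoint_init`, `safepoint_arm`, `on_fault`) are hypothetical. It assumes `mprotect` is usable from a signal handler, which POSIX does not promise but which this kind of code commonly relies on, and it installs a process-wide handler, so it has the composability problem raised earlier in the thread.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* MAP_ANONYMOUS under strict -std=c11 on glibc */
#endif
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static char *poll_page;   /* hypothetical: the page every poll site reads */
static size_t page_size;
static volatile sig_atomic_t safepoints_hit;

/* Fault handler: a fault inside poll_page counts as a safepoint. */
static void on_fault(int sig, siginfo_t *info, void *ctx) {
    (void)ctx;
    char *addr = (char *)info->si_addr;
    if (addr >= poll_page && addr < poll_page + page_size) {
        safepoints_hit += 1;
        /* Disarm so the faulting load succeeds when it is retried.
         * A real runtime would park the thread before doing this. */
        mprotect(poll_page, page_size, PROT_READ);
        return;
    }
    /* Not our page: restore the default action; the retried access
     * then faults again and terminates the process as usual. */
    signal(sig, SIG_DFL);
}

static int safepoint_init(void) {
    page_size = (size_t)sysconf(_SC_PAGESIZE);
    void *p = mmap(NULL, page_size, PROT_READ,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) return -1;
    poll_page = p;

    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_fault;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGSEGV, &sa, NULL) != 0) return -1;
    if (sigaction(SIGBUS, &sa, NULL) != 0) return -1; /* some systems report SIGBUS */
    return 0;
}

/* Arm: the next poll on any thread will fault into on_fault. */
static int safepoint_arm(void) {
    assert(poll_page != NULL);
    return mprotect(poll_page, page_size, PROT_NONE);
}

/* The poll site: a single load from the poll page. */
static inline void poll(void) {
    (void)*(volatile char *)poll_page;
}
```

After `safepoint_init()`, a `poll()` is just a load; after `safepoint_arm()`, the next `poll()` traps once into the handler and later polls run freely again until the page is re-armed.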
@negate Gotcha. Yeah, for that use case it seems like a practical point on the design spectrum.