the expectation of being able to run docker whenever in CI jobs is probably the single worst outcome of free GitHub Actions minutes because reproducing it in a bring-your-own-compute environment is borderline impossible unless you make every machine single-tenant
@whitequark Don’t all modern CI systems run each job in an ephemeral VM? It’s about the only security boundary that I’d think you could defend against someone able to run arbitrary code these days, unless you lock down the environment so much that CI can’t do things it needs to do.
@david_chisnall Forgejo Actions runners offer you a choice of "Docker", "Podman", "LXC", and "lol rawdog it on the host"
@david_chisnall right now I'm using rootless Podman and I think it's defendable enough that I'm okay offering it to friends (who may still click on Approve & Run from a sketchy source, mind) but it's not letting cibuildwheel or other Docker-expecting applications run which is a problem
@whitequark @david_chisnall are you using systemd? If so, are you tailoring your security options for the service to be highly restrictive? If you are on systemd but haven't done that yet, I might have a good starting place for those options at the office.
@c0dec0dec0de @david_chisnall I am using systemd but I don't see how this would help considering the attack surface I'm concerned about is "the kernel" and maybe "Podman", not "the Forgejo Actions runner" (which is the service I'd be configuring)
@whitequark @david_chisnall minimize blast radius for the process tree running Podman. We’re doing it with the Jenkins agent config at work, though admittedly there’s only so much you can do.
@c0dec0dec0de @david_chisnall considering that one of the top risks is "one project's workflow compromising another project's releases" this seems like the wrong surface to defend
@whitequark @david_chisnall it doesn’t slap the whole process tree down unceremoniously on violation (or does it? That would be bad; the Availability leg of the CIA triad is pretty much primary in CI, as you say).

@c0dec0dec0de @david_chisnall here is how I think this works (ignore my earlier posts, I had some invalid assumptions I've since corrected)

forgejo-action-runner
└ podman
  └ stuff inside job 1...
  └ malware?
└ podman
  └ stuff inside job 2...

so let's say the malware breaks out of podman. now it runs with fjar (forgejo-action-runner) permissions. which means that touching job 2's stuff is not a violation of any kind, from the kernel's and systemd's perspective
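
The point above can be sketched as a toy model. This is only an illustration of the reasoning, not how the kernel or Podman is implemented: the check below is a deliberately simplified owner/other permission test (it ignores groups, ACLs, capabilities, and LSMs), and the UIDs and path are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Process:
    name: str
    uid: int  # host UID the process runs as once it is outside any container


@dataclass(frozen=True)
class HostFile:
    path: str
    owner_uid: int
    mode: int  # classic Unix permission bits


def may_write(proc: Process, f: HostFile) -> bool:
    """Simplified owner/other write check (no groups, ACLs, capabilities, LSMs)."""
    if proc.uid == 0:
        return True
    if proc.uid == f.owner_uid:
        return bool(f.mode & 0o200)
    return bool(f.mode & 0o002)


RUNNER_UID = 1001  # hypothetical UID of the runner service account

# With rootless Podman, both jobs are backed by the runner's own host UID,
# so malware that escapes job 1 lands on the host as that same UID.
escaped = Process("malware escaped from job 1", RUNNER_UID)
job2_artifact = HostFile("/hypothetical/job2/release.tar.gz", RUNNER_UID, 0o600)

# A process under a different, unprivileged UID, for contrast.
unrelated = Process("some other user", 1002)

print(may_write(escaped, job2_artifact))    # True
print(may_write(unrelated, job2_artifact))  # False
```

In this model the host has nothing to distinguish "job 1" from "job 2": both are just the runner's UID, so hardening the runner service as a whole does not separate them from each other.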

@c0dec0dec0de @david_chisnall this is why I think the only actual solution to this is VMs of some sort, either commodity cloud runners spawned on demand, or firecracker or something
Kata Containers - Open Source Container Runtime Software

Kata Containers is an open source container runtime, building lightweight virtual machines that seamlessly plug into the containers ecosystem.

@whitequark @c0dec0dec0de

It's worth noting that most cloud container systems are also isolated VMs. This is partly for software compatibility (the guest can be a shiny new kernel, the host can be an LTS or CIP release), but mostly because cloud providers don't regard anything other than a VM as a defensible boundary.

Azure did a bunch of things with nested virtualisation, but I believe they've now upstreamed something to Linux that exposes a KVM-compatible device. It lets one VM delegate pages to another and gives the abstraction of nested virtualisation, where the 'child' is a child in the 'administration is delegated to the parent' sense and not in the 'recursive nested paging' sense.

@david_chisnall @c0dec0dec0de huh, that's really interesting to hear re: Azure.