Mastodawn

Matthew Garrett

My annual plea for a thing: I want a type 1 hypervisor that just has a small isolated VM and then passes through the rest of the hardware to the main VM which runs Linux. The small VM is intended to be used to run small pieces of code that the main OS should not be able to interfere with. Does such a thing exist? (Think Xen, but with a Dom0 that can't see into DomUs)

bEA 🔓 May 8, 2024

@mjg59 you missed "in an ssh-agent".

Stephen Crane May 8, 2024

@mjg59 you’ve looked at Hafnium for this? https://hafnium.googlesource.com/hafnium/+/HEAD/docs/Architecture.md

Hafnium - Hafnium architecture

Stephen Crane May 8, 2024

@mjg59 AVF with pKVM is also effectively this but the hypervisor is a split off part of the Linux kernel, so not exactly type 1.

Zimmie May 8, 2024

@rinon @mjg59 It’s not like “type 1” versus “type 2” is a real technical distinction.

Дамјан Георгиевски May 14, 2024

@rinon @mjg59
how about this qualcomm gunyah thing?
https://github.com/quic/gunyah-hypervisor

GitHub - quic/gunyah-hypervisor: Gunyah is a Type-1 hypervisor designed for strong security, performance and modularity.

Matthew Garrett May 8, 2024

@rinon That's broadly what I want, but is ARM only

Gaveen Prabhasara May 8, 2024

@mjg59 doesn't quite sound like what Qubes OS is doing

Howard Chu @ Symas May 8, 2024

@mjg59 sounds like something you'd need Secure Encrypted Virtualization for https://www.amd.com/en/developer/sev.html

Matthew Garrett May 8, 2024

@hyc No, once you're in SEV-land you're not really in a good place to do hardware passthrough

Howard Chu @ Symas May 8, 2024

@mjg59 hm, that's a tough one then, maintaining isolation.

Matthew Garrett May 8, 2024

@hyc I'm fine with the hypervisor being able to see what's happening in arbitrary guests, but there needs to be isolation between the primary VM and the security VM (Hyper-V manages this fine in Windows land)

Florian Idelberger May 13, 2024

@mjg59 @hyc does one know how it manages this? Does it just pretend?

Matthew Garrett May 13, 2024

@fl0_id @hyc it's a hypervisor, it simply imposes a barrier between the resources? This isn't a conceptually complicated situation, modern CPUs support it just fine

Florian Idelberger May 13, 2024

@mjg59 @hyc sure, but I just meant if the hv can technically see into all guests, who enforces the rules for security vm? The cpu or the hv or both? If the hv, this is likely more easily overridden.

Matthew Garrett May 13, 2024

@fl0_id @hyc overridden by whom?

baloo May 8, 2024

@mjg59 @hyc
Curious: what kind of hardware should the security VM need to access?
(I can only guess TPM? For state bootstrap or something?)

Morten Linderud May 8, 2024

@baloo @mjg59 @hyc

I suspect this is a continuation of the fingerprint issue Matthew was writing about a couple of months(?) ago.

EDIT: This post https://nondeterministic.computer/@mjg59/111456696748600420

Matthew Garrett (@[email protected])

https://blackwinghq.com/blog/posts/a-touch-of-pwn-part-i/ is some very nice research, with some terrifying takeaways: 1) Microsoft developed a secure communications path between the OS and any biometric devices 2) One vendor used the same backing store for both the secure and insecure path, allowing enrollment of fingerprints via the insecure path that were then trusted in the secure path 3) Another vendor used their own fucked up TLS-based implementation rather than the Microsoft one 4) *Microsoft* didn't use their own protocol

Matthew Garrett May 8, 2024

@baloo @hyc Potentially the TPM, but otherwise nothing - just CPU, RAM, and some sort of simple inter-VM communication channel.
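
A minimal sketch of the shape being described here, assuming a request/response channel between the two VMs. Everything in it is hypothetical (the `SecurityVM` class, the `"mac"` operation, the use of in-process queues as the channel), and a Python object is of course not an isolation boundary; the point is only that the main OS sees results of operations on the secret, never the secret itself.

```python
import hashlib
import hmac
import os
from queue import Queue


class SecurityVM:
    """Toy stand-in for the small VM: holds a key and answers requests."""

    def __init__(self, requests: Queue, responses: Queue) -> None:
        # The key is generated inside the "VM" and is never put on a queue.
        self._key = os.urandom(32)
        self._requests = requests
        self._responses = responses

    def serve_one(self) -> None:
        """Handle exactly one request from the main VM."""
        op, payload = self._requests.get()
        if op == "mac":
            # Only a keyed digest of the payload leaves the security VM.
            tag = hmac.new(self._key, payload, hashlib.sha256).digest()
            self._responses.put(("ok", tag))
        else:
            # Anything outside the narrow interface is refused.
            self._responses.put(("error", b"unsupported operation"))
```

In a real system the two queues would be replaced by whatever channel the hypervisor provides, and the hypervisor's memory isolation (not language-level privacy) would be what keeps the key out of reach.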

baloo May 8, 2024

@mjg59 @hyc
I know you already dismissed SEV, but https://github.com/project-oak/oak seems vaguely related?

This is a VM inside the main OS, but the binary inside the TEE is available over grpc.

GitHub - project-oak/oak: Meaningful control of data in distributed systems.

Matthew Garrett May 8, 2024

@baloo @hyc Right, you can do it the other way around with SEV, but that then leaves you with very restricted hardware support at the moment

baloo May 8, 2024

@mjg59 @hyc yeah definitely. You will need a piece of code in the main os to make the bridge for any hardware resource you might need.

Jonathan McDowell May 8, 2024

@mjg59 @hyc Why can you not use SEV-SNP for the security VM, with the main OS running directly on the bare metal?

Jonathan McDowell May 8, 2024

@mjg59 @hyc Ah, you want to carve the TPM away from the main OS?

Matthew Garrett May 8, 2024

@noodles @hyc Some form of secret manager, at least

Matthew Garrett May 9, 2024

@noodles @hyc SEV is pretty much exclusive to server parts, and I have a laptop

Alexander Graf May 8, 2024

@mjg59 sounds pretty close to Jailhouse?

Matthew Garrett May 8, 2024

@agraf My recollection is that Jailhouse does static partitioning and no scheduling, ie you need to give it a CPU? It also starts from Linux which makes it harder to sequester secrets that Linux can't get at.

Alexander Graf May 8, 2024

@mjg59 I'm not sure how much both of these are embedded into its architecture or just artifacts of how its main users consume it.

Matthew Garrett May 8, 2024

@agraf I'm pretty sure the lack of scheduling is a design choice that would need to be retrofitted. Launching from Linux is more about how it's managed, so that's probably an easier thing to fix.

Alexander Graf May 8, 2024

@mjg59 true, it doesn't seem to support any scheduling at all. That said, I'd expect a simple round robin scheduler may not be super difficult to implement. Either way, not an off the shelf solution for your use case.
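
For illustration, a minimal sketch of what round-robin means here, as opposed to static partitioning where each guest permanently owns a CPU. This is a toy model with hypothetical names (`round_robin`, the vCPU labels), not anything taken from Jailhouse; a real hypervisor scheduler would also need timer interrupts, context switching, and handling of blocked vCPUs.

```python
from collections import deque
from typing import Deque, List


def round_robin(vcpus: List[str], slices: int) -> List[str]:
    """Return which vCPU gets each of `slices` consecutive time slices
    on a single physical CPU."""
    queue: Deque[str] = deque(vcpus)
    order: List[str] = []
    for _ in range(slices):
        if not queue:
            break  # nothing to schedule
        current = queue.popleft()  # take the vCPU at the head of the queue
        order.append(current)      # it runs for one slice
        queue.append(current)      # then goes to the back of the queue
    return order
```

With two guests sharing one CPU, the slices simply alternate between them, so the small VM does not need a core of its own.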

Fi 🏳️‍⚧️ May 8, 2024

@mjg59 is this like, kind of a secure enclave/hsm-equiv situation you're looking for?

Matthew Garrett May 8, 2024

@munin Yeah, like Windows does with Credential Guard

Simon Bisson May 8, 2024

@mjg59 @munin Which is based on their Krypton minimal hypervisor.

bluca May 8, 2024

@mjg59 there's work in progress by @l0kod but don't think it's merged yet: https://lore.kernel.org/all/2024050313[email protected]/

[RFC PATCH v3 0/5] Hypervisor-Enforced Kernel Integrity - CR pinning - Mickaël Salaün

Morten Linderud May 8, 2024

@bluca @mjg59 @l0kod

It always somewhat amazes me that something that reads like it is super complicated is accomplished with a couple hundred lines of code.

Matthew Garrett May 8, 2024

@bluca @l0kod Not quite the same - you still have Linux with the ability to see everything, I think?

Mickaël Salaün May 8, 2024

@mjg59 @bluca kind of, Heki is the equivalent of Windows's Virtualization Based Security (foundation of Credential Guard and other security mechanisms) for Linux (with KVM or Hyper-V). The host/VMM is part of the TCB like the hypervisor, but the Linux guest VM requests the hypervisor to protect itself (guest). For now this is only CR-pinning (v3) and memory permissions (v2). We could probably implement the same mechanism with Jailhouse, but that would remove a lot of VM use cases
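
A minimal sketch of the general idea behind control-register pinning, assuming a model where the guest asks for certain bits to be locked and later writes that would change them are refused. The `PinnedRegister` class and its methods are hypothetical and do not reflect the actual Heki interface; the bit positions for CR4.SMEP (bit 20) and CR4.SMAP (bit 21) are the architectural ones on x86.

```python
CR4_SMEP = 1 << 20  # Supervisor Mode Execution Prevention
CR4_SMAP = 1 << 21  # Supervisor Mode Access Prevention


class PinnedRegister:
    """Toy model of a control register guarded by the hypervisor."""

    def __init__(self, value: int = 0) -> None:
        self.value = value
        self.pinned_mask = 0

    def pin(self, mask: int) -> None:
        """Guest request: lock the bits in `mask` at their current values.
        Pins accumulate; this model offers no way to unpin."""
        self.pinned_mask |= mask

    def write(self, new_value: int) -> bool:
        """Apply a guest write unless it would change a pinned bit."""
        changed_bits = new_value ^ self.value
        if changed_bits & self.pinned_mask:
            return False  # refused: a pinned bit would change
        self.value = new_value
        return True
```

The property that matters is that the pin is enforced from outside the guest, so code that later compromises the guest kernel cannot simply clear the bit.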

Adam Hawkes May 8, 2024

@mjg59 Like Proxmox? Or maybe I have it backwards.

@mjg59 maybe check on kata and firecracker.
These are container engines and not really made for your use case, but they do run a minimal Linux system, and then run your applications in isolated mini VMs.
Maybe some of their tech can be adapted

Mark Esler May 8, 2024

@mjg59 @[email protected]

🗦new🗧 FireFly May 8, 2024

@mjg59 I don't think it's in a usable state yet (at least for x86 hosts, according to their FAQ), but I think seL4-as-hypervisor would fit the bill otherwise from my understanding

cf. https://docs.sel4.systems/projects/sel4/frequently-asked-questions.html#how-good-is-sel4-at-supporting-virtual-machines
& https://sel4.systems/About/seL4-whitepaper.pdf

Frequently Asked Questions on seL4 | seL4 docs

Tobias May 8, 2024

@mjg59 So Qubes one step further?

keen456 keen456 May 8, 2024

@mjg59 Sorta like what m1n1 does for #asahilinux ? https://github.com/AsahiLinux/m1n1

Commits · AsahiLinux/m1n1

A bootloader and experimentation playground for Apple Silicon - Commits · AsahiLinux/m1n1

Emelia/Emi May 8, 2024

@mjg59 So basically "a programmable HSM" like a less-locked-down version of apple's secure enclave? I honestly think trying to achieve secure isolation on the same CPU as the rest of the OS is a fool's game, and the only way to ensure isolation is to physically isolate things onto independent cores via a mailbox interface.

(I've wanted something similar for literally ever...)

Matthew Garrett May 8, 2024

@becomethewaifu Hypervisors are "good enough", given that we haven't seen multi-tenant cloud turn into a complete disaster

Graham Sutherland / Polynomial May 8, 2024

@mjg59 a concept like SGX enclaves / LSASS isolation but actually accessible and convenient to use would be very nice.

Chip Collier May 8, 2024

@mjg59 acrn? https://projectacrn.org but I think xen can also do this.

Home - Project ACRN™

Kensan May 8, 2024

@mjg59 It’s more or less how I use/dogfood Muen¹. What makes it not usable in your case is probably that it is statically partitioned and highly dependent on the target hardware.
Would you mind elaborating a bit what the rest of your envisioned system looks like?
__
¹ https://muen.sk

Muen | SK for x86/64

Zimmie May 8, 2024

@mjg59 To be sure I understand, you want a small VM and a big VM. The big VM gets all the hardware minus what’s needed to run the hypervisor and the small VM. Communications between the big VM and the small VM are strictly controlled in both directions such that neither can interfere with the other.

What sort of thing are you trying to do with this small VM?

This sounds kind of like what a TPM is for, or maybe a BMC/SMC/LOM.

Matthew Garrett May 8, 2024

@bob_zim Manage secrets in ways that the TPM can't (eg, the TPM can't establish a secure communications channel with a biometric reader)

Zimmie May 8, 2024

@mjg59 So the small VM would own the physical link to the biometric reader, then provide its own attestation about the biometric reader’s attestation that it was presented an authentic biometric?

Hmm. I’m not sure I know of a way to do that in software. Decent biometric readers should already use asymmetric keys, though. It should be possible to get a secure element like a TPM or smart card to only unlock a stored key when presented with a valid signature from the reader’s private key.

Matthew Garrett May 8, 2024

@bob_zim No need for a physical link (eg, TLS is secure without you having to trust the physical link, modern biometric devices implement equivalent functionality). It is not possible to use a TPM in this way given the hardware that exists.

Zimmie May 8, 2024

@mjg59 That’s what I’m getting at: decent biometric readers should already use asymmetric keys. You may not be able to hook that directly to an off-the-shelf TPM, though I thought some had firmware allowing them to trust external public keys for exactly this reason. Might require writing custom RoT firmware like Oxide has done.

A guest can never keep a secret from the hypervisor under which it runs. The host always has full control over the guest, including the ability to inspect and change stack frames. At that point, you’re guaranteed to have a single piece of software which can get at both the clear key material from the small VM’s RAM and the data the key controls from the big VM.

Assuming that’s what you’re trying to prevent, I don’t know of any software system I would trust to provide sufficient isolation between the guests, even at a theoretical level. Computer-in-a-computer stuff like a TPM is it.

Matthew Garrett May 8, 2024

@bob_zim I can't rewrite the firmware for my TPM, so that's not a viable approach. I also trust the hypervisor. What I want is to not trust Linux.

Nicolás Alvarez May 8, 2024

@mjg59 @bob_zim The goal is not to hide secrets from the hypervisor, but for the small VM to hide secrets from the big VM, using the trusted hypervisor.

Zimmie May 9, 2024

@nicolas17 @mjg59 Then just use dom0 instead of the small VM. That’s easy. The hypervisor can keep secrets from a guest. That’s an obvious solution, though, which is why I wanted to clarify the requirements.

Matthew Garrett May 9, 2024

@bob_zim @nicolas17 But then it's massively harder to plumb all my hardware (including ACPI) into a domu

Zimmie May 9, 2024

@mjg59 @nicolas17 So is the goal something like a software secure element for workstation use?

Matthew Garrett May 9, 2024

@bob_zim @nicolas17 yes