Mastodawn

#qemu's #usermode #emulation is now officially orphaned with its #linux and #bsd parts having been set at "Odd Fixes" for some time now. We know usermode is heavily used by cross compilers and compilers for unit testing but the current lack of maintainer cover is unsustainable in the long term.

penguin42 3d ago

@stsquad It would be great if some of the code could be shared between the pile of projects that all do similar userspace emulation/interception e.g. valgrind

paulf 1d ago

@penguin42 @stsquad There aren't that many: Valgrind VEX, DrMemory/DynamoRio and Intel pin (if that is still maintained).

As far as I understand there are fairly big differences between them. QEMU is just doing binary translation, which may be cross architecture. I guess that much of the complexity comes from the many architectures that it supports.

DynamoRio (amd64, x86, arm, arm64) and pin (amd64 and x86 only) do dynamic binary instrumentation, so roughly speaking the original opcodes pass through with added instrumentation. Valgrind VEX (amd64, x86, arm64, arm, ppc32, ppc64, mips32, mips64, s390) does a more complete recompilation, converting to IR, performing register allocation and regenerating machine code.

Valgrind has a fairly chronic lack of developers working on VEX. IBM s390 is well supported, and there is some work going on to share features like SSE4.1 between amd64 and x86. There are big, long-standing gaps in our amd64 coverage (avx512, fp16 and many more). Our floating point emulation is not great either (rounding mode issues, no FPE support).

QEMU is much bigger than the other projects. It is about twice the size of all of Valgrind (or about 4x the size of VEX). It also has many more contributors and commits - 5 to 10x what we get in Valgrind. As far as I know DynamoRio is pretty much a one man show. I don't know much about Intel pin.

Lastly, I believe that the architecture of these tools is fundamentally different (and thus difficult to combine):

VEX is based on switch/case code.
QEMU is table driven.
DynamoRio and pin use dispatch tables and function calls.
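
To make the contrast between the first two styles concrete, here is a toy sketch. It is not code from VEX or QEMU; the one-byte opcodes, the `DECODE_TABLE` layout and both function names are made up for illustration.

```python
# Hypothetical one-byte opcodes for a toy machine (not a real ISA).
OP_NOP, OP_INC, OP_DEC = 0x00, 0x01, 0x02


def decode_switch(opcode):
    """Switch/case style: the decoding logic is hand-written branching code."""
    if opcode == OP_NOP:
        return "nop"
    elif opcode == OP_INC:
        return "inc"
    elif opcode == OP_DEC:
        return "dec"
    raise ValueError(f"unknown opcode 0x{opcode:02x}")


# Table-driven style: the decoding knowledge lives in data, as
# (mask, pattern, name) rows, and one generic loop interprets it.
DECODE_TABLE = [
    (0xFF, OP_NOP, "nop"),
    (0xFF, OP_INC, "inc"),
    (0xFF, OP_DEC, "dec"),
]


def decode_table(opcode):
    for mask, pattern, name in DECODE_TABLE:
        if opcode & mask == pattern:
            return name
    raise ValueError(f"unknown opcode 0x{opcode:02x}")
```

Both decoders give the same answer for the same opcode. The difference is where the knowledge sits (code versus data), which is part of why a decoder written one way is hard to reuse in a project built the other way.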

penguin42 23h ago

@paulf @stsquad The parts I was thinking of sharing aren't the CPU part; I was more thinking about the syscall and /proc emulation specific to the user-mode side of things. (I hadn't realised DynamoRio was still around).

paulf 23h ago

@penguin42 @stsquad

Syscall wrappers in Valgrind are a fairly big part of the code, but mostly they don't do much: they just check the arguments, check pointers to memory that the syscalls read or write, and do some fd checking / monitoring. It's only process/CPU syscalls where Valgrind has to do a lot more intrusive work.
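
As a rough picture of what such a "doesn't do much" wrapper looks like, here is a toy pre-check for a `write(fd, buf, count)` call. This is only a sketch: `ToyGuest`, `is_readable` and `pre_write` are hypothetical names, and the memory map is a hard-coded stand-in for real address-space tracking. It is not Valgrind's actual wrapper API.

```python
class ToyGuest:
    """Minimal stand-in for the state a tool tracks about the guest process."""

    def __init__(self):
        # Half-open [start, end) address ranges the guest may read (assumed).
        self.mapped = [(0x1000, 0x2000)]
        # File descriptors the tool believes are open (assumed).
        self.open_fds = {0, 1, 2}

    def is_readable(self, addr, length):
        end = addr + length
        return any(lo <= addr and end <= hi for lo, hi in self.mapped)


def pre_write(guest, fd, buf, count):
    """Check the arguments before the real syscall runs; return any complaints."""
    errors = []
    if fd not in guest.open_fds:
        errors.append(f"write: fd {fd} is not open")
    if not guest.is_readable(buf, count):
        errors.append(f"write: buf 0x{buf:x}+{count} is not addressable")
    return errors
```

A call such as `pre_write(ToyGuest(), 1, 0x1000, 16)` returns an empty list, after which the syscall would simply be passed through to the kernel.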

For /proc and sysctls we do virtualise a few but for most of them we can just pass through to the originals.
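
The "virtualise a few, pass the rest through" split can be sketched like this. The path chosen, the fake contents and the names `VIRTUALISED` and `resolve_open` are illustrative assumptions, not what Valgrind or QEMU actually intercept or how they do it.

```python
# Hypothetical set of paths whose contents must describe the guest,
# not the tool that is running it.
VIRTUALISED = {
    "/proc/self/cmdline": lambda: b"guest-prog\0--flag\0",
}


def resolve_open(path):
    """Decide how an open() on `path` is served.

    Returns ("virtual", data) for the few emulated files and
    ("passthrough", path) for everything else.
    """
    handler = VIRTUALISED.get(path)
    if handler is not None:
        return ("virtual", handler())
    return ("passthrough", path)
```

Most paths fall into the pass-through branch, which is why this part of the code stays small until an edge case turns up.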

ELF (and Mach-O) and DWARF readers are also sources of headaches for us.

Alex

@paulf @penguin42 same for #qemu really - the majority of the code is boilerplate except where we run into our edge cases. For qemu these are a) 64 bit file cookies for 32 bit b) overly fancy memory mapping (multiple maps, gcs etc) and c) just good enough sandboxing of /proc. There are probably some ioctls that are hard to do for specific HW because you deal with much more complex structures.