@simon_brooke @otfrom err, there may be 4 billion arrangements of 4 bytes, but there are far fewer valid UTF-32 sequences, because there are far fewer Unicode codepoints. 0x10FFFF ≈ 1.1 million is not a small number, but a gap of 3 orders of magnitude is more than a rounding error here.
Though I can unfortunately imagine that failing to distinguish between UTF-32 and UCS-4 (as originally published) is not an uncommon error in code that handles encoded text.
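For anyone who wants to check the arithmetic, here is a rough sketch in Python. The names (`is_valid_utf32`, `valid_count` and so on) are mine, purely for illustration, and I'm assuming "valid UTF-32" means a Unicode scalar value: anything up to 0x10FFFF, minus the surrogate range that is reserved for UTF-16.

```python
# Compare every possible 32-bit pattern against the values UTF-32 actually allows.
# Assumption: "valid" means a Unicode scalar value (codepoint range minus surrogates).

TOTAL_PATTERNS = 2**32               # every arrangement of 4 bytes
MAX_CODEPOINT = 0x10FFFF             # highest Unicode codepoint
SURROGATES = range(0xD800, 0xE000)   # reserved for UTF-16, never valid in UTF-32


def is_valid_utf32(unit: int) -> bool:
    """Return True if a 32-bit value is a legal UTF-32 code unit."""
    return 0 <= unit <= MAX_CODEPOINT and unit not in SURROGATES


# Codepoints 0..0x10FFFF inclusive, minus the 2048 surrogates.
valid_count = (MAX_CODEPOINT + 1) - len(SURROGATES)
ratio = TOTAL_PATTERNS / valid_count

print(f"32-bit patterns:   {TOTAL_PATTERNS:,}")
print(f"valid UTF-32 units: {valid_count:,}")
print(f"ratio:              about {round(ratio):,} to 1")
```

So a decoder that accepts any 32-bit value is accepting a few thousand times more inputs than UTF-32 permits, which is roughly the UCS-4 versus UTF-32 confusion.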
I agree with @JdeBP that much of the blame for this vulnerability lies in user interfaces making the amazingly unwise decision to hide this text from the user (not that assigning blame is particularly meaningful: if an attack exists, it exists, and PUA abuse of all sorts is already out there, so…)