Streaming public key authenticated encryption with insider auth security

Note: this post will probably only really make sense to cryptography geeks.

In “When a KEM is not enough”, I described how to construct multi-recipient (public key) authenticated encryption. A naïve approach to this is vulnerable to insider forgeries: any recipient can construct a new message (to the same recipients) that appears to come from the original sender. For some applications this is fine, but for many it is not. Consider, for example, using such a scheme to create auth tokens for use at multiple endpoints: A and B. Alice gets an auth token for accessing endpoints A and B and it is encrypted and authenticated using the scheme. The problem is, as soon as Alice presents this auth token to endpoint A, that endpoint (if compromised or malicious) can use it to construct a new auth token to access endpoint B, with any permissions it likes. This is a big problem IMO.

I presented a couple of solutions to this problem in the original blog post. The most straightforward is to sign the entire message, providing non-repudiation. This works, but as I pointed out in “Digital signatures and how to avoid them”, signature schemes have lots of downsides and unintended consequences. So I developed a weaker notion of “insider non-repudiation”, and a scheme that achieves it: we use a compactly-committing symmetric authenticated encryption scheme to encrypt the message body, and then include the authentication tag as additional authenticated data when wrapping the data encryption key for each recipient. This prevents insider forgeries, but without the hammer of full-blown outsider non-repudiation and the problems it brings.

I recently got involved in a discussion on Mastodon about adding authenticated encryption to Age (a topic I’ve previously written about), where abacabadabacaba pointed out that my scheme seems incompatible with streaming encryption and decryption, which is important in Age use-cases as it is often used to encrypt large files. Age supports streaming for unauthenticated encryption, so it would be useful to preserve this for authenticated encryption too. Doing this with signatures is fairly straightforward: just sign each “chunk” individually. A subtlety is that you also need to sign a chunk counter and a “last chunk” bit to prevent reordering and truncation, but as abacabadabacaba points out these bits are already in Age, so it’s not too hard. But can you do the same without signatures? Yes, you can, and efficiently too. In this post I’ll show how.

One way of thinking about the scheme I described in my previous blog post is as a kind of designated-verifier signature scheme. (I don’t hugely like this term, but it’s useful here). That is, we can view the combination of the committing MAC and authenticated KEM as a kind of signature scheme where the signature can only be verified by recipients chosen by the sender, not by third parties. If we take that perspective, then it becomes clear that we can do exactly the same as with a normal signature scheme: simply sign each chunk of the message separately, and include a chunk counter and last-chunk marker in the signature.

How does this work in practice?

Firstly, we generate a fresh random data encryption key (DEK) for the message. This is shared between all recipients. We then use this DEK to encrypt each chunk of the message separately using our compactly-committing AEAD. To prevent chunk reordering or truncation we can use the same method as Age: Rogaway’s STREAM construction, which effectively just encodes the chunk counter and last-chunk bit into the AEAD nonce. (Personally, I prefer using a symmetric ratchet instead, but that’s for another post). This will produce a compactly committing tag for each chunk—typically 16 bytes per chunk (or 32 bytes if we care about malicious senders).
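To make the STREAM nonce encoding concrete, here is a minimal sketch assuming a 12-byte AEAD nonce (as with ChaCha20-Poly1305): an 11-byte big-endian chunk counter followed by a one-byte last-chunk flag. This matches the general shape Age uses, but treat the exact layout as an illustration rather than a normative format.

```python
def stream_nonce(chunk_counter: int, last_chunk: bool) -> bytes:
    """STREAM-style nonce: 11-byte big-endian chunk counter plus a one-byte
    'last chunk' flag, giving each chunk a distinct 12-byte AEAD nonce and
    marking the final chunk to prevent truncation."""
    if chunk_counter >= 1 << 88:
        raise ValueError("chunk counter exhausted")
    counter = chunk_counter.to_bytes(11, "big")
    return counter + (b"\x01" if last_chunk else b"\x00")

# Each chunk gets a unique nonce; only the final chunk sets the flag byte.
assert stream_nonce(0, False) != stream_nonce(0, True)
assert len(stream_nonce(2, True)) == 12
```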

The original scheme I proposed then fed this tag (of which there was only 1) as associated data when wrapping the DEK for each recipient using an authenticated key-wrap algorithm and a per-recipient wrapping-key derived from an authenticated KEM. If the DEK is 32 bytes, and the key-wrapping algorithm produces a 16-byte tag then this outputs 48 bytes per recipient. We can do exactly the same thing for the new scheme, but we only feed the tag from the first chunk as the associated data, producing wrapped keys that commit to the first chunk only.

We then simply repeat the process for each subsequent chunk, but as the DEK is unchanged we can leave the key-wrap payload empty: effectively just computing a MAC over the commitment for each chunk in turn. In our example, this will produce just a 16-byte output per recipient for each chunk. If we compare this to the typical signature schemes that would otherwise be used for signing chunks, we can fit 4 recipient commitments in the same space as a single Ed25519 signature (64 bytes), or 16 in the same space as an RSA-2048 signature.
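The wrapping and per-chunk commitment steps can be sketched as follows, using HMAC-SHA256 as a stand-in for a real misuse-resistant key-wrap. The function names and the SIV-style construction here are illustrative assumptions, not a concrete standard.

```python
import hashlib
import hmac

def wrap_dek(wrap_key: bytes, dek: bytes, chunk_tag: bytes) -> bytes:
    """SIV-style wrap sketch: MAC the associated data (a chunk's committing
    tag) together with the DEK to get a 16-byte SIV, then encrypt the DEK
    under a keystream derived from that SIV."""
    siv = hmac.new(wrap_key, chunk_tag + dek, hashlib.sha256).digest()[:16]
    keystream = hashlib.sha256(wrap_key + siv).digest()[: len(dek)]
    encrypted_dek = bytes(a ^ b for a, b in zip(dek, keystream))
    return siv + encrypted_dek  # 16 + 32 = 48 bytes per recipient

def commit_chunk(wrap_key: bytes, chunk_tag: bytes) -> bytes:
    """Subsequent chunks: the same computation with an empty DEK, leaving
    just a 16-byte MAC committing to that chunk's tag."""
    return wrap_dek(wrap_key, b"", chunk_tag)
```

(The DEK is limited to 32 bytes here only because a single SHA-256 output serves as the keystream; a real key-wrap algorithm would use a proper cipher.)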

To support such a scheme, the interface of our KEM would need to change to include a new operation that produces an intermediate commitment to a particular chunk tag, with an indication of whether it is the last tag or not. The KEM is then free to reuse the shared secrets derived for each recipient, avoiding the overhead of computing new ones for each chunk. This is an efficiency gain over using a normal digital signature for each chunk.

Here is a sketch of what the overall process would look like, to hopefully clarify the ideas presented. Alice is sending a message to Bob and Charlie. The message consists of three “chunks” and is using an authenticated KEM based on X25519.

  • First, Alice generates a random 32-byte DEK and uses it to encrypt the message, producing tags t1, t2, and t3.
  • Alice generates a random ephemeral X25519 keypair: (esk, epk).
  • She computes a shared secret with Bob, ssb, via something like HPKE’s DH-AKEM.
  • Likewise, she computes a shared secret with Charlie, ssc.
  • Alice then wraps the DEK from step 1 for Bob and Charlie, using a key-wrap algorithm like AES in SIV mode (keyed with ssb and then ssc), including t1 as additional authenticated data. She outputs the two wrapped keys plus the ephemeral public key (epk) as the encapsulated key blob. This will be 32 bytes for the epk, plus 48 bytes for each key blob (one for Bob, another for Charlie), giving 128 bytes total.
  • She then calls a new “commit” operation on the KEM for each subsequent chunk tag: i.e., t2 and t3. This commit operation performs the same computation as the previous (wrapping) step, but with an empty DEK, outputting just a 16-byte SIV for each recipient, for a total of 32 bytes per chunk. These commitment blocks can then be appended to each chunk. (In fact, they can replace the normal AEAD tag, saving even more space).
  • The total space taken is then 128 + 32 + 32 = 192 bytes, and we can remove the 3 original 16-byte AEAD tags, giving a net overhead of just 144 bytes. Compare that to signing each chunk with Ed25519, which would need 192 bytes, or RSA-2048, which needs 768 bytes.
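The byte-counting in the steps above can be checked mechanically (sizes as assumed in the example: 32-byte DEK, 16-byte SIV tags, 32-byte X25519 public key):

```python
# Byte-counting for the example: 2 recipients (Bob, Charlie), 3 chunks.
recipients, chunks = 2, 3
epk = 32                                  # ephemeral X25519 public key
wrapped_key = 32 + 16                     # encrypted DEK + 16-byte SIV tag
header = epk + recipients * wrapped_key   # the encapsulated key blob
per_chunk = recipients * 16               # one 16-byte SIV per recipient
commitments = (chunks - 1) * per_chunk    # for t2 and t3
total = header + commitments
net_overhead = total - chunks * 16        # commitments replace the AEAD tags

assert header == 128
assert total == 128 + 32 + 32 == 192
assert net_overhead == 144
assert chunks * 64 == 192                 # Ed25519: one 64-byte sig per chunk
```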

    Decryption then performs the obvious inverse: decrypt each chunk using the decapsulated DEK, recompute the committing MAC tag, and verify that the commitment block at the end of each subsequent chunk matches.

    This is still very much a sketch, and needs to be reviewed and fleshed out more. But I believe this is quite a neat scheme that achieves streaming authenticated encryption without the need for tricksy little signatureses, and is potentially much more efficient too.

    #authenticatedEncryption #cryptography #publicKeyEncryption #streamingEncryption

    When a KEM is not enough


    Neil Madden

    Wikipedia’s definition of a digital signature is:

    A digital signature is a mathematical scheme for verifying the authenticity of digital messages or documents. A valid digital signature on a message gives a recipient confidence that the message came from a sender known to the recipient.

    —Wikipedia

    They also have a handy diagram of the process by which digital signatures are created and verified:

    Source: https://commons.m.wikimedia.org/wiki/File:Private_key_signing.svg#mw-jump-to-license (CC-BY-SA)

    Alice signs a message using her private key and Bob can then verify that the message came from Alice, and hasn’t been tampered with, using her public key. This all seems straightforward and uncomplicated and is probably most developers’ view of what signatures are for and how they should be used. This has led to the widespread use of signatures for all kinds of things: validating software updates, authenticating SSL connections, and so on.

    But cryptographers have a different way of looking at digital signatures that has some surprising aspects. This more advanced way of thinking about digital signatures can tell us a lot about appropriate, and inappropriate, use-cases.

    Identification protocols

    There are several ways to build secure signature schemes. Although you might immediately think of RSA, the scheme perhaps most beloved by cryptographers is Schnorr signatures. These form the basis of modern EdDSA signatures, and also (in heavily altered form) DSA/ECDSA.

    The story of Schnorr signatures starts not with a signature scheme, but instead with an interactive identification protocol. An identification protocol is a way to prove who you are (the “prover”) to some verification service (the “verifier”). Think logging into a website. But note that the protocol is only concerned with proving who you are, not in establishing a secure session or anything like that.

    There are a whole load of different ways to do this, like sending a username and password or something like WebAuthn/passkeys (an ironic mention that we’ll come back to later). One particularly elegant protocol is known as Schnorr’s protocol. It’s elegant because it is simple and only relies on basic security conjectures that are widely accepted, and it also has some nice properties that we’ll mention shortly.

    The basic structure of the protocol involves three phases: Commit-Challenge-Response. If you are familiar with challenge-response authentication protocols this just adds an additional commitment message at the start.

    Alice (for it is she!) wants to prove to Bob who she is. Alice already has a long-term private key, a, and Bob already has the corresponding public key, A. These keys are in a Diffie-Hellman-like finite field or elliptic curve group, so we can say A = g^a mod p where g is a generator and p is the prime modulus of the group. The protocol then works like this:

  • Alice generates a random ephemeral key, r, and the corresponding public key R = g^r mod p. She sends R to Bob as the commitment.
  • Bob stores R and generates a random challenge, c and sends that to Alice.
  • Alice computes s = ac + r and sends that back to Bob as the response.
  • Finally, Bob checks if g^s = A^c * R (mod p). If it is then Alice has successfully authenticated, otherwise it’s an imposter. The reason this works is that g^s = g^(ac + r) and A^c * R = (g^a)^c * g^r = g^(ac + r) too. Why it’s secure is another topic for another day.

    Don’t worry if you don’t understand all this. I’ll probably do a blog post about Schnorr identification at some point, but there are plenty of explainers online if you want to understand it. For now, just accept that this is indeed a secure identification scheme. It has some nice properties too.
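The Commit-Challenge-Response flow above can be sketched in a few lines, using a deliberately tiny toy group (p = 23, q = 11, g = 2, where g generates the order-q subgroup); a real deployment would use a ~256-bit elliptic-curve group.

```python
import secrets

# Toy Schnorr group: p = 23 is a safe prime, q = 11, and g = 2 generates
# the order-q subgroup.  Purely illustrative; far too small to be secure.
p, q, g = 23, 11, 2

a = secrets.randbelow(q)        # Alice's long-term private key
A = pow(g, a, p)                # her public key

# Commit: Alice sends R = g^r
r = secrets.randbelow(q)
R = pow(g, r, p)
# Challenge: Bob sends a random c
c = secrets.randbelow(q)
# Response: Alice sends s = a*c + r (mod q)
s = (a * c + r) % q
# Verify: g^s == A^c * R (mod p)
assert pow(g, s, p) == (pow(A, c, p) * R) % p
```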

    One is that it is a (honest-verifier) zero knowledge proof of knowledge (of the private key). That means that an observer watching Alice authenticate, and the verifier themselves, learn nothing at all about Alice’s private key from watching those runs, but the verifier is nonetheless convinced that Alice knows it.

    This is because it is easy to create valid runs of the protocol for any private key by simply working backwards rather than forwards, starting with a response and calculating the challenge and commitment that fit that response. Anyone can do this without needing to know anything about the private key. That is, for any given challenge you can find a commitment for which it is easy to compute the correct response. (What they cannot do is correctly answer a random challenge after they’ve already sent a commitment). So they learn no information from observing a genuine interaction.
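The “working backwards” trick is easy to demonstrate: pick the response and challenge first, then solve for the commitment that makes the transcript verify, using only the public key. (Same toy group as before; purely illustrative.)

```python
import secrets

p, q, g = 23, 11, 2                     # toy Schnorr group
A = pow(g, secrets.randbelow(q), p)     # Alice's public key; the simulator
                                        # never uses the private exponent

# Simulate a transcript backwards:
s = secrets.randbelow(q)                # pick the response first
c = secrets.randbelow(q)                # then the challenge
# Solve for the commitment R = g^s * A^(-c) mod p:
R = (pow(g, s, p) * pow(A, (-c) % q, p)) % p

# The simulated transcript (R, c, s) passes verification:
assert pow(g, s, p) == (pow(A, c, p) * R) % p
```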

    Fiat-Shamir

    So what does this identification protocol have to do with digital signatures? The answer is that there is a process known as the Fiat-Shamir heuristic by which you can automatically transform certain interactive identification protocols into a non-interactive signature scheme. You can’t do this for every protocol, only ones that have a certain structure, but Schnorr identification meets the criteria. The resulting signature scheme is known, amazingly, as the Schnorr signature scheme.

    You may be relieved to hear that the Fiat-Shamir transformation is incredibly simple. We basically just replace the challenge part of the protocol with a cryptographic hash function, computed over the message we want to sign and the commitment public key: c = H(R, m).

    That’s it. The signature is then just the pair (R, s).

    Note that Bob is now not needed in the process at all and Alice can compute this all herself. To validate the signature, Bob (or anyone else) recomputes c by hashing the message and R and then performs the verification step just as in the identification protocol.
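Putting Fiat-Shamir together with the toy group from before gives a complete (toy!) Schnorr signature scheme. This sketch omits the critical security checks a real implementation needs, and the group is far too small to be secure.

```python
import hashlib
import secrets

p, q, g = 23, 11, 2   # toy group; real schemes use ~256-bit groups

def H(R: int, m: bytes) -> int:
    """Fiat-Shamir: a hash of the commitment and message replaces
    Bob's random challenge."""
    h = hashlib.sha256(R.to_bytes(32, "big") + m).digest()
    return int.from_bytes(h, "big") % q

def sign(a: int, m: bytes) -> tuple[int, int]:
    r = secrets.randbelow(q)
    R = pow(g, r, p)
    c = H(R, m)
    return R, (a * c + r) % q           # the signature is the pair (R, s)

def verify(A: int, m: bytes, sig: tuple[int, int]) -> bool:
    R, s = sig
    c = H(R, m)                         # anyone can recompute the challenge
    return pow(g, s, p) == (pow(A, c, p) * R) % p

a = secrets.randbelow(q)
A = pow(g, a, p)
assert verify(A, b"hello", sign(a, b"hello"))
```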

    Schnorr signatures built this way are secure (so long as you add some critical security checks!) and efficient. The EdDSA signature scheme is essentially just a modern incarnation of Schnorr with a few tweaks.

    What does this tell us about appropriate uses of signatures

    The way I’ve just presented Schnorr signatures and Fiat-Shamir is the way they are usually presented in cryptography textbooks. We start with an identification protocol, perform a simple transformation, and end with a secure signature scheme. Happy days! These textbooks then usually move on to all the ways you can use signatures and never mention identification protocols again. But the transformation isn’t an entirely positive process: a lot was lost in translation!

    There are many useful aspects of interactive identification protocols that are lost by signature schemes:

    • A protocol run is only meaningful for the two parties involved in the interaction (Alice and Bob). By contrast a signature is equally valid for everyone.
    • A protocol run is specific to a given point in time. Alice’s response is to a specific challenge issued by Bob just prior. A signature can be verified at any time.

    These points may sound like bonuses for signature schemes, but they are actually drawbacks in many cases. Signatures are often used for authentication, where we actually want things to be tied to a specific interaction. This lack of context in signatures is why standards like JWT have to add lots of explicit statements such as audience and issuer checks to ensure the JWT came from the expected source and arrived at the intended destination, and expiry information or unique identifiers (that have to be remembered) to prevent replay attacks. A significant proportion of JWT vulnerabilities in the wild are caused by developers forgetting to perform these checks.

    WebAuthn is another example of this phenomenon. On paper it is a textbook case of an identification protocol. But because it is built on top of digital signatures it requires adding a whole load of “contextual bindings” for similar reasons to JWTs. Ironically, the most widely used WebAuthn signature algorithm, ECDSA, is itself a Schnorr-ish scheme.

    TLS also uses signatures for what is essentially an identification protocol, and similarly has had a range of bugs due to insufficient context binding information being included in the signed data. (SSL also uses signatures for verifying certificates, which is IMO a perfectly good use of the technology. Certificates are exactly a case of where you want to convert an interactive protocol into a non-interactive one. But then again we also do an interactive protocol (DNS) in that case anyway :shrug:).

    In short, an awful lot of uses of digital signatures are actually identification schemes of one form or another and would be better off using an actual identification scheme. But that doesn’t mean using something like Schnorr’s protocol! There are actually better alternatives that I’ll come back to at the end.

    Special Soundness: fragility by design

    Before I look at alternatives, I want to point out that pretty much all in-use signature schemes are extremely fragile in practice. The zero-knowledge security of Schnorr identification is based on it having a property called special soundness. Special soundness essentially says that if Alice accidentally reuses the same commitment (R) for two runs of the protocol, then any observer can recover her private key.

    This sounds like an incredibly fragile notion to build into your security protocol! If I accidentally reuse this random value then I leak my entire private key??! And in fact it is: such nonce-reuse bugs are extremely common in deployed signature systems, and have led to the compromise of many private keys (e.g. the PlayStation 3, various Bitcoin wallets, etc.).
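The extraction is just two linear equations. Here is a sketch of how an observer recovers the private key from two transcripts that reuse the same commitment (toy group again):

```python
import secrets

p, q, g = 23, 11, 2                # toy Schnorr group

a = secrets.randbelow(q - 1) + 1   # victim's private key
r = secrets.randbelow(q)           # the reused "random" nonce

# Two protocol runs (or signatures) with the SAME commitment R = g^r,
# but different challenges c1 != c2:
c1, c2 = 3, 7
s1 = (a * c1 + r) % q
s2 = (a * c2 + r) % q

# Subtracting eliminates r:  s1 - s2 = a * (c1 - c2)  (mod q),
# so anyone can solve for a:
recovered = ((s1 - s2) * pow((c1 - c2) % q, -1, q)) % q
assert recovered == a
```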

    But despite its fragility, this notion of special soundness is crucial to the security of many signature systems. They are truly a cursed technology!

    To solve this problem, some implementations and newer standards like EdDSA use deterministic commitments, which are based on a hash of the private key and the message. This ensures that the commitment will only ever be the same if the message is identical: preventing the private key from being recovered. Unfortunately, such schemes turned out to be more susceptible to fault injection attacks (a much less scalable or general attack vector), and so now there are “hedged” schemes that inject a bit of randomness back into the hash. It’s cursed turtles all the way down.

    If your answer to this is to go back to good old RSA signatures, don’t be fooled. There are plenty of ways to blow your foot off using old faithful, but that’s for another post.

    Did you want non-repudiation with that?

    Another way that signatures cause issues is that they are too powerful for the job they are used for. You just wanted to authenticate that an email came from a legitimate server, but now you are providing irrefutable proof of the provenance of leaked private communications. Oops!

    Signatures are very much the hammer of cryptographic primitives. As well as authenticating a message, they also provide third-party verifiability and (part of) non-repudiation.

    You don’t need to explicitly want anonymity or deniability to understand that these strong security properties can have damaging and unforeseen side-effects. Non-repudiation should never be the default in open systems.

    I could go on. From the fact that there are basically zero acceptable post-quantum signature schemes (all way too large or too risky), to issues with non-canonical signatures and cofactors and on and on. The problems of signature schemes never seem to end.

    What to use instead?

    Ok, so if signatures are so bad, what can I use instead?

    Firstly, if you can get away with using a simple shared secret scheme like HMAC, then do so. In contrast to public key crypto, HMAC is possibly the most robust crypto primitive ever invented. You’d have to go really far out of your way to screw up HMAC. (I mean, there are timing attacks and that time that Bouncy Castle confused bits and bytes and used 16-bit HMAC keys, so still do pay attention a little bit…)
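For completeness, the robust baseline looks like this in Python's standard library; the one real footgun is forgetting the constant-time comparison (the key and message here are placeholders):

```python
import hashlib
import hmac

key = b"\x00" * 32   # in practice: a random 32-byte shared secret
msg = b"amount=100&to=alice"

tag = hmac.new(key, msg, hashlib.sha256).digest()

# Verify with a constant-time comparison to avoid timing attacks:
assert hmac.compare_digest(tag, hmac.new(key, msg, hashlib.sha256).digest())

# A tampered message fails verification:
forged = hmac.new(key, b"amount=999&to=eve", hashlib.sha256).digest()
assert not hmac.compare_digest(tag, forged)
```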

    If you need public key crypto, then… still use HMAC. Use an authenticated KEM with X25519 to generate a shared secret and use that with HMAC to authenticate your message. This is essentially public key authenticated encryption without the actual encryption. (Some people mistakenly refer to such schemes as designated verifier signatures, but they are not).

    Signatures are good for software/firmware updates and pretty terrible for everything else.

    https://neilmadden.blog/2024/09/18/digital-signatures-and-how-to-avoid-them/

    #authenticatedEncryption #cryptography #misuseResistance #signatures


    Galois/Counter Mode, better known as GCM, is a very popular way of encrypting data with AES to provide authenticated encryption with associated data (AEAD). Phew, that’s a lot of terminology. Let’s just say that AEAD is the gold standard for data encryption, and AES-GCM is one of the most popular choices—by far the most widely used encryption algorithm on the internet.

    Whenever you encrypt a message with GCM, you need to use a unique nonce (number used once). For protocols like TLS, this nonce is generated deterministically: essentially a counter is initialised when the connection is first established and incremented after every message. This is mostly fine for stateful protocols like TLS (except when it isn’t), but is incredibly hard to do in a stateless protocol, where servers and clients may be coming and going, crashing, resuming VMs, etc. Reusing a nonce even once for GCM is absolutely catastrophic, as an observer can then trivially recover the authentication sub-key and probably the message content too.

    So the solution that most people use is to use random nonces, created using a CSPRNG. The problem is that GCM’s nonce is only 96 bits long, which means that the probability of two random nonces being the same (a collision) approaches 50% after around 2^48 messages. 50% is way too high for comfort, so NIST advises to keep the chance of a collision to less than 2^-32, or about 1 in 4 billion. That limit comes after 2^32 messages for GCM: 4 billion again, give or take, and that is the limit NIST imposes for GCM with a random nonce. That sounds like a lot, and for many uses it is, but for some high-frequency usage cases, like Google’s front-end servers, that limit can be reached quite quickly. Google have stated (p. 4) that when under DDoS attack, their servers may have to produce “several hundred million [encrypted tokens] per second”. Under that load, they would hit the 2^32 limit in 43 seconds or less! The solution they designed is described in that linked paper: AES-GCM-SIV, which is able to tolerate some number of nonce collisions, but under a weaker notion of security that is only really applicable to that use-case (where the data being encrypted is itself random).
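These figures all fall out of the standard birthday bound: after n random k-bit nonces, the collision probability is roughly n²/2^(k+1). A quick check (taking 10^8 messages/second as the ballpark rate from Google's paper):

```python
k = 96                                   # GCM nonce size in bits

def messages_for(log2_prob: float) -> float:
    """Approximate number of messages before the collision probability
    reaches 2^log2_prob, solving n^2 / 2^(k+1) = 2^log2_prob for n."""
    return 2 ** ((k + 1 + log2_prob) / 2)

assert messages_for(-1) == 2 ** 48              # ~50% at about 2^48 messages
assert 2 ** 32 <= messages_for(-32) < 2 ** 33   # NIST's 2^-32 limit: ~2^32

# "Several hundred million per second" exhausts 2^32 in under a minute:
assert 2 ** 32 / 1e8 < 43
```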

    But is GCM limited to a 96-bit nonce in the first place? The answer turns out to be no (with some caveats).


    Where does the 96-bit limit come from? GCM is, as the name suggests, based on a simpler mode called Counter Mode (or CTR for short). In CTR mode, a block cipher like AES is used to encrypt a sequence of incrementing counter values: 0, 1, 2, 3, … and so on. This produces a sequence of pseudorandom data (the keystream) that is then bitwise-XORed with the message to encrypt it. AES is a 128-bit block cipher, so the input counter is encoded as 128 bits, or 16 bytes, and it produces 16 bytes of output each time. This means that the counter needs to be incremented for each (16-byte) block of data encrypted rather than each message. To ensure that no two blocks ever collide, GCM splits the 128-bit counter into two parts: a 96-bit per-message nonce, and a 32-bit block counter. Each message uses a unique nonce, and the block counter is reset to 1 each time (the all-zero counter is used internally) and incremented for each block, allowing a maximum of 2^32 × 16 bytes = 64 GiB per message. This allows the application to simply increment the nonce for each message, without having to keep track of how many blocks of data have been encrypted.

    That’s fine for deterministic nonces, but for random nonces segregating the counter space in this way is counter-productive (pun intended!). Unless you are encrypting really large messages (approaching the 64 GiB limit), it is generally better to randomise the entire 128-bit counter space. That is, you pick a random starting point in the counter space and then increment that initial counter for each block of the message. This will create “runs” of used-up nonces, spread uniformly around the full 2^128 space. Although it may initially seem like this would make collisions more likely compared to having a separate block counter, in fact 2^128 is so enormously bigger than 2^96 that the chance of two “runs” of counters overlapping is vanishingly small until you’ve encrypted a very large number of blocks. In fact, you would hit NIST’s 2^-32 probability limit after encrypting around 2^48 blocks (about 281 trillion). For small messages, of just a few blocks in length, as in Google’s case, that allows you to encrypt close to 2^48 messages. For example, suppose those messages are all <= 128 bytes in length. In that case, you could encrypt 2^45 messages before you hit the limit, which means Google would only need to change the key every 4 days or so, even under large-scale DDoS attack.
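A quick sanity check of those numbers, counting every encrypted block as one point in the 2^128 counter space and applying the same rough birthday bound (ignoring small constant factors):

```python
limit = 2 ** -32                      # NIST's collision-probability limit

blocks = 2 ** 48                      # total blocks before hitting the limit
assert blocks ** 2 / 2 ** 129 <= limit

# 128-byte messages are 8 AES blocks each:
messages = blocks // 8
assert messages == 2 ** 45

# At "several hundred million per second" (say 10^8), that key lasts:
seconds = messages / 1e8
assert 4 <= seconds / 86400 < 5       # roughly 4 days
```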

    OK, very well, you might be saying, but GCM doesn’t support randomising the entire 128-bit counter space, so this is all academic, isn’t it? Well, it turns out that it does. GCM has the strange property that it allows the nonce (IV) to be any length, not just exactly 96 bits. If it is exactly 96 bits, then the initial counter value becomes the nonce concatenated with the 32-bit initial block counter (set to 1). If it is not 96 bits, then the value is first hashed with GHASH, GCM’s universal hash function, and the full 128-bit output becomes the initial counter value:

    Source: wikipedia

    However, we can’t just assume that the output of GHASH is ideal with respect to collisions, and in fact the analysis in this paper shows that it is not. In the worst case, where messages can be up to 2^32 − 2 blocks long, there is an additional factor of 2^22 in the security advantage to an attacker. However, if we restrict ourselves to shorter messages then the impact is less severe. For example, taking our example of 128-byte messages and 16-byte nonces, the probability of a collision after applying GHASH is ≤ 2^9/2^128 = 2^-119. In that case, you can encrypt about 2^43.5 messages (about 12.4 trillion) before you hit NIST’s limit, allowing Google to reuse a key for about 34.6 hours.

    Even in the worst case, with messages up to the maximum 64 GiB in size, this analysis shows that you can still encrypt 2^36 messages with a random 128-bit nonce: 16 times as many as with a random 96-bit nonce. For more realistic message sizes, the advantage is greater.

    Update: /u/wearingdepends on Reddit points out a later paper that reduces that worst-case factor from 2^22 to just 32 (i.e., 2^5). So even in this worst case (with very large messages), you can encrypt around 2^44.5 messages (almost 25 trillion), giving almost 70 hours of Google-scale DDoS handling.

    So, technically you can use random nonces with AES-GCM with better bounds than folklore would have you believe. Would I recommend it? No. Even if it technically works, it’s a highly non-standard usage and against the letter of NIST’s recommendations. I wouldn’t spend innovation tokens on this kind of thing. Frankly, if I needed to do something like this, I would derive a fresh key for each message from a long nonce, in the way that XSalsa20 does: using a real pseudorandom function. Libsodium even exposes this functionality as a general-purpose facility.

    https://neilmadden.blog/2024/05/23/galois-counter-mode-and-random-nonces/

    #aesGcm #authenticatedEncryption #cryptography
