Didn't sleep. Attacked cipher instead.
Won against cipher on technical grounds (proved that ridiculously powerful attacker is powerful). Then slept.
Ugh, I guess that means I have to write a paper now

This email could have been a paper. And maybe it will be.

https://groups.google.com/a/list.nist.gov/g/pqc-forum/c/8cNYhg23B9k

ML-KEM is not MAL-BIND-K-CT

And now it's a paper as well, expanded on the original email:

https://ia.cr/2024/523

Unbindable Kemmy Schmidt: ML-KEM is neither MAL-BIND-K-CT nor MAL-BIND-K-PK

In "Keeping up with the KEMs" Cremers et al. introduced various binding models for KEMs. The authors show that ML-KEM is LEAK-BIND-K-CT and LEAK-BIND-K-PK, i.e. binding the ciphertext and the public key in the case of an adversary having access, but not being able to manipulate the key material. They further conjecture that ML-KEM also has MAL-BIND-K-PK, but not MAL-BIND-K-CT, the binding of public key or ciphertext to the shared secret in the case of an attacker with the ability to manipulate the key material. This short paper demonstrates that ML-KEM does neither have MAL-BIND-K-CT nor MAL-BIND-K-PK, due to the attacker being able to produce mal-formed private keys, giving concrete examples for both. We also suggest mitigations, and sketch a proof for binding both ciphertext and public key when the attacker is not able to manipulate the private key as liberally.

IACR Cryptology ePrint Archive
@sophieschmieg it's not my field, but I love the paper title
@sophieschmieg Potentially-stupid question from a non-cryptographer: since ML-KEM encapsulation is non-deterministic already (is specified to use a random number generator as part of the input), how is this different from an honest encapsulator coincidentally generating the same shared secret?
@sophieschmieg Partial answer to my own question: the amount of randomness is the same as the length of the shared secret, so if the encoding is always value-preserving, there won’t be any such collisions normally. I don’t know myself if Kyber/ML-KEM preserves all the entropy like that but it would make sense if it did.
@jrose for ML-KEM, the entropy going into the encaps function is concatenated with the hash of the public key (or in the case of this attack, the hash of a different public key), and the concatenated string is then hashed with SHA3-512. The first 32 bytes of that hash are the shared secret, the last 32 bytes are used as encryption entropy to encrypt the original entropy with a CPA scheme.
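
A minimal sketch of that derivation step, assuming SHA3-256 as the hash applied to the public key (as in FIPS 203). The function name `derive_key_and_coins` and the random placeholder key bytes are made up for illustration, and the CPA encryption itself is left out:

```python
import hashlib
import os

def derive_key_and_coins(m: bytes, ek: bytes) -> tuple[bytes, bytes]:
    """Split SHA3-512(m || H(ek)) into (shared secret, encryption entropy)."""
    h = hashlib.sha3_256(ek).digest()      # hash of the public key, 32 bytes
    g = hashlib.sha3_512(m + h).digest()   # 64-byte digest of the concatenation
    return g[:32], g[32:]                  # first half: secret, second half: coins

m = os.urandom(32)     # the encapsulator's entropy
ek = os.urandom(1184)  # placeholder bytes of ML-KEM-768 key length, not a valid key
K, r = derive_key_and_coins(m, ek)
print(K.hex())
```

Note that the public key only enters through its 32-byte hash, which is why substituting the hash of a different public key changes the derived secret without touching anything else.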

@jrose the trick is that on decapsulation, you recover the entropy used by the encapsulator, and can use that to rerun the encapsulation algorithm. If the encapsulator was honest, you return the shared key; if the encapsulator lied, you return pseudorandom garbage.

So while the original scheme could be attacked via adaptively chosen ciphertexts, this transform forces the attacker to not have any choice in the ciphertexts they produce, making the whole construction IND-CCA secure.
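
The re-encryption check described above can be sketched roughly as follows. This is a toy under loud assumptions: `toy_encrypt` and `toy_decrypt` are hypothetical stand-ins that use one shared key for both directions, so they are not public-key encryption and not the real CPA scheme; only the control flow of "decrypt, re-derive, re-encrypt, compare" is the point. The rejection value SHAKE256(z || c) follows FIPS 203; all other names are invented for illustration.

```python
import hashlib
import hmac
import os

def G(data: bytes) -> tuple[bytes, bytes]:
    d = hashlib.sha3_512(data).digest()
    return d[:32], d[32:]

def toy_encrypt(key: bytes, m: bytes, coins: bytes) -> bytes:
    # Toy stand-in: deterministic once `coins` is fixed, like the real scheme.
    nonce = hashlib.sha3_256(coins).digest()
    pad = hashlib.shake_256(key + nonce).digest(len(m))
    return nonce + bytes(a ^ b for a, b in zip(m, pad))

def toy_decrypt(key: bytes, c: bytes) -> bytes:
    nonce, body = c[:32], c[32:]
    pad = hashlib.shake_256(key + nonce).digest(len(body))
    return bytes(a ^ b for a, b in zip(body, pad))

def encaps(key: bytes, h: bytes) -> tuple[bytes, bytes]:
    m = os.urandom(32)
    K, r = G(m + h)                  # secret and coins both derived from m
    return K, toy_encrypt(key, m, r)

def decaps(key: bytes, h: bytes, z: bytes, c: bytes) -> bytes:
    m2 = toy_decrypt(key, c)         # recover the encapsulator's entropy
    K2, r2 = G(m2 + h)               # rerun the derivation
    c2 = toy_encrypt(key, m2, r2)    # rerun the encryption
    if hmac.compare_digest(c, c2):   # honest encapsulator: ciphertexts match
        return K2
    return hashlib.shake_256(z + c).digest(32)  # otherwise pseudorandom garbage

key = os.urandom(32)
h = hashlib.sha3_256(b"placeholder public key bytes").digest()
z = os.urandom(32)                   # secret rejection value stored in the private key
K, c = encaps(key, h)
print(decaps(key, h, z, c) == K)
```

In this sketch `h` is passed to `decaps` separately from the key, mirroring the fact that the decapsulator uses a stored hash rather than recomputing it.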

@sophieschmieg Ah, and the last piece here is that recomputing the full public key from the secret key is unnecessarily expensive, so the hash is cached instead? Which means it can be replaced without disturbing anything on decapsulation (well, except that it can no longer receive ciphertexts on the original public key). Do I have that right?
@jrose you can't even recompute the public key from the private key in Kyber, so the cache is necessary (you compute t = As + e; t and A are the public key, s is the private key, e is discarded). The cache of the hash of the public key is not necessary, though; that's just a performance optimization.
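
A toy illustration of why t cannot be recomputed from s alone. It assumes plain integer vectors modulo q = 3329 instead of Kyber's polynomial ring, a made-up dimension of 4, and Python's non-cryptographic `random` module, so it only shows the arithmetic, not the real key generation:

```python
import random

Q = 3329  # the Kyber / ML-KEM modulus
N = 4     # toy dimension, chosen only for illustration

def small() -> int:
    return random.randint(-2, 2)  # small secret / noise coefficient

def mat_vec(A, s):
    return [sum(a * x for a, x in zip(row, s)) % Q for row in A]

A = [[random.randrange(Q) for _ in range(N)] for _ in range(N)]
s = [small() for _ in range(N)]  # private key
e = [small() for _ in range(N)]  # noise, discarded after key generation
t = [(u + v) % Q for u, v in zip(mat_vec(A, s), e)]  # t = As + e

# Someone holding only A and s can recompute As, but it is off from t by e.
diff = [(ti - ui) % Q for ti, ui in zip(t, mat_vec(A, s))]
print(diff)
```

The printed difference is exactly e reduced modulo q, which is the information that is gone once e has been thrown away.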
@sophieschmieg I’m spoiled by entering cryptographic programming when EC keys were the recommended thing. Super flexible, super capable, super compact. Then I forget most systems just aren’t that general, often by design. Thanks for explaining!
@sophieschmieg s/now// You get to decide when.
@sophieschmieg Well, I want to know what the next step is after having basic knowledge of cryptography, and what job positions are currently available in the market.