@emenel i do not believe the final page of the triple ratchet is a good argument (they did two papers on it) https://eprint.iacr.org/2025/078.pdf the final final page (page 60) says "yeah we didn't mention we lose adversarial randomness, and yeah we could have done that with lattice, but the keys would be big and... well (it would mess up our benchmarks)"

@emenel the benchmarks which....are on the other last page https://eprint.iacr.org/2025/078.pdf

7.2 Effect of Chunk Encoding on Communication Costs

like sir. you can't say cost without a model. you are a cryptographer

@emenel i think it's incredibly irresponsible for something like signal, and idk why nobody can critique him. luckily, signal is the worst iteration of the tech
@emenel https://signal.org/blog/the-ecosystem-is-moving/ moxie's evil blog post which mentions internet standards as inspo
@emenel this blog post was sufficiently wrong as to inspire me to (1) write the p2p signal ~which is just gpg~ (2) document his crypto for him

@emenel he obviously misses the p2p possibility. he says "federated" because federated is very easy to capture (see: mastodon).

i realized double ratchet was doing a lot of stuff like a more standard network protocol => QUIC is just double ratchet

@emenel trevor perrin consulting. but it sucks and it's worse because IETF ensures backwards compatibility is MUST for insecure bullshit
@emenel so like. i think double ratchet sessions alone (https://codeberg.org/cosmicexplorer/grouplink/src/branch/main/cli/src/main.rs) can be sent over email. and this is not a trivial thing because you can very much use this over any plaintext channel
@emenel if you have (like in the terrible prototype) a random number generator, you can do this fucked up thing that gives you anonymity: make a new identity, send it within a message, and receive the same from your interlocutor. now you can both use the new public keys
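a toy sketch of that rekeying move -- key generation here is a deterministic stand-in, NOT real cryptography, and every name is hypothetical. the only point is the shape: mint a fresh keypair, announce its public half in-band, switch to it:

```rust
/// Stand-in public key; a real implementation would use curve25519.
#[derive(Clone, PartialEq, Debug)]
pub struct PublicKey(u64);

pub struct KeyPair {
    pub public: PublicKey,
    secret: u64,
}

/// Stand-in keygen: just a deterministic mixer, NOT cryptography.
pub fn gen_keypair(entropy: u64) -> KeyPair {
    KeyPair {
        public: PublicKey(entropy.wrapping_mul(6364136223846793005).wrapping_add(1)),
        secret: entropy,
    }
}

pub struct Party {
    pub identity: KeyPair,
}

impl Party {
    /// Mint a fresh identity, return its public half (to be sent to the
    /// interlocutor inside an already-encrypted message), and switch to
    /// it. The old identity is simply dropped.
    pub fn rotate(&mut self, entropy: u64) -> PublicKey {
        let fresh = gen_keypair(entropy);
        let announced = fresh.public.clone();
        self.identity = fresh;
        announced
    }
}
```

because the announcement travels inside the existing encrypted channel, nothing on the wire links the new public key to the old one.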
@emenel i am under the impression that this is the basic principle of onion routing
@emenel and like note to self that "rock solid double ratchet impl + curve25519 impl (which can be a dependency)" and making a wire protocol for that is imo the right place to work on since it'll be hard and have its own problems. there's also this whole thing where signal relies on a sqlite db, the prototype just appends to a file, and there will be a discussion around what security boundary to draw
@emenel grapheneos may have good advice here. in particular a double ratchet impl alone would be cool. what counts as an "impl" is nontrivial to pin down--jordan rose (who taught me lots of stuff for free when i tried to write docstrings and was wrong) did some impressive work to take rust trait definitions and convert them into app backends. i think that's the right approach (not tying it to a concrete serialization) but figuring out a way to specify this will be interesting
@emenel the sqlite db is not likely to be necessary. i'm not sure what would be the right answer to this at this level of abstraction. i think that will become clear from impl

@emenel the sqlite db for signal is because apps have restricted access to anything. what it provides is mapping from the byte serialization of a pubkey to the associated sessions. and in fact, i think i found that kinda wrong even for my prototype, since each session is distinct from the originating identity (bc each DH handshake is another set of completely new keypairs). in fact, i forget whether the identity key is even exposed at all........because i was also using the sealed sender approach which does tag with the sender id (but encrypted with another key)

sealed sender is kind of a hack for the server model and what i analogized to onion routing (onion packing?) works both ways, no new crypto

@emenel sorry the reason i mention all this is because double ratchet message chains having forward secrecy is something the pgp approach needs and we can do safely (not quickly or easily, but it's mature enough now imo)

there is another goal after that

@emenel and it ends up being a whole thing, but basically anonymity is possible and not difficult to codify and cryptographers don't wanna do it, because academic crypto is boring and also captured

and the work i can find on tor and i2p does not have any theory of this--the i2p paper i could find was terrible and i doubt it was representative.

but anonymity and ddos resistance are.....not in conflict

@emenel one trick is constant bit rate to all peers at all times. that's not easy but it makes timing by bandwidth not a thing. haven't seen people mention this? tor does not do it?
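a sketch of the constant-rate idea, under assumed framing (the 1-byte real/padding tag and the frame size are both made up): the sender emits exactly one fixed-size frame per tick whether or not it has anything to say:

```rust
use std::collections::VecDeque;

/// Hypothetical fixed wire frame size.
const FRAME_SIZE: usize = 512;

/// Emit the next frame for one peer: a queued payload if there is one,
/// otherwise pure padding. Every frame is exactly FRAME_SIZE bytes, so
/// the bandwidth an observer sees carries no information about traffic.
fn next_frame(queue: &mut VecDeque<Vec<u8>>) -> Vec<u8> {
    let payload = queue.pop_front().unwrap_or_default();
    assert!(payload.len() < FRAME_SIZE); // leave room for the tag byte
    let tag = if payload.is_empty() { 0u8 } else { 1u8 }; // 1 = real, 0 = padding
    let mut frame = vec![tag];
    frame.extend_from_slice(&payload);
    frame.resize(FRAME_SIZE, 0); // pad out to the fixed size
    frame
}
```

in a real system the tag and payload would sit inside the encrypted envelope, so relays can't tell padding frames from real ones either.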
@emenel the other part is just.....tor and i2p are like "everyone is equally trusted to start" like that's a normal thing to assume. no?
@emenel this is why taking multiple paths for messaging increases deanonymization, because you send to completely random people, and the state has more compute than you
@emenel one interesting point about double ratchet creating a "session" is that it also provides a natural point to negotiate your bandwidth requirements and type of data together. which is a useful and important and nontrivial thing to pin down when there are multiple sessions between nodes! and now we're really grooving
@emenel the finishing touch comes from louis pouzin's catenet https://en.wikipedia.org/wiki/Internetworking#Catenet

@emenel which is: you can maintain these negotiated sessions with specific peers through completely ephemeral identities, and define a source routing protocol through them. this is bootstrappable, so you can discover peers from B->C, C->D, further--although at this point it becomes significant to distinguish:

  • node identity
  • user identity
  • key identity

maybe more or less. but this is not yet quite enough
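one way to make that distinction unmissable is a separate type per identity kind; the names and shapes below are mine, not from any existing codebase:

```rust
/// A network position; can be completely ephemeral.
#[derive(Clone, PartialEq, Eq, Debug)]
pub struct NodeId(pub String);

/// The human or agent behind some set of nodes.
#[derive(Clone, Debug)]
pub struct UserId(pub String);

/// A fingerprint of one public key; a user or node may hold many.
#[derive(Clone, Debug)]
pub struct KeyId(pub [u8; 32]);

/// A source route: the sender fixes the whole hop sequence up front,
/// in node identities only -- no user or key identity is on the wire.
pub struct SourceRoute {
    pub hops: Vec<NodeId>,
}

impl SourceRoute {
    /// Bootstrap: extend a known route ending at C with a hop C->D
    /// that C told us about.
    pub fn extend(mut self, next: NodeId) -> SourceRoute {
        self.hops.push(next);
        self
    }
}
```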

@emenel because the global passive adversary [dramatic piano] can add latency, they can interrupt you, they can take over nodes silently, so much more

@emenel but with source routing you can define this onion-linked chain where everyone says yeah i went to the key signing party and nobody knew you, because "you" are an abstraction, and you can ask them for updates on the status of your datagram.

btw: datagram = fixed-size as pouzin defined it. vint cerf intentionally took his wordings and twisted them to mean things that don't make sense

@emenel you can chunk a larger message into hunks, send them over multiple paths, and ~omg wow this is the most obvious thing in the world why is nobody trying to build it~ you don't lose anonymity because attempts to fuck with network flow to gain info are visible to users
@emenel this is "obvious" because it's fucking ridiculous that any network doesn't do this
@emenel pouzin has this great paper with this phrasing "monstrous logic" i love
corporeal/literature/pouzin1976-datagrams-virtual-circuits.pdf at main


@emenel so like i mentioned there like vaguely gesturing to tons of cryptographic handshakes or some shit. i have a fuller sketch of some of them. but this is pouzin, in 1976, describing basically the above, without reference to anything except node relationships

Indeed, neither VC's nor DG's can carry within a single packet messages longer than the maximum length of the data field (this is a tautology). Therefore, oversized messages are fragmented into pieces the size of a data field and sent as separate DG's. At the destination, DG's are reassembled into a copy of the original message. Duplicates, if any, are discarded; missing DG's will be retransmitted if acknowledgment conventions have been established with the sender.
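the fragmentation/reassembly loop pouzin describes can be sketched directly (the data-field size and sequence-number framing below are toy choices, not his):

```rust
use std::collections::BTreeMap;

/// Toy maximum data-field length per datagram.
const DATA_FIELD: usize = 8;

/// Fragment an oversized message into (sequence number, chunk) datagrams.
fn fragment(msg: &[u8]) -> Vec<(usize, Vec<u8>)> {
    msg.chunks(DATA_FIELD)
        .enumerate()
        .map(|(seq, chunk)| (seq, chunk.to_vec()))
        .collect()
}

/// Reassemble at the destination: duplicates are discarded, and if any
/// datagram is missing the caller gets None (and would ask for retransmit).
fn reassemble(dgs: impl IntoIterator<Item = (usize, Vec<u8>)>, total: usize) -> Option<Vec<u8>> {
    let mut seen = BTreeMap::new();
    for (seq, data) in dgs {
        seen.entry(seq).or_insert(data); // first copy wins; duplicates dropped
    }
    if seen.len() != total {
        return None; // missing DG's
    }
    Some(seen.into_values().flatten().collect())
}
```

note that nothing here cares which path each datagram took, which is exactly why multipath sending composes with it.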

the ability to model the network explicitly and represent trust relationships is i think why i independently arrived at this methodology

@emenel i also found i reimplemented a large portion of GLR parsing by masaru tomita as a subproblem of my parallel parsing approach (separate project) i would say it's almost the same but the parallel perspective is more natural to me. i don't think there is exactly one correct answer ever but it's a good sign to have arrived at the same spot
@emenel so what that network would achieve would be anonymity in a sense that is quantifiable and directly meaningful to users
@emenel "quantifiable" to me is not "a number popped out". i mean "it resists this category of attack up to this boundary". which is.........an engineering guarantee. like a bridge
@emenel i don't think "uptime" is a meaningful metric btw. not in the terms i think of here. "uptime" is not defined in terms that support inference
@emenel you can imagine e.g. a uniform distribution of laser pulses, 10% of which are reliably observed to be deflected by cosmic rays. that's "uptime"
@emenel i don't think it's well-formed to imagine uptime works that way for like. a server. "nine nines" or however many nines that is.......great followup question to ask: "are we assuming a uniform distribution?"
@emenel the request-response process is not effectively instantaneous in most useful network calls, especially if they're variable-length, and if they invoke backend work. googlers are the worst at measuring anything. kinda sad when they post their nature papers like aw that's cute this is enrichment for people who have never written a document for someone to read
@emenel if each node in the paths chosen to ferry your datagrams performs a queueing process and has a ~standard way to respond "i have the datagram you assigned identifier X to" or "idk" or "give me more time" (i have a set of responses in my sketch), then you are not performing turing complete work per input
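a sketch of that fixed response vocabulary -- the variants are my reading of "i have it / idk / give me more time", not anyone's actual wire protocol:

```rust
use std::collections::HashMap;

type DatagramId = u64;

/// The complete set of answers a relay can give about a datagram.
/// Answering a status query is a table lookup, not Turing-complete work.
#[derive(PartialEq, Debug)]
enum Status {
    Have,    // "i have the datagram you assigned identifier X to"
    Pending, // "give me more time"
    Unknown, // "idk"
}

struct RelayNode {
    /// id -> true once the datagram is fully received and queued.
    datagrams: HashMap<DatagramId, bool>,
}

impl RelayNode {
    fn query(&self, id: DatagramId) -> Status {
        match self.datagrams.get(&id) {
            Some(true) => Status::Have,
            Some(false) => Status::Pending,
            None => Status::Unknown,
        }
    }
}
```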
@emenel and actually you still have multiple other ratios that matter more directly. the ratio of datagrams dropped across a node for one (which, btw, may not be problematic if the node indicates an expected drop ratio)
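and that ratio is cheap to track per node; "declared" below is the drop rate the node itself advertised (a made-up parameter for the sketch):

```rust
/// Per-node accounting of datagrams handed over vs. datagrams lost.
struct DropStats {
    seen: u64,
    dropped: u64,
}

impl DropStats {
    fn ratio(&self) -> f64 {
        if self.seen == 0 { 0.0 } else { self.dropped as f64 / self.seen as f64 }
    }

    /// A nonzero drop ratio is only suspicious when it exceeds the
    /// ratio the node itself declared up front.
    fn exceeds_declared(&self, declared: f64) -> bool {
        self.ratio() > declared
    }
}
```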
@emenel "drop ratio" is a hilarious concept but this is where it does kinda matter to define "information theoretic" (in the sense of: in terms of indistinguishability) results that link back to whatever becomes visible to [piano sound] the global passive adversary

@emenel basically like. being evil does reduce the adversary's ability to do metacognition. but unfortunately we have to assume there is an evil NSA team that does all this and works very hard so people can't do this

but also cryptography works no matter how smart someone is. and also intelligence is fake

but defining these categories of anonymity attacks will take a while. i have a kind of hope that it's "basically cryptographic measures" (e.g. timing attacks, where there is some practical precedent for e.g. not leaking identity key), and the identity of the message and the sender is just.....part of the ciphertext

@emenel that is as far as i'm aware how you bootstrap anonymity without any cryptography. oh! and last thing on this point
@emenel this is a google cryptographer who is very annoying and has bad opinions but he coauthored with isis lovecruft and his crypto is good https://words.filippo.io/age-authentication/#authentication-vs-signing oooh wait did he update this page

@emenel wait now that i learned about tree hashing and blake* i understand more of this post @haskal

@emenel @haskal i still think he sounds like me when i don't have a concrete use case for this part though

Another difference is that while authentication can happen at the key exchange level, and the derived shared symmetric key can be used with STREAM as age does, signatures need to be necessarily computed over the whole message. This sets us back on making the format seekable and streamable: either we make an expensive asymmetric signature for every chunk, or we get fancy with signed Merkle trees, which anyway get us a streamable format only either in the encryption or in the decryption direction. (Or, like discussed above, we just stick a signature at the end and release unverified plaintext at decryption time, causing countless vulnerabilities.)

particularly this part

This sets us back on making the format seekable and streamable

@emenel @haskal oh!! i figured it out lmao

he is literally trying to solve a made up problem the exact same way ziv and lempel took an optimal solution for known-length re-seekable files and said yeah what if we told ourselves we didn't know anything and in fact we refuse to know anything

@emenel @haskal lmao he's so funny

We made it a good UNIX tool, working on pipes

sir i do build tools that is literally THE problem i know of 3 individuals working on incl me. none of us have solved it we just like ponder it

One thing we decided is that we’d not include signing support. Signing introduces a whole dimension of complexity to the UX

hmmmm shit (1) he's right except (2) key management is an interesting framing and indicates his tool is doing too much in a different way.

ok here i wouldn't say "too much" necessarily. but like. "key management" is a really high-level task

and "no signatures" means "asymmetric crypto can't use half its special attacks"

i do worry that "only curve25519" (fuck djb) could introduce unexpected assumptions elsewhere that aren't tested. but modifying the type of key is not the way to test them. and it's actually pretty sick to have:

  • gen key (+entropy [effect])
  • calculate dh key agreement (two keypairs, but only unique per pair--basically for ephemeral only)
  • generate symmetric key w salt (+entropy [effect])

    • this is actually not so trivial. i wanna say signal uses aes-256-gcm i'll check rn
    • this is not quite a primitive then but it is something that can be encapsulated w double ratchet
  • soooooo what about cases that don't support a session-like context?
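the bullet list above can be specified as a trait, with entropy as an explicit argument rather than ambient state. the trait and the toy discrete-log impl below are mine (and the impl is NOT secure -- it exists only to check that agreement commutes):

```rust
/// The primitive surface from the list above. Entropy is always an
/// explicit input: an effect the caller provides.
trait Primitives {
    type Public;
    type Secret;
    type Shared;

    /// gen key (+entropy [effect])
    fn gen_key(&self, entropy: &[u8]) -> (Self::Public, Self::Secret);

    /// calculate dh key agreement (two keypairs; unique per pair)
    fn agree(&self, ours: &Self::Secret, theirs: &Self::Public) -> Self::Shared;

    /// generate symmetric key w/ salt (+entropy [effect])
    fn derive_symmetric(&self, shared: &Self::Shared, salt: &[u8]) -> [u8; 32];
}

/// Toy discrete-log group over a 31-bit prime. NOT secure.
struct ToyDh;

const P: u64 = 2_147_483_647; // 2^31 - 1
const G: u64 = 7;

fn modpow(mut b: u64, mut e: u64, m: u64) -> u64 {
    let mut r = 1;
    b %= m;
    while e > 0 {
        if e & 1 == 1 { r = r * b % m; }
        b = b * b % m;
        e >>= 1;
    }
    r
}

impl Primitives for ToyDh {
    type Public = u64;
    type Secret = u64;
    type Shared = u64;

    fn gen_key(&self, entropy: &[u8]) -> (u64, u64) {
        let secret = entropy
            .iter()
            .fold(1u64, |acc, &b| acc.wrapping_mul(31).wrapping_add(b as u64))
            % (P - 2) + 1;
        (modpow(G, secret, P), secret)
    }

    fn agree(&self, ours: &u64, theirs: &u64) -> u64 {
        modpow(*theirs, *ours, P)
    }

    fn derive_symmetric(&self, shared: &u64, salt: &[u8]) -> [u8; 32] {
        // stand-in for a real KDF (e.g. HKDF): mix shared bytes with salt
        let mut out = [0u8; 32];
        let bytes = shared.to_le_bytes();
        for i in 0..32 {
            out[i] = bytes[i % 8] ^ salt.get(i % salt.len().max(1)).copied().unwrap_or(0);
        }
        out
    }
}
```

keeping the trait abstract over `Public`/`Secret`/`Shared` is the "don't tie it to a concrete serialization" idea from earlier: a curve25519 impl would slot in without changing callers.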

shit they have cbc ctr gcm.......if they don't describe at length which is used where................that's unfortunate

the docstring for the single struct in aes_ctr.rs:

/// A wrapper around [`ctr::Ctr32BE`] that uses a smaller nonce and supports an initial counter.
pub struct Aes256Ctr32(ctr::Ctr32BE<Aes256>);

yes, i can see that

a damn shame

literally what

#[derive(displaydoc::Display, thiserror::Error, Debug)]
pub enum Error {
    /// "unknown {0} algorithm {1}"
    UnknownAlgorithm(&'static str, String),
    /// invalid key size
    InvalidKeySize,
    /// invalid nonce size
    InvalidNonceSize,
    /// invalid input size
    InvalidInputSize,
    /// invalid authentication tag
    InvalidTag,
}

this is not the appropriate use of displaydoc fuckboys

they're not even using thiserror. just impl error::Error. how is this real

(1) completely unrelated to the SPQR fuckboys
(2) 2021????
(3) fuckboy #2 "adding support for username links"
https://github.com/signalapp/libsignal/commit/e50bec648fed7d6f87648c2c7937a9eeda3841b3

COMPLETELY half-assed

impl fmt::Display for Error {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        match self {
            Error::UnknownAlgorithm(typ, named) => write!(f, "unknown {} algorithm {}", typ, named),
            Error::InvalidKeySize => write!(f, "invalid key size"),
            Error::InvalidNonceSize => write!(f, "invalid nonce size"),
            Error::InvalidInputSize => write!(f, "invalid input size"),
            Error::InvalidTag => write!(f, "invalid authentication tag"),
            Error::InvalidState => write!(f, "invalid object state"),
        }
    }
}

this is actually much better for a !!!!cryptographic!!!!! error!!!!!

completely did not change the messages, or cases, just fucking removed Clone/Eq/PartialEq which sure that's not a correctness issue but why? why?

#[derive(Debug, displaydoc::Display, thiserror::Error)]
pub enum DecryptionError {
    /// The key or IV is the wrong length.
    BadKeyOrIv,
    /// These cases should not be distinguished; message corruption can cause either problem.
    BadCiphertext(&'static str),
}

brb distinguishing your cases

bro says i know. i know what to do

signal-crypto = { path = "../crypto" }

our problem? too much crypto.......not enough signal crypto

same code

i would not accept this at all for any professional work

i would have given my undergrad students maybe a B if it passes all the tests and i gave them the context for them to solve

if it was a junior eng i would totally req to pair and it would be cool as hell and i would learn what kinds of criteria they were familiar with / assuming judged upon

how do you add new protobufs in the same fucking commit
ok so at least main doesn't duplicate deps and uses cargo's (intentionally-broken) "workspace" feature