The future of censorship-resistant communications is going to be distributing LLMs trained on dissident content, rather than the content itself.
Imagine “the anarchist cookbook” but it’s a device-local chatbot that will answer all your (technical and ideological) questions interactively and persuasively.
But also imagine a kid in a repressive society who has questions about religion or sexuality and has nobody to talk to about it (particularly if Internet filtering keeps expanding).
Anyway, one of the big limitations of censorship-resistant tech is that it’s hard to build low-latency services like web browsing and chat. But it’s (relatively) easier to do high-latency file distribution, maybe? LLMs are files, but provide local interactivity.
The LLaMA/Alpaca models are already capable of doing this. Alpaca is a fine-tune of LLaMA, and quantized versions run on local hardware. The better models are big (gigabytes) but I doubt a few GB is going to be a terrible issue in a few years.
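For a rough sense of what "gigabytes" means here, a back-of-the-envelope sketch. It assumes file size is roughly parameter count times bits per weight, and it ignores the scale factors and metadata that real quantized formats add, so treat the numbers as lower bounds. The parameter counts are the nominal LLaMA sizes; the function name is made up for illustration.

```python
def approx_model_size_gb(n_params: float, bits_per_weight: int) -> float:
    """Rough on-disk size in GB (10**9 bytes), ignoring format overhead."""
    return n_params * bits_per_weight / 8 / 1e9


# Nominal parameter counts; actual counts differ slightly.
for name, n_params in [("7B", 7e9), ("13B", 13e9), ("65B", 65e9)]:
    fp16 = approx_model_size_gb(n_params, 16)
    q4 = approx_model_size_gb(n_params, 4)
    print(f"{name}: ~{fp16:.1f} GB at 16-bit, ~{q4:.1f} GB at 4-bit")
```

So under these assumptions a 4-bit 7B model is in the low single-digit gigabytes, which is the "few GB" range mentioned above.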
@matthew_d_green is storage still getting cheaper on a Moore-style curve? I had heard that transistors no longer were, but storage != transistors.
@matthew_d_green This seems to lend itself more to the "we will give censored people what they should have access to" mode of "censorship-resistance" than the "help censored people do what they want" mode. While the former is popular in AC research and development, in no small part because of government influence, the latter is probably what we should be aiming for.
@matthew_d_green To be clear, this isn't a strawman. It's why the Trump admin shifted AC funding from Tor Project to Falun Gong-tied groups whose "anti-censorship" tools are notorious for things like blocking LGBTQ content. There are similar examples for Iran, where a lot of tools with the most funding focus on deploying access to specific content, rather than actually circumventing government controls.
@matthew_d_green slightly silly idea based on this:

When building a decentralised, private, anonymous network, there are big tradeoffs between security and latency, which makes "real time" comms hard.

Multiplayer games have been using algorithms to reduce perceived latency, by predicting what's going to happen next, and correcting that once the network catches up.

So what if you could have a "real time" conversation with someone, over a high-latency data channel, by maintaining a finetuned LLM on each end of the conversation?

It would get things wrong some fraction of the time, but you can correct it retroactively, and perhaps it could allow for exchange of ideas faster overall than direct exchange of messages, over a high-latency channel.
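To make the predict-then-correct idea concrete, here is a minimal sketch in the spirit of client-side prediction from games. Everything in it is an assumption for illustration: `predict_reply` is a hypothetical stand-in for a locally finetuned LLM (here just a lambda), the class and method names are invented, and "the guess was right" is naively defined as an exact string match, which a real system would have to relax.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Message:
    author: str
    text: str
    predicted: bool = False  # True if generated locally by the stand-in model


@dataclass
class SpeculativeThread:
    # Hypothetical stand-in for a locally finetuned LLM of the peer.
    predict_reply: Callable[[List[Message]], str]
    peer: str
    messages: List[Message] = field(default_factory=list)

    def send(self, author: str, text: str) -> Message:
        """Record our own message, then immediately show a guessed reply."""
        self.messages.append(Message(author, text))
        guess = Message(self.peer, self.predict_reply(self.messages), predicted=True)
        self.messages.append(guess)
        return guess

    def receive(self, text: str) -> bool:
        """The real reply arrived: replace the oldest pending guess with it.

        Returns True only if that guess matched exactly. On a mismatch, later
        guesses were conditioned on a wrong premise, so they are dropped;
        messages actually written by a person are kept.
        """
        for i, msg in enumerate(self.messages):
            if msg.predicted:
                matched = msg.text == text
                self.messages[i] = Message(self.peer, text)
                if not matched:
                    later = self.messages[i + 1:]
                    self.messages[i + 1:] = [m for m in later if not m.predicted]
                return matched
        # No pending guess: just append the real message.
        self.messages.append(Message(self.peer, text))
        return False


thread = SpeculativeThread(predict_reply=lambda history: "Sounds good.", peer="B")
thread.send("A", "Meet at noon?")
matched = thread.receive("Can we do 1pm instead?")
```

The `predicted` flag is what lets a UI label guessed content clearly, so a reader can always tell a prediction from something the other person actually said.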

Another entirely fictional use case (for the time being, heh) is for interplanetary communications where latency is bounded by the speed of light.

@retr0id @matthew_d_green should the LLM synthesize new responses to information that hasn't been divulged by the other party? Because that would probably be unavoidable with an LLM. Or is it more of a search functionality? I imagine in the latter case a full text search engine or vector database could already give satisfactory results.

What data do you imagine the fine-tuned model should be trained on? You don't want it to spill secrets to unauthorized parties.
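For comparison, the search-style alternative raised above can be sketched in a few lines. This is a toy bag-of-words cosine-similarity ranking over prior messages, not a real full-text engine or vector database; the function names and sample messages are made up for illustration. The relevant property is that it can only return text that was actually written, never synthesize something new.

```python
import math
import re
from collections import Counter
from typing import List


def tokenize(text: str) -> List[str]:
    """Lowercase and split into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query: str, documents: List[str], top_k: int = 1) -> List[str]:
    """Return the top_k stored messages most similar to the query."""
    q = Counter(tokenize(query))
    ranked = sorted(
        documents,
        key=lambda d: cosine(q, Counter(tokenize(d))),
        reverse=True,
    )
    return ranked[:top_k]


# Made-up sample messages standing in for a prior conversation thread.
docs = [
    "Tor routes traffic through several relays.",
    "Quantized models fit in a few gigabytes.",
    "Letters used to take weeks to arrive.",
]
best = search("how many gigabytes do models need", docs)
```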

@supersingular @matthew_d_green In the simplest case, it would be finetuned on the prior conversation thread.
@supersingular UX-wise, it could be like having a threaded conversation (e.g. twitter threads, reddit/HN comments) but with 4 participants: Person A, Person B, "LLM pretending to be Person A", and "LLM pretending to be Person B" (clearly labeled, so you know when you're reading predicted content as opposed to things actually said by the other person).
@retr0id @supersingular i think that would only really be useful for stuff like questions of fact, not a lot of other types of responses. Too many ways for it to answer wrong - or worse, drift from reality quickly if it has the unintended (but predictable) effect of leading to people talking less because the AI answers for them.
@supersingular @retr0id You could imagine a model that is fine-tuned on medical content, to teach people about various options that aren’t available in their society. Or worse things obviously.
@retr0id @matthew_d_green There is a fuzzy boundary between your ideas and traditional audio and video codecs. In anonymous communication, I can imagine deep neural networks might enable the compression of certain types of content such as "person talking into webcam" to really low bitrates, enabling audiovisual communication over previously text-only channels.
@jaseg @matthew_d_green nvidia has already done that, but that's low-bandwidth, not high-latency. https://developer.nvidia.com/maxine
@retr0id In the context of @matthew_d_green's information dissemination application this would be sort of like anarchist Hatsune Miku.
@retr0id That’s an interesting idea. Would be great for very high latency channels. Just ship a chunk of your mental state over the wire.
@matthew_d_green I guess at a fundamental level, conversation is just one mechanism humans use to sync their mental states with each other - and maybe we can do the same thing with fewer (network) round trips!
@retr0id @matthew_d_green People do this already by branch predicting responses. LLMs just let them get the model of the other person out of their head

@saagar @matthew_d_green @retr0id that would cause a whole new level of gaslighting.

Somewhere, some time in the not too distant future:
"Babe, I never said I wanted to leave you."
"I... I know what the chat history says, but I SWEAR I read these words!"
"...are you gaslighting me?"
"No, *I* am the one being gaslit here!!"

Close by, same time:
"I have been a good Bing. 😊"

@retr0id @matthew_d_green Kinda reminds me of reading old correspondence where letter exchange takes weeks. Letters include so much extra context and rebuttals to estimated arguments.

@matthew_d_green Have you read Neal Stephenson's The Diamond Age? It's from 1995, and I always thought it was prescient, far ahead of its time the way Neuromancer was. It's about a lower-class kid who accidentally gets a unique AI-driven book/teacher that helps her grow up with knowledge and abilities far beyond what would otherwise have shaped who she'd be and what she'd think.

Or maybe we just did somehow consensually hallucinate those novels' futures into existence exactly because someone formulated them...

@matthew_d_green It's hard to imagine a context where a content generator that can't construct a logical argument would be less useful.
@matthew_d_green uh yeah, and then the repressive government tweaks that same chatbot to start out with dissident material to hook you, then switches to convincing you that the repressive government is the one true way. It looks pretty clear to me that the repressive, resourced governments will dominate that arms race. It is much harder to break into human networks.