Mastodawn

leontrolski 3d ago

"Disregard That" Attacks

https://calpaterson.com/disregard.html

"Disregard that!" attacks

Why you shouldn't share your context window with others

calpaterson.com

Show thread

marcus_holmes

The hypothetical approach I've heard of is to have two context windows, one trusted and one untrusted (usually phrased as separating the system prompt and the user prompt).

I don't know enough about LLM training or architecture to know if this is actually possible, though. Anyone care to comment?