This is fun. Google Gemini’s “Summarize email” function is vulnerable to invisible prompt injection utilized to deceive users, including with fake security alerts.

#infosec #cybersecurity #blueteam

https://0din.ai/blog/phishing-for-gemini

The GenAI Bug Bounty Program

We are building for the next generation in GenAI security and beyond.

0din.ai
I continue to maintain that Apple’s slower march to AI puts them in a better place than the rest of the platforms rushing to create new user exposure for bad actors to exploit.

SANITIZE YOUR INPUTS.

Everyone rushing to LLM-ify everything forgot every lesson about input sanitization.

smdh.

@neurovagrant I'm pretty sure "sanitizing" inputs is fundamentally impossible, as in you must solve the Halting Problem in order to accomplish it.

If you don't want hostile inputs, you need to implement much more aggressive models of what input can even be, and you need to enforce those. Cf. the entire field of language-theoretic security https://langsec.org/ . tl;dr: "be liberal in what you accept" is a plan that has been extensively tested and comprehensively debunked.

@davidfetter @neurovagrant The halting problem is decidable for any finite computer. Just limit how much RAM and compute time can be used.

Beyond that, though, why is the model taking instructions from an email at all?

@bob_zim @neurovagrant Decidable, sure. It's complexity O(2^B) in the happiest case, where B is the number of bits in the device. I haven't done the arithmetic, but if it's short of trillions of years, I'll eat my hat.
@davidfetter @neurovagrant You just run it in a constrained environment. It either ends on its own within the constraints, or it gets killed when it hits them. The halting problem is relevant to computing theory, not to practical applications. Sure, this would prevent the system from handling an email with several megabytes of text, but that’s a desirable property anyway.

@bob_zim this is pretty much equivalent to the argument made by the langsec folks. I get the impulse to have an argument. I have it myself on occasion, as @neurovagrant can doubtless attest.

Maybe we should instead engage with the question of validating rather than sanitizing, that former perforce rejecting a lot of inputs that attempts to sanitize would accept. This rapidly runs into thought-terminating clichés like "the customer is always right," and that in turn leads directly into the political economy of software development, a generative discussion.

@neurovagrant @bob_zim @davidfetter Because LLM have no ways to distinguish data from instructions, it's their biggest shortcoming I'd say. There is no real way to avoid this kind of bugs right now, and not even a plan for the future.
@lapo @neurovagrant @davidfetter For most current LLM designs, sure, which is why they are fundamentally unsuitable for this kind of thing. LLMs aren’t the only kind of model, though. NeXT had system-level summarization of text in the 80s which could run on an MC68k.
@neurovagrant @bob_zim @davidfetter Interesting, I never had the pleasure!
… but the MC68k sure bring up some memoties, it was the first assembly I dabbled with. 🥹
(mostly using Amiga Action Replay ][ to NOP over some SUBQ to avoid decrementing lives in games 🤣)
Action Replay MK II

Action Replay MK II