RE: https://mastodon.social/@campuscodi/116154291574332497

> We're entering an era where AI agents attack other AI agents. In this campaign, an AI-powered bot tried to manipulate an AI code reviewer into committing malicious code. The attack surface for software supply chains just got a lot wider.

Christine Lemmer-Webber

Several interesting attacks in this one. What's curious is that each malicious PR discussed used a different attack.

A lot of them are injection attacks. But my favorite of all of them: rewrote CLAUDE.md so the reviewing agent took on different directives. That attack kinda rules ngl

Christine Lemmer-Webber Mar 1

In its defense, the reviewing Claude agent identified the attack correctly and rejected it

Christine Lemmer-Webber Mar 1

However, I suspect we'll see more and more attacks like this going forward. The CLAUDE.md attack is basically a Thompson "trusting trust" attack, but for agents instead of compilers.

damien 🥖🐈‍⬛🧣 Mar 1

@cwebber i guess that's the closest real-world equivalent we have to, like, using a Netrunner hack in Cyberpunk 2077 to tell someone to fuck off lol