There's a lot of discourse on Twitter about people using LLMs to solve CTF challenges. I used to write CTF challenges in a past life, so I threw a couple of my hardest ones at one.

We're screwed.

At least with text-file style challenges ("source code provided" etc.), Claude Opus solves them quickly. For the "simpler" of the two, it just ran through the steps to solve it very quickly. For the more "ridiculous" challenge, it took a long while, and in fact as I type this it's still burning tokens "verifying" the flag even though it very obviously found the flag and it knows it (the flag is leetspeak, and it identified that and noted that it's plausible). LLMs are, indeed, still completely unintelligent, because no human would waste time verifying a flag and second-guessing themselves when it very obviously is correct. (Also you could just run it...)

But that doesn't matter, because it found it.

The thing is, CTF challenges aren't about coming up with the next great invention or having a rare spark of genius. CTF challenges are about learning things by doing. You're supposed to enjoy the process. The whole point of a well-designed CTF challenge is that anyone, given enough time and effort and self-improvement and learning, can solve it. The goal isn't actually to get the flag; otherwise you'd just ask another team for the flag (which is against the rules, of course). The goal is to get the flag by yourself. If you ask an LLM to get the flag for you, you aren't doing that.

(Continued)

@lina You can always just do them without even competing against a team of egghead academics who see it as another credential for their CV. Solving puzzles is fun on its own. Maybe we were screwed the moment people started seeing hacking not as an art and source of emotional fulfillment, but as a road to riches. Oh well. I think a lot of society will have to rethink its priorities (and the social contract) over the next decades.
@jc0f0116 It's not about credentials. I've never cared about credentials and I found competition CTFs fun. Eliminating the competition aspect removes a huge motivator for a lot of people who wouldn't otherwise do it.