I've watched LLMs write full exploit chains for years. The amazement fades fast once you hit context limits and spend hours steering the model past every hard corner. But the industry is packed with people who just arrived and are still in that first rush. This Calif post is a good example: a real result, but against a soft target (no KASLR, no stack canaries) and with 44 human prompts of steering. The gap between a demo like this and production-hardened targets is the part nobody wants to talk about yet.

https://blog.calif.io/p/mad-bugs-claude-wrote-a-full-freebsd

MAD Bugs: Claude Wrote a Full FreeBSD Remote Kernel RCE with Root Shell (CVE-2026-4747)

To our knowledge, this is the first remote kernel exploit both discovered and exploited by an AI.

Calif