claude'ing away a coding agent security benchmark

what would you like to know?