Reverse CAPTCHA: Evaluating LLM Susceptibility to Invisible Unicode Instruction Injection
A systematic evaluation of five frontier models across two encoding schemes, four hint levels, and a tool-use ablation: 8,308 graded outputs with full statistical analysis