I'm no AI expert, so I found this paper very interesting. Long story short: emojis for the win.

"Character injection techniques demonstrated a high degree of effectiveness in evading detection. The most successful attack was Emoji Smuggling, which achieved a 100% ASR for both prompt injections and jailbreaks"

Bypassing Prompt Injection and Jailbreak Detection in LLM Guardrails
https://arxiv.org/html/2504.11168v2

#Ai #AiResearch #security
