I am not going to lie, I don't really follow #AI model jailbreaking discussions very closely, but the use of multiple languages and word order as a way to bypass the guardrails was new to me.
Maybe we should call this a Babel-Attack.