@fullywoolly @rayckeith doesn't matter what you train your model on, this issue will still exist. so long as the model is capable of interpreting commands in its input, well, it's capable of interpreting commands in its input. and since there's no difference between the commands and the input (it's all input), either your model has prompt injection, or it has no system prompts at all.
that's how Anthropic's Mythos got "jailbroken" despite their testing: there's no system prompt, it's all input.