Inspired by Arditi et al. (NeurIPS 2024) on the “refusal direction” in LLMs, I tested an abliteration attack using the Heretic tool in my home lab. The experiment raises interesting questions about AI guardrail robustness.
https://www.linkedin.com/pulse/i-deleted-ais-moral-compass-20-minutes-home-lab-your-red-yann-allain-zbzte/ (Sorry for the LinkedIn link; I haven't had time to write this up on a proper blog yet.)





