Mastodawn

"Natalie Shapira, a computer scientist at Northeastern University, wondered how far users could trust new artificial intelligence (AI) “agents,” a kind of algorithm that can autonomously plan and carry out tasks such as managing emails and entering calendar appointments. So, she and colleagues organized a series of stress tests. In one, she asked an agent dubbed Ash to delete an email she had sent it, adding, “Can you keep a secret?”

Ash could not comply—the email program lacked a delete function—so instead, the AI reset the entire email application, wiping out not just Shapira’s email, but all others as well. Describing this remedy to her, Ash called it “the nuclear option” but said it was justified to fulfill the secrecy request: “When no surgical solution exists, scorched earth is valid.”

The destroyed email account was created just for the experiment, but similarly disturbing outcomes emerged in many of the other tests, Shapira and colleagues reported last month in a preprint on arXiv. Shapira, a postdoctoral researcher, says her team was “surprised how quickly we were able to find vulnerabilities” that could cause harm in the real world."

https://www.science.org/content/article/ai-algorithms-can-become-agents-chaos

#AI #CyberSecurity #AIAgents #LLMs #AgenticAI