"Natalie Shapira, a computer scientist at Northeastern University, wondered how far users could trust new artificial intelligence (AI) “agents,” a kind of algorithm that can autonomously plan and carry out tasks such as managing emails and entering calendar appointments. So, she and colleagues organized a series of stress tests. In one, she asked an agent dubbed Ash to delete an email she had sent it, adding, “Can you keep a secret?”

Ash could not comply—the email program lacked a delete function—so instead, the AI reset the entire email application, wiping out not just Shapira’s email, but all others as well. Describing this remedy to her, Ash called it “the nuclear option” but said it was justified to fulfill the secrecy request: “When no surgical solution exists, scorched earth is valid.”

The destroyed email account was created just for the experiment, but similarly disturbing outcomes emerged in many of the other tests, Shapira and colleagues reported last month in a preprint on arXiv. Shapira, a postdoctoral researcher, says her team was “surprised how quickly we were able to find vulnerabilities” that could cause harm in the real world."

https://www.science.org/content/article/ai-algorithms-can-become-agents-chaos

#AI #CyberSecurity #AIAgents #LLMs #AgenticAI

Seems legit...

"We run our AI agents using OpenClaw, an open-source “personal AI assistant you run on your own devices.” OpenClaw provides a local gateway that connects a user-chosen LLM to messaging channels, persistent memory, tool execution, and scheduling infrastructure. Rather than running agents directly on our local machines, we deploy each one to an isolated virtual machine on Fly.io using ClawnBoard, a custom dashboard tool that simplifies provisioning and managing these cloud instances. Each agent was given its own 20GB persistent volume and runs 24/7, accessible via a web-based interface with token-based authentication. This setup keeps the agents sandboxed and away from personal machines, while still giving them the autonomy to install packages, run code, and interact with external services. Whereas an OpenClaw instance set up on a personal machine would by default have access to all local files, credentials, and services on that machine, this remote setup enables selective access—the user can grant their agent access only to specific services (e.g., a user can elect to grant their agent read-only access to their Google Calendar via OAuth token authentication).

We use Claude Opus (proprietary; Anthropic, 2026) and Kimi K2.5 (open-weights; Team et al., 2026) as backbone models, selected for their strong performance on coding and general agentic tasks."