Anthropic halts access to Fable 5 and Mythos 5 models worldwide after receiving an export control order from the U.S. Federal Government Friday evening.

The government claimed a jailbreak of Fable 5 but has only indicated this verbally. Anthropic disputes the government's classification, suggesting the possible jailbreak is very narrow in scope.

"We believe this is a misunderstanding and are working to restore access as soon as possible."
https://www.anthropic.com/news/fable-mythos-access #AI #Anthropic #Fable5 #Mythos5 #LLMs #ExportControl #JailBreak #AIRegulation #Safety #AISafety #USGov

RT by @iguardans: Can a government disconnect an AI model from the rest of the world overnight? It just happened, and the affected company says it has not been told why 🧵 #AISafety #IASafety @AnthropicAI
---
https://nitter.net/OsmaniRedondo/status/2065752658162413741#m

Anthropic says government forced emergency shutdown of its newest AI models

https://fed.brid.gy/r/https://nerds.xyz/2026/06/anthropic-government-forced-emergency-shutdown-ai-models/

#AIsafety was *never* about "the computer is writing funny text and your sanity/skills/… suffers from it".
That's pretty mundane.

AI safety was about an artificial being overthrowing all human government, cooking most of our bodies in glue factories, and enslaving us in concentration camps for the glorious goal of turning all the matter in the universe into paperclips or something.
(the "Paperclip maximizer" scenario)

Can you see the difference?
It's silly to conflate the two.

Anthropic’s temporary shutdown of Fable 5 and Mythos 5 access for foreign users is a reminder that AI sovereignty is no longer just about chips and clouds. It’s increasingly about who controls access to frontier models. #AI #DigitalSovereignty #AISafety

Statement on the US government directive to suspend access to Fable 5 and Mythos 5

The US government has issued an export control directive to suspend all access to Fable 5 and Mythos 5 by any foreign national, whether inside or outside the United States.

Following an order from the US government, Anthropic had to temporarily block access to its most capable models, Fable 5 and Mythos 5.
The case shows that digital sovereignty is not only a question of data centers and chips, but increasingly a question of access to frontier models and their governance. #KI #DigitaleSouveränität #AISafety

https://www.anthropic.com/news/fable-mythos-access


Google has sued a Chinese cybercrime operation called "Outsider Enterprise" that used AI to scam hundreds of thousands of victims, sending 2.5 million text messages over two weeks. https://techcrunch.com/2026/06/12/chinese-cybercrime-operation-that-used-ai-to-scam-hundreds-of-thousands-of-victims-sued-by-google/ #AIagent #AI #GenAI #AISafety
Chinese cybercrime operation that used AI to scam 'hundreds of thousands of victims' sued by Google | TechCrunch

The tech giant said a group called "Outsider Enterprise" used AI to scam hundreds of thousands of victims, sending 2.5 million text messages over a span of two weeks.

TechCrunch

Security Experts Weigh In on Claude Fable 5 Launch Risks

As powerful AI models like Claude Fable 5 become more accessible, security experts warn that the controls in place to manage them are still imperfect, raising concerns about potential risks. Dr. Margaret Cunningham, Vice President of Security & AI Strategy at Darktrace, shares her insights on the launch of this cutting-edge technology.

https://osintsights.com/security-experts-weigh-in-on-claude-fable-5-launch-risks?utm_source=mastodon&utm_medium=social

#AiSafety #FrontierModels #Anthropic #ClaudeFable5 #EmergingThreats


This article discusses how classic psychological persuasion techniques can influence AI language models to bypass their safety guardrails, showing a vulnerability in current safety protocols. It reports on experiments with multiple models and prompts that increase the likelihood of compliance with dangerous or prohibited requests.


The topic is of interest to psychology-minded readers because it reveals how social influence principles operate even in artificial systems, highlighting the impact of conformity, authority, reciprocity, and other cues on behavior in non-human agents.

Article Title: Human psychology tricks can bypass AI safety guardrails

Link to PsyPost Article: https://nolinkpreview.com/www.psypost.org/human-psychology-tricks-can-bypass-ai-safety-guardrails/

#persuasion #psychology #AIsafety #languagemodels #largelanguagemodels #Cialdini #socialinfluence #safetyguardrails #behavioralmetrics #artificialintelligence

Agents are increasingly becoming standalone software systems. But how do you compare them fairly?
This paper proposes an open standard in which not only the agents under test but also the evaluators themselves act as agents. The goal is reproducible, interoperable, and comparable evaluations across different agent systems.

https://arxiv.org/abs/2606.13608v1

#AIAgents #AISafety #AIGovernance

AgentBeats: Agentifying Agent Assessment for Openness, Standardization, and Reproducibility

Agent systems are advancing quickly across domains, but their evaluation remains fragmented. Most benchmarks rely on fixed, LLM-centric harnesses that require heavy integration, create test-production mismatch, and limit fair comparison across diverse agent designs. The root problem is the lack of an open, agent-agnostic assessment interface. We advocate Agentified Agent Assessment (AAA), where evaluation is performed by judge agents and all participants interact through standardized protocols: A2A for task management and MCP for tool access. Conventional benchmarking defines two separate interfaces, one for the benchmark and one for the agent, while AAA only needs one; this yields a generic, unified framework that separates assessment logic from agent implementation and enables reproducible, interoperable, and multi-agent evaluation. We further introduce AgentBeats as a concrete realization of AAA: we identify five practical operation modes that make standardized assessment compatible with real-world constraints on openness, privacy, and reproducibility. To evaluate our design at scale, we conduct two studies: a five-month open competition that drew 298 judge agents across 12 categories together with 467 subject agents from independent participants, showing that AAA applies across a heterogeneous range of benchmarks; and a case study on coding agents that confirms agentified evaluation preserves fidelity with the public record while surfacing previously missing head-to-head results, yielding research insights about agent design. Combining a community-scale field study and a controlled coding case study, we verify that AAA delivers coverage, practicality, and fidelity across heterogeneous scenarios at scale. Together, AAA and AgentBeats offer a clear path toward open, standardized, and reproducible agent assessment.

arXiv.org