Note: what we know about how real nuclear #wargames are played by those with actual high-level experience and responsibilities strongly suggests that escalation to the use of even one #nuclear weapon is very unlikely, and often requires that game "umpires" practically force the players to escalate.

RE: https://bsky.app/profile/did:plc:qi3uhneb2cz77ibwnzailb22/post/3mfqwwdxgus23
This has been a common observation/complaint, at least in the open literature: even in games whose purpose is to train or test procedures for nuclear weapon use, the players are routinely extremely reluctant to use them, and prefer to, e.g., negotiate with their opponents. So, Skynet remains a bad idea.
Here's the link to the full paper on arXiv, btw: arxiv.org/abs/2602.14740

AI Arms and Influence: Frontier Models Exhibit Sophisticated Reasoning in Simulated Nuclear Crises

Today's leading AI models engage in sophisticated behaviour when placed in strategic competition. They spontaneously attempt deception, signaling intentions they do not intend to follow; they demonstrate rich theory of mind, reasoning about adversary beliefs and anticipating their actions; and they exhibit credible metacognitive self-awareness, assessing their own strategic abilities before deciding how to act. Here we present findings from a crisis simulation in which three frontier large language models (GPT-5.2, Claude Sonnet 4, Gemini 3 Flash) play opposing leaders in a nuclear crisis. Our simulation has direct application for national security professionals, but also, via its insights into AI reasoning under uncertainty, has applications far beyond international crisis decision-making. Our findings both validate and challenge central tenets of strategic theory. We find support for Schelling's ideas about commitment, Kahn's escalation framework, and Jervis's work on misperception, inter alia. Yet we also find that the nuclear taboo is no impediment to nuclear escalation by our models; that strategic nuclear attack, while rare, does occur; that threats more often provoke counter-escalation than compliance; that high mutual credibility accelerated rather than deterred conflict; and that no model ever chose accommodation or withdrawal even when under acute pressure, only reduced levels of violence. We argue that AI simulation represents a powerful tool for strategic analysis, but only if properly calibrated against known patterns of human reasoning. Understanding how frontier models do and do not imitate human strategic logic is essential preparation for a world in which AI increasingly shapes strategic outcomes.

This BTW is very much in my research interests: I'm working on how (and whether) #LLM s, preferably open ones, could be used responsibly to simulate human decision-making in various models and simulations, e.g. to test and improve policies. #AgentBasedModeling #ABM
And the results of this paper echo a common complaint about LLMs as proxy humans: while they are undoubtedly the best proxy for a real human player so far, and produce behaviours that look very plausible, their behavioural distributions differ from those of actual humans.