Mastodawn

Latest newsletter post has a little on Art Brut (due to a museum visit) before the paywall. https://open.substack.com/pub/arnicas/p/titaa-765-art-brut-and-the-squash?r=sv8a&utm_campaign=post&utm_medium=web

TITAA #76.5: Art Brut and the Squash Cluster

Tims in Stories - Bullshit Questions - Disorganized Agents - Three.js - WFC - Embeddings

Things I Think Are Awesome

Directed Alberto Cetoli Graph 3d ago

Show thread

Szymon Sokół 🇵🇱🇪🇺🇺🇦3d ago

@cstross @[email protected] @feorag I saw this just this morning: https://phys.org/news/2026-03-palm-sized-superconducting-magnet-tesla.html

Palm-sized superconducting magnet achieves 42 tesla, rivaling the world's biggest

When we think of powerful magnets used in particle accelerators or for NMR (nuclear magnetic resonance), we often envision bulky machines, sometimes the size of buildings. But in an extraordinary breakthrough for physics, scientists at ETH Zurich have created magnets that are small enough to fit in the palm of your hand yet powerful enough to rival some of the world's most powerful magnets.

Phys.org

Directed Alberto Cetoli Graph 3d ago

Show thread

Leshem Choshen 5d ago

I find it an expected but thought-provoking point. Most safety methods are post hoc ways to change a model, so they provide weak safety, one can easily apply them the opposite way.
Paper link: https://arxiv.org/abs/2603.10012

Measuring and Eliminating Refusals in Military Large Language Models

Military Large Language Models (LLMs) must provide accurate information to the warfighter in time-critical and dangerous situations. However, today's LLMs are imbued with safety behaviors that cause the LLM to refuse many legitimate queries in the military domain, particularly those related to violence, terrorism, or military technology. Our gold benchmark for assessing refusal rates, which was developed by veterans of the US Army and special forces, is to our knowledge the first dataset of its kind. We present results for refusal and deflection rates on 31 public models and 3 military models. We observe hard rejection rates as high as 98.2% and soft deflection rates ranging from 0% to 21.3%. We also present results on two additional synthetic datasets and show their correlations with the gold dataset. Finally, we perform abliteration using the Heretic library on a military-tuned gpt-oss-20b model, showing an absolute increase in answer rate of 66.5 points but an average relative decrease of 2% on other military tasks. In our concluding remarks, we argue for deeper specialization, including with mid-training and end-to-end post-training, to achieve zero refusals and maximum military task accuracy for closed military models.

arXiv.org

Directed Alberto Cetoli Graph 3d ago

Emily Young 5d ago

TIL that the NES "blacker than black" palette entry 0d is not just blacker than black, but imitates a sync pulse that can cause TVs and scalers/capture devices to lose sync.

What's more, The Immortal on NES used it to gain 1 more shade of grey. I've yet to be able to run it.

Directed Alberto Cetoli Graph 5d ago

Show thread

Leshem Choshen Mar 9

We hope the findings motivate context management systems that more carefully weigh the consequences of storing past model outputs.

📍paper: http://arxiv.org/abs/2602.24287
@lchoshen.bsky.social @ramon-astudillo.bsky.social Tamara Broderick, Jacob Andreas

🇧🇷 To appear at the ICLR 2026 MemAgents Workshop

Do LLMs Benefit From Their Own Words?

Multi-turn interactions with large language models typically retain the assistant's own past responses in the conversation history. In this work, we revisit this design choice by asking whether large language models benefit from conditioning on their own prior responses. Using in-the-wild, multi-turn conversations, we compare standard (full-context) prompting with a user-turn-only prompting approach that omits all previous assistant responses, across three open reasoning models and one state-of-the-art model. To our surprise, we find that removing prior assistant responses does not affect response quality on a large fraction of turns. Omitting assistant-side history can reduce cumulative context lengths by up to 10x. To explain this result, we find that multi-turn conversations consist of a substantial proportion (36.4%) of self-contained prompts, and that many follow-up prompts provide sufficient instruction to be answered using only the current user turn and prior user turns. When analyzing cases where user-turn-only prompting substantially outperforms full context, we identify instances of context pollution, in which models over-condition on their previous responses, introducing errors, hallucinations, or stylistic artifacts that propagate across turns. Motivated by these findings, we design a context-filtering approach that selectively omits assistant-side context. Our findings suggest that selectively omitting assistant history can improve response quality while reducing memory consumption.

arXiv.org

Directed Alberto Cetoli Graph 5d ago

Leshem Choshen Mar 9

Do LLMs Benefit from Their Own Words?🤔

In multi-turn chats, models are typically given their own past responses as context.
But do their own words always help…
Or are they more often a waste of compute and a distraction?
🧵
arxiv.org/abs/2602.24287
#AI

Directed Alberto Cetoli Graph Mar 8

WikiResearch Mar 8

RT @lewoniewski: 🛸 #ScienceFiction and #Fantasy in #Wikipedia: Exploring Structural and Semantic Cues https://t.co/0xpZtxL0PN https://t.co/…

via https://twitter.com/WikiResearch/status/2030497182626074860

Science Fiction and Fantasy in Wikipedia: Exploring Structural and Semantic Cues

Identifying which Wikipedia articles are related to science fiction, fantasy, or their hybrids is challenging because genre boundaries are porous and frequently overlap. Wikipedia nonetheless offers machine-readable structure beyond text, including categories, internal links (wikilinks), and statements if corresponding Wikidata items. However, each of these signals reflects community conventions and can be biased or incomplete. This study examines structural and semantic features of Wikipedia articles that can be used to identify content related to science fiction and fantasy (SF/F).

arXiv.org

Directed Alberto Cetoli Graph Mar 7

Andrew (Television Executive)Mar 7

The first two episodes of our new translation and score for the 1915 science fiction film Filibus: The Mysterious Air Pirate is out now.

You can watch it on Youtube: https://www.youtube.com/watch?v=0YtFNsrf3DU

and on New Ellijay Television:
https://vod.newellijay.tv/w/p/uYkJMUrdZBvkKgrZgeKXjt?playlistPosition=1

Filibus - The Mysterious Air Pirate (1915) - Chapter One - New Score and Translation

YouTube

Directed Alberto Cetoli Graph Mar 7

Emiel van Miltenburg Mar 5

RE: https://scholar.social/@evanmiltenburg/116177276888989619

Article is in Dutch, but maybe understandable via Google Translate.

TL;DR: we talk about how AI poetry is used in three different ways.
1. To study progress in creative computing;
2. To explore new literary forms and to see how people interpret those;
3. As a means to (deceptively) market AI with appeals to its 'humanity.'

Directed Alberto Cetoli Graph Mar 7