New paper on knowledge agents trained via reinforcement learning.
KARL (Knowledge Agents via Reinforcement Learning) proposes a framework where agents learn to retrieve, verify, and integrate information through iterative interaction rather than static prompting. The goal is to make LLM-based systems more reliable in knowledge-intensive tasks by training them to actively reason over external information sources.
https://arxiv.org/abs/2603.05218v1
#AgenticAI #KnowledgeAgents #LLMResearch
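The retrieve-verify-integrate loop the post describes can be pictured with a toy sketch. Nothing here is from the paper: the function names, the in-memory corpus, and the word-overlap retrieval are illustrative stand-ins for what KARL would learn via reinforcement learning.

```python
# Hypothetical sketch of a retrieve -> verify -> integrate loop.
# All names and the toy corpus are illustrative, not from the KARL paper.

TOY_CORPUS = {
    "capital_france": "Paris is the capital of France.",
    "capital_spain": "Madrid is the capital of Spain.",
}

def retrieve(query: str) -> list[str]:
    """Return documents sharing at least one word with the query."""
    words = set(query.lower().split())
    return [doc for doc in TOY_CORPUS.values()
            if words & set(doc.lower().split())]

def verify(claim: str, evidence: list[str]) -> bool:
    """Accept a claim only if some retrieved document states it."""
    return any(claim.lower() in doc.lower() for doc in evidence)

def answer(query: str, candidate: str):
    """Integrate: keep the candidate answer only when evidence supports it."""
    evidence = retrieve(query)
    return candidate if verify(candidate, evidence) else None

print(answer("capital of France", "Paris"))  # supported by evidence
print(answer("capital of France", "Lyon"))   # rejected: no supporting document
```

In KARL the retrieval and verification steps would be learned policies refined through interaction; the sketch only illustrates the control flow of acting over external information rather than answering from static prompting.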

New research shows an AI system can flag probable matches and shrink a pool of anonymous accounts to a short list, raising fresh questions about privacy, online identity and the power of LLMs. How safe is anonymity when machine learning gets that good? Dive into the methodology and implications. #AIPrivacy #AnonymousAccounts #LLMResearch #MachineLearning

🔗 https://aidailypost.com/news/ai-system-flags-probable-matches-narrows-anonymous-accounts-shortlist


Think of LLMs not as “models that answer,” but as systems that act—reading files, running code, fetching resources, and verifying results across domains.
A new paper introduces LLM-in-Sandbox, showing that large language models gain markedly stronger agentic capabilities when they can explore a virtual computer environment instead of only generating text.
https://arxiv.org/abs/2601.16206v1
#AgenticAI #LLMResearch #ArtificialIntelligence
LLM-in-Sandbox Elicits General Agentic Intelligence

We introduce LLM-in-Sandbox, enabling LLMs to explore within a code sandbox (i.e., a virtual computer), to elicit general intelligence in non-code domains. We first demonstrate that strong LLMs, without additional training, exhibit generalization capabilities to leverage the code sandbox for non-code tasks. For example, LLMs spontaneously access external resources to acquire new knowledge, leverage the file system to handle long contexts, and execute scripts to satisfy formatting requirements. We further show that these agentic capabilities can be enhanced through LLM-in-Sandbox Reinforcement Learning (LLM-in-Sandbox-RL), which uses only non-agentic data to train models for sandbox exploration. Experiments demonstrate that LLM-in-Sandbox, in both training-free and post-trained settings, achieves robust generalization spanning mathematics, physics, chemistry, biomedicine, long-context understanding, and instruction following. Finally, we analyze LLM-in-Sandbox's efficiency from computational and system perspectives, and open-source it as a Python package to facilitate real-world deployment.
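One way to picture the core mechanism is a minimal sandbox step, assuming only that model-emitted code runs in an isolated process and its output comes back as the next observation. The helper name is hypothetical; the paper's open-source package will differ.

```python
# Minimal sketch of one sandbox step: execute model-emitted Python in a
# separate process and return its output as the next observation.
# `run_in_sandbox` is an assumed name, not the paper's API.
import subprocess
import sys
import tempfile

def run_in_sandbox(code: str, timeout: float = 5.0) -> str:
    """Write the code to a temp file, run it, and capture stdout/stderr."""
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(code)
        path = f.name
    result = subprocess.run(
        [sys.executable, path],
        capture_output=True, text=True, timeout=timeout,
    )
    return result.stdout + result.stderr

# e.g. the model writes a script to satisfy a formatting requirement:
observation = run_in_sandbox('print(f"{3.14159:.2f}")')
print(observation)  # prints "3.14"
```

A real deployment would add isolation (containers, resource limits, no network by default); the sketch only shows the execute-and-observe loop the abstract describes.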

Breaking Quadratic Barriers: A Non-Attention LLM for Ultra-Long Context Horizons

https://arxiv.org/abs/2506.01963

#HackerNews #BreakingQuadraticBarriers #NonAttentionLLM #UltraLongContext #LLMResearch #AIInnovation

We present a novel non-attention-based architecture for large language models (LLMs) that efficiently handles very long context windows, on the order of hundreds of thousands to potentially millions of tokens. Unlike traditional Transformer designs, which suffer from quadratic memory and computation overhead due to the nature of the self-attention mechanism, our model avoids token-to-token attention entirely. Instead, it combines the following complementary components: State Space blocks (inspired by S4) that learn continuous-time convolution kernels and scale near-linearly with sequence length, Multi-Resolution Convolution layers that capture local context at different dilation levels, a lightweight Recurrent Supervisor that maintains a global hidden state across sequential chunks, and Retrieval-Augmented External Memory that stores and retrieves high-level chunk embeddings without reintroducing quadratic operations.
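Two of the named components are easy to sketch in miniature. The toy below uses assumed scalar shapes and update rules (the paper's layers are learned and vector-valued): causal dilated convolutions summed across dilation levels for local context, and a recurrent supervisor carrying a global state across sequential chunks.

```python
# Illustrative sketch of two components from the abstract. Shapes and update
# rules are assumptions for exposition, not the paper's implementation.

def causal_dilated_conv(x: list[float], kernel: list[float], dilation: int) -> list[float]:
    """y[t] = sum_k kernel[k] * x[t - k*dilation], zero-padded on the left."""
    out = []
    for t in range(len(x)):
        acc = 0.0
        for k, w in enumerate(kernel):
            idx = t - k * dilation
            if idx >= 0:
                acc += w * x[idx]
        out.append(acc)
    return out

def multi_resolution(x: list[float], kernel: list[float], dilations=(1, 2, 4)) -> list[float]:
    """Sum the same kernel applied at several dilation levels."""
    branches = [causal_dilated_conv(x, kernel, d) for d in dilations]
    return [sum(vals) for vals in zip(*branches)]

def recurrent_supervisor(chunks: list[list[float]], decay: float = 0.9) -> list[float]:
    """Carry one scalar state across chunks: h <- decay*h + mean(chunk)."""
    h, states = 0.0, []
    for chunk in chunks:
        h = decay * h + sum(chunk) / len(chunk)
        states.append(h)
    return states

x = [1.0, 0.0, 0.0, 0.0, 0.0]  # unit impulse
y = multi_resolution(x, kernel=[1.0, 0.5])
# the impulse echoes at offsets 1, 2, and 4, one per dilation branch
print(y)
```

Both pieces cost O(sequence length) per kernel tap, which is the point of the paper's design: local and global context without any token-to-token attention matrix.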

Four papers on LLM reasoning summarized by @melaniemitchell https://aiguide.substack.com/p/the-llm-reasoning-debate-heats-up along with the background in her latest post. Of these, the chain-of-thought prompting paper's attempt to identify the sources of predictions (memorization vs. reasoning) is very interesting, although chaotic. Stats people might hate the conclusions. #LLMReasoning #LLMResearch
The LLM Reasoning Debate Heats Up

Three recent papers examine the robustness of reasoning and problem-solving in large language models

AI: A Guide for Thinking Humans