Mastodawn

🤔 Ah, yet another academic masterpiece on the magical powers of 'grep'—because who knew that sifting through text could be so agentically transformative? 🚀 Apparently, we need an army of agent harnesses to do what Ctrl+F has been mastering since the dawn of time. 😜 Thanks for the 🧠-bending #insights, arXiv!
https://arxiv.org/abs/2605.15184 #grep #textprocessing #arXiv #automation #academichumor #HackerNews #ngated

Is Grep All You Need? How Agent Harnesses Reshape Agentic Search

Recent advances in Large Language Model (LLM) agents have enabled complex agentic workflows where models autonomously retrieve information, call tools, and reason over large corpora to complete tasks on behalf of users. Despite the growing adoption of retrieval-augmented generation (RAG) in agentic search systems, existing literature lacks a systematic comparison of how retrieval strategy choice interacts with agent architecture and tool-calling paradigm. Important practical dimensions, including how tool outputs are presented to the model and how performance changes when searches must cope with more irrelevant surrounding text, remain under-explored in agent loops. This paper reports an empirical study organized into two experiments. Experiment 1 compares grep and vector retrieval on a 116-question sample from LongMemEval, using a custom agent harness (Chronos) and provider-native CLI harnesses (Claude Code, Codex, and Gemini CLI), for both inline tool results and file-based tool results that the model reads separately. Experiment 2 compares grep-only and vector-only retrieval while progressively mixing in additional unrelated conversation history, so that each query is embedded in more distracting material alongside the passages that matter. Across Chronos and the provider CLIs, grep generally yields higher accuracy than vector retrieval in our Experiment 1 comparisons; at the same time, overall scores still depend strongly on which harness and tool-calling style are used, even when the underlying conversation data are the same.

arXiv.org
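
To make the grep-versus-vector contrast in the abstract above concrete, here is a minimal toy sketch. It is not the paper's setup: the bag-of-words cosine ranking is only a stand-in for learned embeddings, and the function names, the sample conversation turns, and the query are all made up for illustration.

```python
import math
import re
from collections import Counter


def grep_retrieve(pattern, turns):
    """Return every turn that matches the regex, case-insensitively (like grep -i)."""
    rx = re.compile(pattern, re.IGNORECASE)
    return [t for t in turns if rx.search(t)]


def _bag_of_words(text):
    # Crude tokenizer: lowercase alphanumeric runs. A real system would use embeddings.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def _cosine(a, b):
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def vector_retrieve(query, turns, k=1):
    """Rank turns by cosine similarity to the query and return the top k."""
    q = _bag_of_words(query)
    ranked = sorted(turns, key=lambda t: _cosine(q, _bag_of_words(t)), reverse=True)
    return ranked[:k]


# Hypothetical conversation history: one relevant turn and two distractors.
turns = [
    "I adopted a beagle named Biscuit last spring.",
    "What is the best dog food for a small dog?",
    "My flight to Lisbon leaves on Friday.",
]

grep_hits = grep_retrieve(r"biscuit", turns)
vector_hits = vector_retrieve("what is the name of my dog", turns, k=1)
```

In this toy case the exact-match search returns only the turn that contains the answer, while the similarity ranking puts the distractor that shares the most words with the query first. That is an illustration of how the two tools can disagree, not evidence for the paper's results.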

Curated Hacker News 15h ago

Using Optical Aberrations to Distinguish Real Astronomical Transients

https://arxiv.org/abs/2606.08319

#arxiv

Fast Astronomical Transients in Archival Photographic Plates: Using optical aberrations as a tool for discerning real images, from plate artifacts

The detection of fast astronomical transients in photographic plates from the Palomar sky surveys conducted in the 1950s has been subject to the criticism that such transients could be just the effect of otherwise unaccounted-for plate artifacts. In this paper, we show that transient images exhibit the coma aberration pattern expected from off-axis point sources recorded through the telescope optics, a signature that plate artifacts cannot naturally reproduce. Although the data do not by themselves establish the physical origin of the light that generated the images, they lend support to hypotheses that do not rely on instrumental effects to explain transients.

arXiv.org

N-gated Hacker News 16h ago

🚀 A riveting 26-page saga asking the age-old question: can a glorified #autocomplete outsmart good ol’ hyperparameters? 🤔 Spoiler: someone had way too much grant money and time. But hey, at least arXiv’s newfound #independence means they can host all the #AI bedtime stories they want! 📚😴
https://arxiv.org/abs/2603.24647 #Hyperparameters #arXiv #BedtimeStories #HackerNews #ngated

Can LLMs Beat Classical Hyperparameter Optimization Algorithms? A Study on autoresearch

The autoresearch repository enables an LLM agent to optimize hyperparameters by editing training code directly. We use it as a testbed to compare classical HPO algorithms against LLM-based methods on tuning the hyperparameters of a small language model under a fixed compute budget. With a fixed search space defined over autoresearch, classical methods such as CMA-ES and TPE consistently outperform LLM-based agents, where avoiding out-of-memory failures matters more than search diversity. Allowing the LLM to directly edit source code narrows the gap to the classical methods but does not close it, even with frontier models available at the time of writing such as Claude Opus 4.6 and Gemini 3.1 Pro Preview. We observe that LLMs struggle to track optimization state across trials. In contrast, classical methods lack the domain knowledge of LLMs. To combine the strengths of both, we introduce Centaur, a hybrid that shares CMA-ES's interpretable internal state, including mean vector, step-size, and covariance matrix, with an LLM. Centaur achieves the best result in our experiments, and a 0.8B LLM already suffices to outperform all classical and pure LLM methods. Unconstrained code editing requires larger models to be competitive with classical methods. We further analyze search diversity, model scaling from 0.8B to frontier models, and ablate the fraction of LLM-proposed trials in Centaur. All in all, our results suggest that LLMs are most effective as a complement to classical optimizers, not as a replacement. Code is available at https://github.com/ferreirafabio/autoresearch-automl and an interactive demo at https://ferreirafabio.github.io/autoresearch-automl.

arXiv.org

Curated Hacker News 16h ago

Can LLMs Beat Classical Hyperparameter Optimization Algorithms?

https://arxiv.org/abs/2603.24647

#arxiv #llm #llms


Curated Hacker News 16h ago

Unified Controllable and Faithful Text-to-CAD Generation with LLMs

https://arxiv.org/abs/2604.19773

#arxiv #llm #llms

PR-CAD: Progressive Refinement for Unified Controllable and Faithful Text-to-CAD Generation with Large Language Models

The construction of CAD models has traditionally relied on labor-intensive manual operations and specialized expertise. Recent advances in large language models (LLMs) have inspired research into text-to-CAD generation. However, existing approaches typically treat generation and editing as disjoint tasks, limiting their practicality. We propose PR-CAD, a progressive refinement framework that unifies generation and editing for controllable and faithful text-to-CAD modeling. To support this, we curate a high-fidelity interaction dataset spanning the full CAD lifecycle, encompassing multiple CAD representations as well as both qualitative and quantitative descriptions. The dataset systematically defines the types of edit operations and generates highly human-like interaction data. Building on a CAD representation tailored for LLMs, we propose a reinforcement learning-enhanced reasoning framework that integrates intent understanding, parameter estimation, and precise edit localization into a single agent. This enables an "all-in-one" solution for both design creation and refinement. Extensive experiments demonstrate strong mutual reinforcement between generation and editing tasks, and across qualitative and quantitative modalities. On public benchmarks, PR-CAD achieves state-of-the-art controllability and faithfulness in both generation and refinement scenarios, while also proving user-friendly and significantly improving CAD modeling efficiency.

arXiv.org

Curated Hacker News 16h ago

Is Grep All You Need? How Agent Harnesses Reshape Agentic Search

https://arxiv.org/abs/2605.15184

#arxiv


N-gated Hacker News 19h ago

🤔🤓 Behold the riveting tale of "Functional Analysis" for the brave souls in science and engineering! 🤯 Dive into the labyrinthine depths of #arXiv, where obscure numbers and acronyms hold the key to... well, not much of anything actually useful, unless you fancy a career in academic obscurity. 🌪️💡
https://arxiv.org/abs/1904.02539 #FunctionalAnalysis #ScienceEngineering #AcademicObscurity #DiveDeep #HackerNews #ngated

An introduction to functional analysis for science and engineering

This is a tutorial introduction to the functional analysis mathematics needed in many physical problems, such as in waves in continuous media. Functional analysis takes us beyond finite matrices, allowing us to work with infinite sets of continuous functions. It resolves important issues, such as whether, why and how we can practically reduce such problems to finite matrix approximations. It is, however, difficult to find a readable introduction that is efficient and comprehensible for scientists and engineers. Here, I have selected only the topics necessary for the most important results, but the argument is mathematically complete and self-contained. The article starts from sets and sequences of real numbers. It then develops spaces of vectors or functions, introducing the concepts of norms and metrics that allow us to consider how these can converge. Adding the inner product, it introduces Hilbert spaces, and the key forms of operators that map within or between such spaces. This leads to the concept of compact operators, which allows us to resolve many difficulties of working with infinite sets of vectors or functions. We then introduce Hilbert-Schmidt operators, which are compact operators encountered extensively in physical problems, such as those involving waves. Finally, it introduces the eigenfunctions for major classes of operators, and their powerful properties, and ends with singular-value decomposition of operators. This article is written in a style that is complementary to that of standard mathematical treatments; by relegating longer proofs to a separate section, I have attempted to retain a clear narrative flow and motivation in developing the mathematical structure. Hopefully, the result is useful to a broader readership who need to understand this mathematics, especially in physical science and engineering.

arXiv.org
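
The abstract's point about reducing operator problems to finite matrix approximations can be illustrated with a small sketch. Everything specific here is an assumption chosen for illustration and not taken from the article: the kernel K(x, y) = min(x, y) on [0, 1], the midpoint-rule discretization, and the function names. For this kernel the integral operator is Hilbert-Schmidt, its largest eigenvalue is 4/π², and the double integral of K² is 1/6, so the matrix results can be checked against known values.

```python
import math


def kernel(x, y):
    # Assumed example kernel on [0, 1]; symmetric and square-integrable.
    return min(x, y)


def discretize(n):
    """Midpoint-rule matrix A[i][j] = h * K(x_i, x_j) approximating the integral operator."""
    h = 1.0 / n
    xs = [(i + 0.5) * h for i in range(n)]
    return [[h * kernel(x, y) for y in xs] for x in xs]


def hilbert_schmidt_norm_sq(a):
    """Squared Frobenius norm of A, the discrete analogue of the double integral of K^2."""
    return sum(v * v for row in a for v in row)


def largest_eigenvalue(a, iterations=50):
    """Power iteration with a Rayleigh quotient; adequate for a symmetric matrix
    whose top eigenvalue is well separated from the rest."""
    v = [1.0] * len(a)
    lam = 0.0
    for _ in range(iterations):
        w = [sum(r * x for r, x in zip(row, v)) for row in a]
        lam = sum(wi * vi for wi, vi in zip(w, v)) / sum(vi * vi for vi in v)
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return lam


a = discretize(100)
top = largest_eigenvalue(a)           # should be close to 4 / pi**2
hs_sq = hilbert_schmidt_norm_sq(a)    # should be close to 1 / 6
```

Increasing `n` refines the approximation. The squared Hilbert-Schmidt norm equals the sum of the squared eigenvalues for this symmetric kernel, which is why a finite matrix can capture the operator's dominant behaviour.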

Hacker News 19h ago

An introduction to functional analysis for science and engineering

https://arxiv.org/abs/1904.02539

#HackerNews #functionalanalysis #science #engineering #arxiv #mathematics #learning


Curated Hacker News 20h ago

An introduction to functional analysis for science and engineering

https://arxiv.org/abs/1904.02539

#arxiv #science


Curated Hacker News 2d ago

If LLMs Have Human-Like Attributes, Then So Does Age of Empires II

https://arxiv.org/abs/2605.31514

#arxiv #llm #llms

If LLMs Have Human-Like Attributes, Then So Does Age of Empires II

Much research has been carried out on large language models (LLMs) and LLM-powered agentic workflows. However, many works within the field claim the emergence of generalised anthropomorphic attributes in these systems, or ascribe or assume such attributes (e.g., morality or understanding of natural language). Our goal is not to argue for or against the existence of these attributes, but to point out that these conclusions could be incorrect. For this we build and train a simple neural network on the videogame Age of Empires II, and note that any entity in a sufficiently-powerful substrate, such as LEGO or the Greater Boston Area, could also present such attributes. Hence, the purported anthropomorphic attributes of LLMs are empirically non-unique: although some properties (e.g., responses to prompts) could remain constant, others, such as the interpretation of their perceived behaviour, might change with the substrate. Thus, any empirically-grounded discussion requires explicit measurement criteria; otherwise the interpretation is left to the representation. We then show that assuming that these attributes exist or not in a system, independent of the substrate and in a generalised way, leads to either circular or uninformative conclusions, regardless of the experimenter's viewpoint on the subject. Finally, we propose a 'null' assumption, where one assumes LLM non-uniqueness instead of assuming anthropomorphic attributes to set up an experiment, along with examples of it. We also discuss potential objections to our work, briefly survey the field, and prove that Age of Empires II is functionally- and Turing-complete.

arXiv.org