Mastodawn

🚀 A riveting 26-page saga asking the age-old question: can a glorified #autocomplete outsmart good ol’ hyperparameters? 🤔 Spoiler: someone had way too much grant money and time. But hey, at least arXiv’s newfound #independence means they can host all the #AI bedtime stories they want! 📚😴
https://arxiv.org/abs/2603.24647 #Hyperparameters #arXiv #BedtimeStories #HackerNews #ngated

Can LLMs Beat Classical Hyperparameter Optimization Algorithms? A Study on autoresearch

The autoresearch repository enables an LLM agent to optimize hyperparameters by editing training code directly. We use it as a testbed to compare classical HPO algorithms against LLM-based methods on tuning the hyperparameters of a small language model under a fixed compute budget. When defining a fixed search space over autoresearch, classical methods such as CMA-ES and TPE consistently outperform LLM-based agents, where avoiding out-of-memory failures matters more than search diversity. Allowing the LLM to directly edit source code narrows the gap to the classical methods but does not close it, even with frontier models available at the time of writing such as Claude Opus 4.6 and Gemini 3.1 Pro Preview. We observe that LLMs struggle to track optimization state across trials. In contrast, classical methods lack the domain knowledge of LLMs. To combine the strengths of both, we introduce Centaur, a hybrid that shares CMA-ES's interpretable internal state, including mean vector, step-size, and covariance matrix, with an LLM. Centaur achieves the best result in our experiments, and a 0.8B LLM already suffices to outperform all classical and pure LLM methods. Unconstrained code editing requires larger models to be competitive with classical methods. We further analyze search diversity, model scaling from 0.8B to frontier models, and ablate the fraction of LLM-proposed trials in Centaur. All in all, our results suggest that LLMs are most effective as a complement to classical optimizers, not as a replacement. Code is available at https://github.com/ferreirafabio/autoresearch-automl & interactive demo at https://ferreirafabio.github.io/autoresearch-automl.
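The idea of sharing an optimizer's interpretable state with an LLM can be illustrated with a minimal Python sketch. It is not the Centaur implementation from the paper: the function names (`format_state`, `sample_gaussian`, `propose`, `stub_llm`), the prompt layout, and the `llm_fraction` parameter are all hypothetical, sampling uses only the diagonal of the covariance matrix, and the CMA-ES update step is omitted entirely.

```python
import random

def format_state(mean, sigma, cov, names):
    """Render a CMA-ES-style state (mean, step-size, covariance) as plain text.

    Hypothetical prompt layout; the paper's actual format is not shown here.
    """
    lines = [f"step_size: {sigma:.4g}"]
    for i, name in enumerate(names):
        # Per-parameter standard deviation implied by sigma and the covariance diagonal.
        std = sigma * cov[i][i] ** 0.5
        lines.append(f"{name}: mean={mean[i]:.4g}, std={std:.4g}")
    return "\n".join(lines)

def sample_gaussian(mean, sigma, cov, rng):
    """Draw one candidate around the mean (diagonal covariance only, for brevity)."""
    return [m + sigma * cov[i][i] ** 0.5 * rng.gauss(0.0, 1.0)
            for i, m in enumerate(mean)]

def propose(mean, sigma, cov, names, llm_propose, llm_fraction, rng):
    """Return (candidate, source); a fraction of trials comes from the LLM."""
    if rng.random() < llm_fraction:
        return llm_propose(format_state(mean, sigma, cov, names)), "llm"
    return sample_gaussian(mean, sigma, cov, rng), "optimizer"

def stub_llm(prompt):
    """Stand-in for a real LLM call (assumption): always returns a fixed guess."""
    return [3e-4, 64.0]

names = ["learning_rate", "batch_size"]
mean = [0.001, 32.0]
sigma = 0.5
cov = [[4.0, 0.0], [0.0, 1.0]]
rng = random.Random(0)

state_text = format_state(mean, sigma, cov, names)
llm_candidate, llm_source = propose(mean, sigma, cov, names, stub_llm, 1.0, rng)
opt_candidate, opt_source = propose(mean, sigma, cov, names, stub_llm, 0.0, rng)
```

In a real loop, each evaluated candidate would be fed back to the optimizer so that the mean, step-size, and covariance are updated before the next prompt is built. The fraction of LLM-proposed trials is the quantity the abstract says is ablated.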


SRF IRIS Apr 11, 2025

IRIS Insights | Nico Formanek: Are hyperparameters vibes?
April 24, 2025, 2:00 p.m. (CEST)
Our second IRIS Insights talk will take place with Nico Formanek.
🟦
This talk will discuss the role of hyperparameters in optimization methods for model selection (currently often called ML) from a philosophy of science point of view. Special consideration is given to the question of whether there can be principled ways to fix hyperparameters in a maximally agnostic setting.
🟦
This is a WebEx talk to which everyone who is interested is cordially invited. It will take place in English. Our IRIS speaker, Jun.-Prof. Dr. Maria Wirzberger, will moderate it. Following Nico Formanek's presentation, there will be an opportunity to ask questions. We look forward to active participation.
🟦
Please join this Webex talk using the following link:
https://lnkd.in/eJNiUQKV
🟦
#Hyperparameters #ModelSelection #Optimization #MLMethods #PhilosophyOfScience #ScientificMethod #AgnosticLearning #MachineLearning #InterdisciplinaryResearch #AIandPhilosophy #EthicsInAI #ResponsibleAI #AITheory #WebTalk #OnlineLecture #ResearchTalk #ScienceEvents #OpenInvitation #AICommunity #LinkedInScience #TechPhilosophy #AIConversations


JMLR Nov 29, 2024

'Empirical Design in Reinforcement Learning', by Andrew Patterson, Samuel Neumann, Martha White, Adam White.

http://jmlr.org/papers/v25/23-0183.html

#reinforcement #experiments #hyperparameters


Carl Gold, PhD Nov 27, 2024

#CausalML update - I am now fitting my first #CausalForest on real data!

Does anyone have advice on the most important #hyperparameters (after the # of trees & tree depth)?

I'm working with large, imbalanced data sets and a large number of treatment variables, so it's not like anything you see in the economics literature. 🤔 #ML #AI #causal

JMLR Sep 11, 2024

'On the Hyperparameters in Stochastic Gradient Descent with Momentum', by Bin Shi.

http://jmlr.org/papers/v25/22-1189.html

#sgd #hyperparameters #stochastic


JMLR Aug 9, 2024

'Pre-trained Gaussian Processes for Bayesian Optimization', by Zi Wang et al.

http://jmlr.org/papers/v25/23-0269.html

#priors #prior #hyperparameters


JMLR Aug 3, 2024

'An Algorithmic Framework for the Optimization of Deep Neural Networks Architectures and Hyperparameters', by Julie Keisler, El-Ghazali Talbi, Sandra Claudel, Gilles Cabriel.

http://jmlr.org/papers/v25/23-0166.html

#forecasting #algorithmic #hyperparameters


JMLR Apr 14, 2024

'Low-rank Variational Bayes correction to the Laplace method', by Janet van Niekerk, Haavard Rue.

http://jmlr.org/papers/v25/21-1405.html

#variational #hyperparameters #approximations


Chris Arnold Mar 6, 2024

📢 Publication alert: "The Role of Hyperparameters in Machine Learning Models and How to Tune Them" with Luka Biedebach, Andreas Küpfer and Marcel Neunhoeffer in Political Science Research and Methods. Margeret is loving #hyperparameters. Do you? #sciencerocks #machinelearning #socialdatascience https://doi.org/10.1017/psrm.2023.61 🧵 [1/5]

The role of hyperparameters in machine learning models and how to tune them | Political Science Research and Methods | Cambridge Core


Chris Arnold Jul 10, 2023

New working paper: "The Role of #Hyperparameters in #MachineLearning Models and How to Tune Them". We suggest: Handle HPs with the same loving care as parameter estimates, or you could end up choosing the wrong model. https://tinyurl.com/mr2akrn3