"Metagradient Descent" promises the magic of optimizing ML, but is more like watching paint dry at warp speed. With support from the mystical Simons Foundation, we now have another wizardry paper that's essentially just trying to make gradients great again.
https://arxiv.org/abs/2503.13751 #MetagradientDescent #MLoptimization #SimonsFoundation #AIresearch #GradientMagic #HackerNews #ngated
Optimizing ML Training with Metagradient Descent
A major challenge in training large-scale machine learning models is configuring the training process to maximize model performance, i.e., finding the best training setup from a vast design space. In this work, we unlock a gradient-based approach to this problem. We first introduce an algorithm for efficiently calculating metagradients -- gradients through model training -- at scale. We then introduce a "smooth model training" framework that enables effective optimization using metagradients. With metagradient descent (MGD), we greatly improve on existing dataset selection methods, outperform accuracy-degrading data poisoning attacks by an order of magnitude, and automatically find competitive learning rate schedules.
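To make the core idea concrete, below is a minimal, hypothetical sketch (not the paper's algorithm or code) of what a "metagradient" is: the gradient of a final validation loss with respect to a training hyperparameter, obtained by differentiating through an unrolled training loop. The toy data, the `train_then_validate` function, and the choice of learning rate as the meta-parameter are all illustrative assumptions; the paper's contribution is making this kind of computation efficient at scale.

```python
import jax
import jax.numpy as jnp

# Toy regression data (hypothetical example, not from the paper).
key = jax.random.PRNGKey(0)
X = jax.random.normal(key, (64, 3))
true_w = jnp.array([1.0, -2.0, 0.5])
y = X @ true_w
X_val, y_val = X[:16], y[:16]

def loss(w, X, y):
    return jnp.mean((X @ w - y) ** 2)

def train_then_validate(log_lr):
    """Run a few SGD steps with step size exp(log_lr), return validation loss."""
    lr = jnp.exp(log_lr)
    w = jnp.zeros(3)
    for _ in range(20):                          # unrolled training loop
        w = w - lr * jax.grad(loss)(w, X, y)     # one SGD step
    return loss(w, X_val, y_val)

# The metagradient: d(validation loss) / d(log learning rate), taken *through* training.
metagrad = jax.grad(train_then_validate)(jnp.log(0.05))
print(metagrad)
```

In this sketch one could run gradient descent on `log_lr` itself using `metagrad`; metagradient descent applies the same principle to much richer training-design choices (data selection, poisoning, learning rate schedules), which is where the scaling challenge the paper addresses comes in.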