Some of my favorites from #NeurIPS2025

more negative max Lyapunov exponent => faster convergence when parallelizing nonlinear RNNs
Gonzalez, X., Kozachkov, L., Zoltowski, D. M., Clarkson, K. L., & Linderman, S. Predictability Enables Parallelization of Nonlinear State Space Models. https://openreview.net/forum?id=7AGXSlXcK6

The rise of parallel computing hardware has made it increasingly important to understand which nonlinear state space models can be efficiently parallelized. Recent advances have shown that...
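The intuition behind the headline claim can be sketched in a few lines. Caveat: this is not the paper's solver (which, per the abstract's framing, uses more sophisticated fixed-point/Newton-style iterations); it is the simplest Picard/Jacobi iteration on a scalar tanh RNN, where each sweep updates all timesteps at once and is therefore parallelizable across the sequence. More contractive dynamics (a more negative Lyapunov exponent, here controlled by `lam`) mean fewer sweeps to convergence:

```python
import numpy as np

def sequential_scan(lam, x, h0=0.0):
    # Ground truth: the inherently sequential recurrence h_t = tanh(lam*h_{t-1} + x_t).
    h, out = h0, []
    for xt in x:
        h = np.tanh(lam * h + xt)
        out.append(h)
    return np.array(out)

def parallel_scan(lam, x, h0=0.0, tol=1e-8, max_sweeps=10_000):
    # Picard/Jacobi sketch: every timestep is updated at once from the
    # previous sweep's guess, so each sweep is one parallel op across t.
    T = len(x)
    H = np.zeros(T)
    for k in range(1, max_sweeps + 1):
        prev = np.concatenate(([h0], H[:-1]))  # shifted previous iterate
        H_new = np.tanh(lam * prev + x)
        if np.max(np.abs(H_new - H)) < tol:
            return H_new, k
        H = H_new
    return H, max_sweeps

rng = np.random.default_rng(0)
x = rng.normal(size=300) * 0.5
h_fast, sweeps_fast = parallel_scan(0.3, x)   # strongly contractive
h_slow, sweeps_slow = parallel_scan(0.9, x)   # weakly contractive
```

With `lam = 0.3` the per-sweep error contracts by at least that factor, so convergence takes far fewer sweeps than the sequence length; with `lam = 0.9` many more sweeps are needed, matching the predictability-enables-parallelization story.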

related:

Tricks to make it even faster.
Zoltowski, D. M., Wu, S., Gonzalez, X., Kozachkov, L., & Linderman, S. (2025). Parallelizing MCMC Across the Sequence Length. The Thirty-Ninth Annual Conference on Neural Information Processing Systems. https://openreview.net/forum?id=QOjUNzOkRN

Markov chain Monte Carlo (MCMC) methods are foundational algorithms for Bayesian inference and probabilistic modeling. However, most MCMC algorithms are inherently sequential and their time...

analysis of the coupled agent-environment dynamical system to study learning #cybernetics #learningdynamics
Ger, Y., & Barak, O. (2025). Learning Dynamics of RNNs in Closed-Loop Environments. arXiv:2505.13567. http://arxiv.org/abs/2505.13567

Recurrent neural networks (RNNs) trained on neuroscience-inspired tasks offer powerful models of brain computation. However, typical training paradigms rely on open-loop, supervised settings, whereas real-world learning unfolds in closed-loop environments. Here, we develop a mathematical theory describing the learning dynamics of linear RNNs trained in closed-loop contexts. We first demonstrate that two otherwise identical RNNs, trained in either closed- or open-loop modes, follow markedly different learning trajectories. To probe this divergence, we analytically characterize the closed-loop case, revealing distinct stages aligned with the evolution of the training loss. Specifically, we show that the learning dynamics of closed-loop RNNs, in contrast to open-loop ones, are governed by an interplay between two competing objectives: short-term policy improvement and long-term stability of the agent-environment interaction. Finally, we apply our framework to a realistic motor control task, highlighting its broader applicability. Taken together, our results underscore the importance of modeling closed-loop dynamics in a biologically plausible setting.
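The closed- vs open-loop distinction can be made concrete in a scalar toy (my sketch, not the paper's linear-RNN model): an unstable plant with a one-parameter feedback policy. An "open-loop" gradient freezes the visited states as data (teacher forcing) and only counts each action's one-step effect, while the closed-loop gradient differentiates through the feedback and so also penalizes long-horizon instability of the loop:

```python
import numpy as np

A, T, S0 = 1.2, 20, 1.0  # unstable plant s_{t+1} = A*s_t + u_t, policy u_t = w*s_t

def rollout_loss(w):
    # Closed loop: the action u_t = w*s_t feeds back into the plant state.
    s, loss = S0, 0.0
    for _ in range(T):
        s = (A + w) * s
        loss += s * s
    return loss

def closed_loop_grad(w, eps=1e-6):
    # Full gradient through the rollout (finite differences): the effect of w
    # on every future state is counted, including long-term stability.
    return (rollout_loss(w + eps) - rollout_loss(w - eps)) / (2 * eps)

def open_loop_grad(w):
    # "Open-loop" gradient: states are frozen as data, so only the direct
    # one-step effect of u_t on s_{t+1}^2 is counted (teacher forcing).
    s, g = S0, 0.0
    for _ in range(T):
        s_next = (A + w) * s
        g += 2.0 * s_next * s
        s = s_next
    return g

# Normalized (sign) descent on the closed-loop gradient stabilizes the loop;
# sign steps are used because raw gradients explode for this unstable plant.
w = 0.0
for _ in range(100):
    w -= 0.05 * np.sign(closed_loop_grad(w))
```

At the same parameter value, the closed-loop gradient is an order of magnitude larger than the open-loop one because it weights long-horizon instability, so the two training rules trace different trajectories, consistent with the paper's point that the closed-loop dynamics balance short-term policy improvement against loop stability.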

score/flow-matching diffusion models only start memorizing when trained for long enough
Bonnaire, T., Urfin, R., Biroli, G., & Mézard, M. (2025). Why Diffusion Models Don’t Memorize: The Role of Implicit Dynamical Regularization in Training. https://openreview.net/forum?id=BSZqpqgqM0

Diffusion models have achieved remarkable success across a wide range of generative tasks. A key challenge is understanding the mechanisms that prevent their memorization of training data and allow...

Theoretical Insights on Training Instability in Deep Learning TUTORIAL
https://uuujf.github.io/instability/

the gradient flow-like regime is slow and can overfit, while a large (but not too large) step size can transiently go far, converge faster, and find better solutions #optimization #NeurIPS2025
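The three step-size regimes show up already on a toy quadratic (my sketch, not from the tutorial): for f(x) = c·x²/2, gradient descent gives x_{k+1} = (1 - step·c)·x_k, so it is stable iff step < 2/c. A tiny step mimics gradient flow and converges slowly; a step just below the threshold oscillates across the minimum yet converges much faster; a step above it diverges. (The "finds better solutions" part needs nonconvexity and is not captured by this toy.)

```python
def gd(step, x0=1.0, iters=50, curvature=1.0):
    # Gradient descent on f(x) = curvature * x^2 / 2.
    # Iterates obey x_{k+1} = (1 - step*curvature) * x_k,
    # so they converge iff step < 2/curvature.
    x, traj = x0, [x0]
    for _ in range(iters):
        x -= step * curvature * x
        traj.append(x)
    return traj

slow = gd(0.1)      # gradient-flow-like: monotone but slow
fast = gd(1.5)      # near the 2/curvature edge: oscillatory, much faster
unstable = gd(2.1)  # past the edge: diverges
```

The `fast` run overshoots the minimum every step (the iterate flips sign) yet its error shrinks by a factor 0.5 per step versus 0.9 for the `slow` run, a minimal instance of instability-adjacent step sizes converging faster.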
