fly51fly (@fly51fly)

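The paper 'Stabilizing Native Low-Rank LLM Pretraining' (2026) has been posted on arXiv. Researchers from Concordia and Sorbonne (P. Janson, E. Oyallon, E. Belilovsky, and others) address the instability of low-rank LLM pretraining and propose stabilization techniques, with important implications for making large models more efficient and for improving pretraining.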

https://x.com/fly51fly/status/2023571954712957301

#llm #pretraining #lowrank #research

fly51fly (@fly51fly) on X

[LG] Stabilizing Native Low-Rank LLM Pretraining P Janson, E Oyallon, E Belilovsky [Concordia University & Sorbonne University] (2026) https://t.co/p1gT6uCevs

X (formerly Twitter)

A connection between sparse and low-rank matrices. Let S be a sparse similarity matrix, for example one holding the distances to the 3 nearest neighbours on a low-dimensional manifold. Can you recover S if you have a low-rank (dense) matrix L from a high-dimensional space? This paper provides a geometric interpretation for S = max(0, L). It proposes a decomposition algorithm that can be modelled as a ReLU neural network layer.

#MachineLearning #SparseDecomposition #LowRank #TMLR
https://openreview.net/forum?id=p8gncJbMit

A geometrical connection between sparse and low-rank matrices and...

We consider when a sparse nonnegative matrix $\mathbf{S}$ can be recovered, via an elementwise nonlinearity, from a real-valued matrix $\mathbf{L}$ of significantly lower rank. Of particular...

OpenReview
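To make the S = max(0, L) idea concrete, here is a toy sketch in NumPy. It is not the paper's algorithm: the circle geometry, the 12 points, and the threshold `c` are assumptions chosen for illustration. Points on a circle have a cosine-similarity matrix of rank 2; subtracting a constant gives a rank-3 dense matrix L, and the elementwise ReLU of L keeps only each point and its two adjacent neighbours.

```python
import numpy as np

# Toy setup (assumed for illustration): n points evenly spaced on the unit circle.
n = 12
theta = 2 * np.pi * np.arange(n) / n

# Cosine similarity cos(theta_i - theta_j) has rank 2 (cos*cos + sin*sin).
cos_sim = np.cos(theta[:, None] - theta[None, :])

# Threshold halfway between the 1st and 2nd neighbour on the circle,
# so only a point itself and its two adjacent neighbours stay positive.
c = np.cos(1.5 * 2 * np.pi / n)

L = cos_sim - c          # dense, rank 3 (rank-2 part plus a constant shift)
S = np.maximum(0.0, L)   # elementwise ReLU: sparse and nonnegative

rank_L = np.linalg.matrix_rank(L, tol=1e-9)
rank_S = np.linalg.matrix_rank(S, tol=1e-9)
nonzeros_per_row = (S > 0).sum(axis=1)

print("rank(L) =", rank_L)
print("rank(S) =", rank_S)
print("nonzeros per row of S:", nonzeros_per_row)
```

In this toy case the sparse matrix S has full rank while L has rank 3, which is the kind of gap the post describes: a sparse neighbourhood structure hidden behind a much lower-rank dense matrix.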

Steffen Schotthöfer & Emanuele Zangrando from our lab are attending #NeurIPS next week in person and will present our work on #lowrank #training & #pruning of #NNs

Meet both at the poster session in Hall J #604 on Wed 30 Nov, 9:30 am PST

What is it? We developed a framework to perform stable and efficient training on low-rank manifolds, resulting in an order of magnitude lower memory cost & training time! Tested successfully on #imagenet1k, #transformers, and other benchmarks

https://openreview.net/forum?id=IILJ0KWZMy9&referrer=%5Bthe%20profile%20of%20Francesco%20Tudisco%5D(%2Fprofile%3Fid%3D~Francesco_Tudisco1)

Low-rank lottery tickets: finding efficient low-rank neural...

The paper presents a novel neural network training algorithm that allows for efficient low-rank network training and a search for low-rank subnetworks

OpenReview
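As a rough illustration of where the memory savings of low-rank training can come from, the NumPy sketch below compares a dense weight matrix with a factored one. The factorization W ≈ U S Vᵀ, the layer size 1024 x 1024, and the rank 32 are assumptions made for this example, not details taken from the post.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical layer shape and rank, chosen only for illustration.
m, n, r = 1024, 1024, 32

# Assumed factorization W ~= U @ S @ V.T with small inner dimension r.
U = rng.standard_normal((m, r))
S = rng.standard_normal((r, r))
V = rng.standard_normal((n, r))

x = rng.standard_normal(n)

# Apply the factors right to left; the m x n matrix is never formed.
y_factored = U @ (S @ (V.T @ x))

# Reference: build the dense matrix explicitly and apply it.
y_dense = (U @ S @ V.T) @ x

dense_params = m * n
factored_params = m * r + r * r + n * r
ratio = dense_params / factored_params

print("outputs match:", np.allclose(y_factored, y_dense))
print("dense parameters:   ", dense_params)
print("factored parameters:", factored_params)
print("reduction factor:   ", round(ratio, 2))
```

With these assumed sizes the factored layer stores roughly 15.8 times fewer parameters than the dense one, which is in line with the "order of magnitude" figure quoted in the post, though the actual savings depend on the ranks the method selects.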