'A minimax optimal approach to high-dimensional double sparse linear regression', by Yanhang Zhang, Zhifan Li, Shixiang Liu, Jianxin Yin.
http://jmlr.org/papers/v25/23-0653.html
#sparse #thresholding #sparsity
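As I read it, "double sparse" here means group sparsity plus element-wise sparsity inside each active group: at most s active groups, each with at most s0 nonzero coefficients. A rough numpy sketch of what an (s, s0)-sparse projection could look like, purely illustrative and not the paper's actual estimator or tuning:

```python
import numpy as np

def double_sparse_project(beta, groups, s, s0):
    """Illustrative (s, s0)-sparse projection: keep at most s0 entries per
    group (by magnitude), then keep only the s groups with the largest
    remaining norm. Not the paper's algorithm, just the sparsity pattern."""
    out = np.zeros_like(beta)
    group_norms = []
    for g, idx in enumerate(groups):              # groups: list of index arrays
        vals = beta[idx]
        keep = np.argsort(np.abs(vals))[-s0:]     # top-s0 entries in this group
        kept = np.zeros_like(vals)
        kept[keep] = vals[keep]
        out[idx] = kept
        group_norms.append((np.linalg.norm(kept), g))
    active = {g for _, g in sorted(group_norms, reverse=True)[:s]}
    for g, idx in enumerate(groups):              # zero out non-selected groups
        if g not in active:
            out[idx] = 0.0
    return out
```

Plugging a projection like this into gradient steps gives an IHT-style iteration, which I believe is roughly the family of algorithms the paper analyzes; see the paper itself for the actual method and the minimax rates.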
'skscope: Fast Sparsity-Constrained Optimization in Python', by Zezhi Wang, Junxian Zhu, Xueqin Wang, Jin Zhu, Huiyang Peng, Peng Chen, Anran Wang, Xiaoke Zhang.
http://jmlr.org/papers/v25/23-1574.html
#sparse #optimization #sparsity
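skscope's pitch, as I understand it, is that you hand it a differentiable (JAX-traceable) loss plus a sparsity level k and it searches for the best k-sparse minimizer. The call pattern below is from my memory of the docs, so treat the class and argument names as assumptions and check the documentation:

```python
# Hedged sketch of the skscope interface (ScopeSolver(dimension, sparsity)
# plus solve(objective) with a jax.numpy loss). Exact names are from memory,
# not verified against the current release.
import numpy as np
import jax.numpy as jnp
from skscope import ScopeSolver  # assumed import path

n, p, k = 100, 500, 5
rng = np.random.default_rng(0)
X = rng.normal(size=(n, p))
beta_true = np.zeros(p)
beta_true[:k] = 3.0
y = X @ beta_true + 0.1 * rng.normal(size=n)

def objective(params):
    # squared-error loss, written with jax.numpy so the solver can autodiff it
    return jnp.sum((y - X @ params) ** 2)

solver = ScopeSolver(p, k)        # dimension, sparsity level (assumed signature)
beta_hat = solver.solve(objective)
print(np.nonzero(beta_hat)[0])    # should roughly recover the first k indices
```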
#mistral's 8x22B is ~260GB
the trend is to make models smaller, not bigger
#pruning, #sparsity, #quantization, #distillation
so why such a huge model?
does mistral have no other models?
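for scale, the back-of-the-envelope arithmetic behind that number (the ~141B total parameters for an 8x22B MoE with shared attention layers is my assumption, and it lands in the same ballpark as the ~260GB download):

```python
# Rough model-size arithmetic: parameters x bytes-per-parameter.
# The 141B total-parameter count is an assumption, not an official figure.
def model_size_gb(n_params: float, bytes_per_param: float) -> float:
    return n_params * bytes_per_param / 1e9

n_params = 141e9
for name, bytes_per_param in [("fp16/bf16", 2), ("int8", 1), ("int4", 0.5)]:
    print(f"{name:>9}: ~{model_size_gb(n_params, bytes_per_param):.0f} GB")
# roughly 282 / 141 / 70 GB: quantization (plus pruning and distillation)
# is the usual route back onto a single node.
```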
Revisiting Sparsity Hunting in Federated Learning: Why does Sparsity Consensus Matter?
'Fundamental limits and algorithms for sparse linear regression with sublinear sparsity', by Lan V. Truong.
http://jmlr.org/papers/v24/21-0543.html
#sparse #sparsity #interpolation
New podcast episode from @thegradient with Hattie Zhou (twitter: https://twitter.com/oh_that_hat):
`Lottery Tickets and Algorithmic Reasoning in LLMs`
https://thegradientpub.substack.com/p/hattie-zhou-lottery-tickets-and-algorithmic
The first half is focused on the lottery ticket hypothesis, which is a favorite topic of mine.
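For anyone who hasn't met it: the lottery ticket hypothesis says a dense network at initialization already contains a sparse subnetwork that, trained on its own from (near) the original initialization, matches the dense network's accuracy. A toy sketch of the usual iterative-magnitude-pruning-with-rewinding recipe, my paraphrase and not anything specific from the episode:

```python
# Minimal lottery-ticket-style loop: train -> prune smallest surviving
# weights -> rewind survivors to their initial values -> retrain.
# Toy data and a tiny MLP, just to make the recipe concrete.
import copy
import torch
import torch.nn as nn

torch.manual_seed(0)
X, y = torch.randn(512, 20), torch.randint(0, 2, (512,))
model = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 2))
init_state = copy.deepcopy(model.state_dict())   # weights we rewind to
masks = {n: torch.ones_like(p) for n, p in model.named_parameters() if p.dim() > 1}

def train(model, steps=200):
    opt = torch.optim.Adam(model.parameters(), lr=1e-2)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(steps):
        opt.zero_grad()
        loss_fn(model(X), y).backward()
        opt.step()
        with torch.no_grad():                    # keep pruned weights at zero
            for n, p in model.named_parameters():
                if n in masks:
                    p *= masks[n]

for rnd in range(3):                             # 3 rounds of prune + rewind
    train(model)
    with torch.no_grad():
        for n, p in model.named_parameters():
            if n not in masks:
                continue
            alive = p[masks[n].bool()].abs()
            cutoff = alive.quantile(0.2)         # drop 20% of surviving weights
            masks[n] = (p.abs() >= cutoff).float() * masks[n]
    model.load_state_dict(init_state)            # rewind to initialization
    with torch.no_grad():
        for n, p in model.named_parameters():
            if n in masks:
                p *= masks[n]

train(model)                                     # final training of the "ticket"
```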