#AI #DailyPaper #MachineLearning

"Neural Redshift: Random Networks are not Random Functions", Teney et al.
https://arxiv.org/abs/2403.02241

Counters the notion that Neural Networks have an inherent "simplicity bias". Instead, inductive bias depends on components such as ReLUs, residual connections, and LayerNorm, which can be tuned to build architectures with a bias for any level of complexity.

#DeepLearning #ML #StatisticalLearning #NeuralNetworks

Neural Redshift: Random Networks are not Random Functions

Our understanding of the generalization capabilities of neural networks (NNs) is still incomplete. Prevailing explanations are based on implicit biases of gradient descent (GD) but they cannot account for the capabilities of models from gradient-free methods nor the simplicity bias recently observed in untrained networks. This paper seeks other sources of generalization in NNs.

Findings. To understand the inductive biases provided by architectures independently from GD, we examine untrained, random-weight networks. Even simple MLPs show strong inductive biases: uniform sampling in weight space yields a very biased distribution of functions in terms of complexity. But unlike common wisdom, NNs do not have an inherent "simplicity bias". This property depends on components such as ReLUs, residual connections, and layer normalizations. Alternative architectures can be built with a bias for any level of complexity. Transformers also inherit all these properties from their building blocks.

Implications. We provide a fresh explanation for the success of deep learning independent from gradient-based training. It points at promising avenues for controlling the solutions implemented by trained models.
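The core experiment in the abstract, sampling weights at random and inspecting the functions you get, is easy to reproduce in miniature. A minimal NumPy sketch, where the depth/width, the Gaussian He-style init, and the spectral-centroid complexity proxy are my own choices for illustration, not the paper's exact protocol:

```python
import numpy as np

def random_mlp(x, activation, depth=4, width=64, rng=None):
    """Evaluate an untrained MLP with Gaussian random weights on 1-D inputs x."""
    if rng is None:
        rng = np.random.default_rng(0)
    h = x.reshape(-1, 1)
    fan_in = 1
    for _ in range(depth):
        W = rng.normal(0.0, np.sqrt(2.0 / fan_in), size=(fan_in, width))
        b = rng.normal(0.0, 0.1, size=width)
        h = activation(h @ W + b)
        fan_in = width
    W = rng.normal(0.0, np.sqrt(2.0 / fan_in), size=(fan_in, 1))
    return (h @ W).ravel()

def mean_frequency(y):
    """Spectral centroid of |FFT|: a crude proxy for function complexity."""
    mag = np.abs(np.fft.rfft(y - y.mean()))
    freqs = np.arange(len(mag))
    return (freqs * mag).sum() / mag.sum()

# Sample many random networks per activation and compare the complexity
# of the functions they implement, before any training happens.
x = np.linspace(-1.0, 1.0, 512)
rng = np.random.default_rng(42)
for name, act in [("relu", lambda z: np.maximum(z, 0.0)), ("tanh", np.tanh)]:
    scores = [mean_frequency(random_mlp(x, act, rng=rng)) for _ in range(20)]
    print(name, np.mean(scores))
```

The point of the sketch is only that "uniform in weight space" is not "uniform in function space": the distribution of complexity scores you get is strongly shaped by the activation and architecture, which is the paper's claim.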


It always trips me up when I read “principle component analysis”, “reclusive feature elimination”, “gradual boosting”, etc. It makes me concerned about the validity of everything else.

Principal vs. principle I can understand, but reclusive? That is so far out…

#ml #datascience #statisticallearning

New #preprint: "Rules and statistics: What if it’s both? A basic computational model of statistical learning in reading acquisition", describing my first attempt at a #ComputationalModel: https://osf.io/5b76z. The model aims to explain how #StatisticalRegularities can be an integral part of orthographic systems, while little evidence points to a relationship between #StatisticalLearning ability and #reading performance. Feedback welcome! :D
I wrote a paper on #StatisticalLearning, #reading, and #dyslexia during maternity leave in 2021, and realise now I have very little recollection of what I actually wrote... https://www.mdpi.com/2076-3425/11/9/1143
Developmental Dyslexia, Reading Acquisition, and Statistical Learning: A Sceptic’s Guide

Many theories have been put forward that propose that developmental dyslexia is caused by low-level neural, cognitive, or perceptual deficits. For example, statistical learning is a cognitive mechanism that allows the learner to detect a probabilistic pattern in a stream of stimuli and to generalise the knowledge of this pattern to similar stimuli. The link between statistical learning and reading ability is indirect, with intermediate skills, such as knowledge of frequently co-occurring letters, likely being causally dependent on statistical learning skills and, in turn, causing individual variation in reading ability. We discuss theoretical issues regarding what a link between statistical learning and reading ability actually means and review the evidence for such a deficit. We then describe and simulate the “noisy chain hypothesis”, where each intermediary link between a proposed cause and the end-state of reading ability reduces the correlation coefficient between the low-level deficit and the end-state outcome of reading. We draw the following conclusions: (1) Empirically, there is evidence for a correlation between statistical learning ability and reading ability, but there is no evidence to suggest that this relationship is causal, (2) theoretically, focussing on a complete causal chain between a distal cause and developmental dyslexia, rather than the two endpoints of the distal cause and reading ability only, is necessary for understanding the underlying processes, (3) statistically, the indirect nature of the link between statistical learning and reading ability means that the magnitude of the correlation is diluted by other influencing variables, leaving most studies to date underpowered, and (4) practically, it is unclear what can be gained from invoking the concept of statistical learning in teaching children to read.
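The “noisy chain” dilution described in the abstract is easy to see numerically: for standardized Gaussian variables linked in a causal chain, the end-to-end correlation is the product of the per-link correlations, so each extra link shrinks the observable effect. A toy simulation (the link strengths here are made up for illustration, not taken from the paper):

```python
import numpy as np

def simulate_chain(n=200_000, link_r=(0.6, 0.6, 0.6), rng=None):
    """Simulate a causal chain X0 -> X1 -> ... -> Xk where consecutive
    variables correlate at link_r[i]; return the end-to-end correlation."""
    if rng is None:
        rng = np.random.default_rng(0)
    x = rng.standard_normal(n)
    start = x
    for r in link_r:
        # Each step keeps unit variance: r * parent + sqrt(1 - r^2) * noise.
        x = r * x + np.sqrt(1.0 - r**2) * rng.standard_normal(n)
    return np.corrcoef(start, x)[0, 1]

# Three moderate links of r = 0.6 leave an end-to-end correlation
# of roughly 0.6**3 = 0.216 -- small enough that typical sample
# sizes are underpowered to detect it reliably.
print(simulate_chain())
```

This is the statistical core of conclusion (3): a real distal cause can coexist with a small, hard-to-detect correlation at the endpoints.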

At last, the final (corrected) version is available.
Neat modeling of the processing of non-adjacent dependencies by Noémi Éltető and Peter Dayan. I am very grateful to be part of this project.
https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1009866
#statisticallearning #bayesian
Tracking human skill learning with a hierarchical Bayesian sequence model

Author summary: A central function of the brain is to predict. One challenge of prediction is that both external events and our own actions can depend on a variably deep temporal context of previous events or actions. For instance, in a short motor routine, like opening a door, our actions only depend on a few previous ones (e.g., push the handle if the key was turned). In longer routines such as coffee making, our actions require a deeper context (e.g., place the moka pot on the hob if coffee is ground, the pot is filled and closed, and the hob is on). We adopted a model from the natural language processing literature that matches humans’ ability to learn variable-length relationships in sequences. This model explained the gradual emergence of more complex sequence knowledge and individual differences in an experiment where humans practiced a perceptual-motor sequence over 10 weekly sessions.
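The variable-length-context idea can be illustrated with a much simpler count-based stand-in for the paper's hierarchical Bayesian model: predict the next symbol from the deepest context that has been observed often enough, backing off to shallower contexts otherwise. The back-off rule and the observation threshold below are my simplification, not the authors' model:

```python
from collections import defaultdict

class BackoffPredictor:
    """Count-based predictor over variable-length contexts: a crude
    stand-in for a hierarchical Bayesian sequence model, where deeper
    contexts are trusted only once they have enough observations."""

    def __init__(self, max_depth=3, min_count=5):
        self.max_depth = max_depth
        self.min_count = min_count
        self.counts = defaultdict(lambda: defaultdict(int))

    def update(self, seq, t):
        # Record seq[t] under every context of depth 0..max_depth.
        for d in range(self.max_depth + 1):
            if t - d >= 0:
                self.counts[tuple(seq[t - d:t])][seq[t]] += 1

    def predict(self, seq, t, symbol):
        # Back off from the deepest context with enough observations.
        for d in range(min(self.max_depth, t), -1, -1):
            ctx = tuple(seq[t - d:t])
            total = sum(self.counts[ctx].values())
            if total >= self.min_count:
                return self.counts[ctx][symbol] / total
        return 0.0  # nothing learned yet

# Online prediction of a deterministic repeating sequence: predictions
# start at chance and sharpen as deeper contexts accumulate counts.
seq = list("abcabcabc" * 20)
model = BackoffPredictor(max_depth=3)
probs = []
for t in range(len(seq)):
    probs.append(model.predict(seq, t, seq[t]))
    model.update(seq, t)
print(probs[0], probs[-1])  # 0.0 at the start, 1.0 once learned
```

The actual model in the paper infers context depth probabilistically rather than by a hard count threshold, which is what lets it capture the gradual emergence of deeper sequence knowledge across sessions.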

Linear Regression Using R: An Introduction to Data Modeling

Linear Regression Using R: An Introduction to Data Modeling presents one of the fundamental data modelling techniques in an informal

PYOFLIFE
An #introduction post: I love #Statistics and #Learning, which means that I love both #StatisticalLearning and #LearningStatistics. Currently, most of my professional focus is on #teaching #ResearchMethods and Statistics to #undergraduate #psychmajors. I also love #photography, #travel, and #food (who doesn’t), and have recently figured out a way to combine all of these loves into a three-week #StudyAbroad trip to #Japan where I get to teach a class on #PsychologyOfLanguage.

Hi all ^^

I'm a PhD candidate in #ClinicalPsych at the University of Oslo and Modum Bad Psychiatric Hospital.

Currently I am investigating mental morbidity trajectories during the COVID-19 pandemic by applying advanced statistical methods (e.g., #networks, #SEM, & #StatisticalLearning) to population-based registry resources, biobanks, and large-scale #ESM data.

Due to my background in PsychMethods, I am also interested in the estimation of network models and their #replicability.

Nice to meet everyone!