Learning without backpropagation is really taking off in 2022
First, @BAPearlmutter et al. show in "Gradients without Backpropagation" that a single forward pass, which evaluates the directional derivative along a random weight perturbation using forward-mode differentiation, is enough to compute an unbiased estimate of the gradient:
https://arxiv.org/abs/2202.08587
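To make the idea concrete, here is a minimal JAX sketch of a forward-gradient estimator. It is an illustration under assumptions, not the authors' code: the toy quadratic `loss`, the helper name `forward_gradient`, and the sample count are all made up for this example, and `jax.grad` is used only to check the estimate.

```python
import jax
import jax.numpy as jnp

def loss(w):
    # Toy objective (an assumption for illustration): a simple quadratic.
    return jnp.sum((w - 1.0) ** 2)

def forward_gradient(f, w, key):
    # Sample a random direction v ~ N(0, I) with the same shape as w.
    v = jax.random.normal(key, w.shape)
    # One forward-mode pass returns f(w) and the directional derivative grad f(w) . v.
    value, dir_deriv = jax.jvp(f, (w,), (v,))
    # Scaling v by the directional derivative gives the gradient estimate.
    return value, dir_deriv * v

w = jnp.array([0.0, 2.0, -1.0])
keys = jax.random.split(jax.random.PRNGKey(0), 20000)

# Average many single-pass estimates to check that they centre on the true gradient.
estimates = jax.vmap(lambda k: forward_gradient(loss, w, k)[1])(keys)
mean_estimate = estimates.mean(axis=0)
true_grad = jax.grad(loss)(w)  # reverse mode, used here only as a reference
```

Each individual estimate is noisy; only the average over many random directions approaches `true_grad`, which is what "unbiased" means here.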
Then, Mengye Ren et al. show in "Scaling Forward Gradient With Local Losses" that the variance of this estimate is high, but that it can be reduced by perturbing activities instead of weights (as in Fiete & Seung 2006) and, more importantly, by using many "local loss" functions:
https://arxiv.org/abs/2210.03310
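A rough JAX sketch of why perturbing activities helps, for a single layer. Everything here is an assumption made for illustration (the toy tanh layer, the squared-error loss, the shapes, and the helper names); it is not the paper's implementation and it leaves out local losses entirely.

```python
import jax
import jax.numpy as jnp

def layer_loss(w, x, y):
    # Toy one-layer loss (an assumption): squared error on tanh(W x).
    return jnp.sum((jnp.tanh(w @ x) - y) ** 2)

def weight_perturbed(w, x, y, key):
    # Noise on every weight: one random number per entry of W.
    v = jax.random.normal(key, w.shape)
    _, d = jax.jvp(lambda w_: layer_loss(w_, x, y), (w,), (v,))
    return d * v

def activity_perturbed(w, x, y, key):
    # Noise on the pre-activations only: one random number per unit.
    z = w @ x
    u = jax.random.normal(key, z.shape)
    _, d = jax.jvp(lambda z_: jnp.sum((jnp.tanh(z_) - y) ** 2), (z,), (u,))
    # For z = W x, dL/dW = (dL/dz) x^T, so the estimate of dL/dz is expanded exactly.
    return jnp.outer(d * u, x)

kw, kx, ky = jax.random.split(jax.random.PRNGKey(0), 3)
w = 0.3 * jax.random.normal(kw, (4, 16))
x = jax.random.normal(kx, (16,))
y = jax.random.normal(ky, (4,))
keys = jax.random.split(jax.random.PRNGKey(1), 5000)

est_w = jax.vmap(lambda k: weight_perturbed(w, x, y, k))(keys)
est_a = jax.vmap(lambda k: activity_perturbed(w, x, y, k))(keys)
var_w = est_w.var(axis=0).sum()  # total variance with weight perturbation
var_a = est_a.var(axis=0).sum()  # total variance with activity perturbation
```

In this toy setup the random direction has 4 dimensions for activities versus 64 for weights, so `var_a` comes out well below `var_w`.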
Then Geoff Hinton takes the "local loss" idea to another level in "The Forward-Forward Algorithm" and connects it to a ton of other ideas, e.g. neuromorphic engineering, one-shot learning, self-supervised learning, ...:
https://www.cs.toronto.edu/~hinton/FFA13.pdf
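For a feel of what a layer-local objective can look like, here is a small JAX sketch loosely based on the "goodness" idea described in the Forward-Forward paper: one layer is trained so that the sum of its squared activities is high on positive data and low on negative data. The toy data, layer size, threshold `theta`, learning rate, and step count are assumptions for illustration. `jax.grad` is applied to this single layer only, so nothing is propagated between layers.

```python
import jax
import jax.numpy as jnp

def goodness(w, x):
    # Goodness of a layer: sum of squared ReLU activities, per example.
    h = jax.nn.relu(x @ w)
    return jnp.sum(h ** 2, axis=-1)

def local_loss(w, x_pos, x_neg, theta):
    # Logistic loss: push goodness above theta on positive data, below theta on negative data.
    pos = jax.nn.softplus(theta - goodness(w, x_pos))
    neg = jax.nn.softplus(goodness(w, x_neg) - theta)
    return jnp.mean(pos) + jnp.mean(neg)

key_w, key_p, key_n = jax.random.split(jax.random.PRNGKey(0), 3)
w = 0.1 * jax.random.normal(key_w, (8, 16))
x_pos = jax.random.normal(key_p, (32, 8)) + 1.0  # toy "positive" data (assumption)
x_neg = jax.random.normal(key_n, (32, 8)) - 1.0  # toy "negative" data (assumption)
theta = 2.0

# Plain gradient descent on the layer-local loss.
step = jax.jit(lambda w_: w_ - 0.01 * jax.grad(local_loss)(w_, x_pos, x_neg, theta))

loss_before = local_loss(w, x_pos, x_neg, theta)
for _ in range(300):
    w = step(w)
loss_after = local_loss(w, x_pos, x_neg, theta)
```

Stacking several such layers, each with its own loss, is the sense in which the objective is "local".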
It looks like #MachineLearning and #Neuroscience are really converging.
Gradients without Backpropagation
Using backpropagation to compute gradients of objective functions for optimization has remained a mainstay of machine learning. Backpropagation, or reverse-mode differentiation, is a special case within the general family of automatic differentiation algorithms that also includes the forward mode. We present a method to compute gradients based solely on the directional derivative that one can compute exactly and efficiently via the forward mode. We call this formulation the forward gradient, an unbiased estimate of the gradient that can be evaluated in a single forward run of the function, entirely eliminating the need for backpropagation in gradient descent. We demonstrate forward gradient descent in a range of problems, showing substantial savings in computation and enabling training up to twice as fast in some cases.
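A short worked derivation of why such an estimate is unbiased, assuming the perturbation direction is drawn from a standard normal distribution. The symbols $\theta$ (parameters), $v$ (random direction), and $g$ (forward gradient) are notation chosen here for illustration.

```latex
\begin{align*}
g(\theta) &= \big(\nabla f(\theta) \cdot v\big)\, v, \qquad v \sim \mathcal{N}(0, I) \\
\mathbb{E}\big[g(\theta)\big]
  &= \mathbb{E}\big[v v^{\top}\big]\, \nabla f(\theta)
   = I\, \nabla f(\theta)
   = \nabla f(\theta)
\end{align*}
```

The first line needs only the directional derivative $\nabla f(\theta) \cdot v$, which forward mode provides in one pass; the second line uses $\mathbb{E}[v v^{\top}] = I$ for a standard normal $v$.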