Andrew Geng

22 Followers
29 Following
179 Posts
Math artist, composer, code monkey.
Website: https://pteromys.melonisland.net
GitHub: https://github.com/pteromys

What's SELU, you might ask? A trick to make deeper neural networks self-normalizing—

As far as I've understood it, the idea seems to be that repeatedly passing a column vector through alternating steps of:

1. multiply by a matrix whose rows are nearly mean=0, variance=1
2. pass each entry through an "activation function" with the right shape

...draws the (mean, variance) of its entries toward a fixed point, which gives you some hope of skipping batch normalization even as you add more layers.
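The two-step loop above fits in a few lines of numpy. This is my own sketch, not the demo's code; the row normalization (each row mean 0, sum of squares 1) follows the weight constraint in the paper, and the SELU constants are the published ones.

```python
import numpy as np

# SELU constants from Klambauer et al. 2017.
SCALE, ALPHA = 1.0507009873554805, 1.6732632423543772

def selu(x):
    return SCALE * np.where(x > 0, x, ALPHA * np.expm1(x))

rng = np.random.default_rng(0)
n = 256
# Start far from the fixed point: mean 3, standard deviation 4.
v = rng.normal(3.0, 4.0, size=n)
for _ in range(20):
    W = rng.normal(size=(n, n))
    W -= W.mean(axis=1, keepdims=True)               # rows: mean 0
    W /= np.linalg.norm(W, axis=1, keepdims=True)    # rows: sum of squares 1
    v = selu(W @ v)                                  # step 1 then step 2

print(v.mean(), v.var())  # should have drifted toward roughly (0, 1)
```

The row constraint is what makes step 1 harmless: a mean-0 row cancels the input's mean, and a unit-norm row passes the input's variance through unchanged, so SELU's contraction toward (0, 1) does all the work.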

Want to see a demo of SELU (Klambauer+Unterthiner+Mayr+Hochreiter 2017)? I just added it to https://pteromys.melonisland.net/neuralnets/

1. Click into the 2D tab, hit the reroll button, and then run it for about 2000 steps just to see what the default settings do.
2. Change tanh to selu, enter 8,8,8,8,8,8,8,8,8,8,8,8 in the next box, then reroll and run for 500 steps.
3. Switch back to tanh and see how long it takes to get the same level of detail.
4. Go for a walk—watching too long gives you motion sickness.

dook is now installable using `uv` from https://pypi.org/project/dook/ !

The biggest changes are a new config format (if I broke your config, sorry—post an issue on my github if you have trouble migrating), bash and zsh completions, and tons of languages.

The silliest change is the `-i` flag for case-insensitive search, which attempts some questionable transformations if you specify it twice.
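I won't catalog here exactly which transformations the doubled flag tries, but Unicode case folding is the classic example of why plain lowercasing isn't enough for case-insensitive matching; in Python terms:

```python
# Plain lowercasing misses equivalences that full case folding catches;
# this is the flavor of "questionable transformation" a doubled flag
# might opt into.
print("Straße".lower() == "strasse".lower())        # False: "straße" != "strasse"
print("Straße".casefold() == "strasse".casefold())  # True: both fold to "strasse"
```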

Their paper includes some other neat stuff, including this graphic showing how rapidly different Deep Learning methods create dormant ReLUs.

(Direct link: https://www.nature.com/articles/s41586-024-07711-7/figures/8 )

Extended Data Fig. 4: Existing deep-learning methods on Online Permuted MNIST.

I was curious about this Dohare et al. paper on mitigating loss of plasticity in neural networks by rerolling less-important cells:

https://www.nature.com/articles/s41586-024-07711-7 (previously https://arxiv.org/abs/2306.13812 )

...so I added something inspired by it to my neural networks toy:

https://pteromys.melonisland.net/neuralnets/

Qualitatively, it doesn't seem to do much for the tiny networks and short training times in the toy, though debug logging tells me it's successfully finding dead ReLUs.
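For a sense of what "finding dead ReLUs" means: a unit is dormant when it outputs zero on an entire batch. A minimal numpy sketch of that check (my own illustration; the paper's continual backpropagation actually ranks units by a utility measure before rerolling them):

```python
import numpy as np

rng = np.random.default_rng(1)

# A toy hidden layer where every 4th unit has been pushed into the dead
# zone by a large negative bias, so its ReLU outputs 0 for every input.
W = rng.normal(size=(8, 16))
b = np.where(np.arange(16) % 4 == 0, -50.0, 0.0)

x = rng.normal(size=(128, 8))   # a batch of inputs
h = np.maximum(0, x @ W + b)    # ReLU activations, shape (128, 16)

# A unit is dormant if it never fires anywhere in the batch.
dormant = h.max(axis=0) == 0
print(dormant.sum())  # prints 4
```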

Loss of plasticity in deep continual learning - Nature

The pervasive problem of artificial neural networks losing plasticity in continual-learning settings is demonstrated and a simple solution called the continual backpropagation algorithm is described to prevent this issue.


dook has a mascot now!

Also I apparently need to update the js patterns to catch some common prototype manipulations. Some other evening, probably.

New tool! Working in a large codebase and need to remind yourself of some function's argument order but don't want to go find the file it's in?

https://github.com/pteromys/def searches your code for a definition. Basically, Symbex but for languages besides python, or cq but for more files at once with less control.

Folks here on Fedi, two requests: (1) what's a good crate name? and (2) any language you wish I supported? (Currently: c, c++, go, python, rust, js, ts, and tsx.)

GitHub - pteromys/def: Find code definitions.

Here are photos from my folks showing the gadget in action. Apparently some quick thinker got a photo of the corona projection! I only wish I'd been as quick and thought to fix the focus. Whoops.
Today I'm thankful I used an improvised eclipse device instead of my real telescope. It had enough resolution to see that big sunspot today, so there will be future iterations targeting transits of Mercury. But I'll uh think a little harder about where I put flammable surfaces next time around, especially when the aim is off.

New toy! Use a small neural net to approximate an image or JS function. Also a quick way to check toy examples for vanishing gradients and other papercuts. Try it here:

https://pteromys.melonisland.net/neuralnets/

(To be clear, this is no image generator—only a tiny network, memorizing as much of its single input image as it can.)

Kind of #ExplorableExplanation except I didn't do much explaining; sorry.

Partly inspired by https://playground.tensorflow.org/ which is also educational about the inner workings of neural nets.
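The core idea, a tiny network memorizing one fixed function, fits in a page of numpy. This is a sketch of the concept rather than the toy's actual code: a one-hidden-layer tanh net trained by hand-written backprop on a 1D sine standing in for the image.

```python
import numpy as np

rng = np.random.default_rng(0)

# Target: one fixed function (the stand-in for "a single input image").
x = np.linspace(-1, 1, 64)[:, None]
y = np.sin(3 * x)

# One hidden layer of 16 tanh units.
W1 = rng.normal(size=(1, 16)); b1 = np.zeros(16)
W2 = rng.normal(size=(16, 1)); b2 = np.zeros(1)

lr = 0.05
losses = []
for step in range(2000):
    # Forward pass.
    h = np.tanh(x @ W1 + b1)
    pred = h @ W2 + b2
    err = pred - y
    losses.append(float((err ** 2).mean()))
    # Backprop by hand: mean-squared-error gradient through each layer.
    g_pred = 2 * err / len(x)
    g_W2 = h.T @ g_pred; g_b2 = g_pred.sum(0)
    g_z = (g_pred @ W2.T) * (1 - h ** 2)   # tanh' = 1 - tanh^2
    g_W1 = x.T @ g_z; g_b1 = g_z.sum(0)
    W2 -= lr * g_W2; b2 -= lr * g_b2
    W1 -= lr * g_W1; b1 -= lr * g_b1

print(losses[0], losses[-1])  # the loss should drop substantially
```

Swap the 1D sine for pixel values sampled at (x, y) coordinates and you have the image-fitting version; watching that loss curve is also where vanishing gradients and similar papercuts show up.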
