New piece: phase transitions in neural network training.

Double descent and grokking aren't quirks — they're evidence that the interesting dynamics happen *after* you cross a phase boundary.

Classical ML intuition was built for models that never get there.

The loss curve is the standard view into a training run. It goes down (good) or stops going down...

DEV Community