The Gradient Jul 11, 2023

Super pumped that @thegradient has published "Interpretability Creationism", my manifesto for considering the training process in AI interpretability research!

https://thegradient.pub/interpretability-creationism

Interpretability Creationism

On “interpretability creationism” – interpretability methods that only look at the final state of the model and ignore its evolution over the course of training

The Gradient

David Ruffner Jul 12, 2023

@nsaphra @thegradient

I really like the idea of observing what happens throughout training. I've done some of that when training object detectors and it was helpful. And it is fun too! It is really cool to see how the system learns.

A.V. Jul 12, 2023

@nsaphra well written.