I drafted an implementation of Cyclical SGLD using Blackjax and Optax.

As you can see 👇 Cyclical SGLD, which alternates between exploration and sampling phases, does much better on multi-modal targets than vanilla SGLD. Next step: CIFAR-10 with a Bayesian ResNet-18.

https://www.thetypicalset.com/blog/cyclical_sgld.html

Cyclical SGLD in Blackjax
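For anyone curious, the core of cyclical SGLD fits in a few lines. This is my own minimal numpy sketch on a toy two-mode target, not the Blackjax/Optax code from the post; the cosine step-size schedule follows Zhang et al. (2020).

```python
import numpy as np

def grad_logdensity(x):
    """Gradient of log p for an equal mixture of N(+2, I) and N(-2, I)."""
    a = -0.5 * np.sum((x - 2.0) ** 2)
    b = -0.5 * np.sum((x + 2.0) ** 2)
    wa = 1.0 / (1.0 + np.exp(b - a))  # responsibility of the first mode
    return wa * (2.0 - x) + (1.0 - wa) * (-2.0 - x)

def cyclical_step_size(step, steps_per_cycle, max_step_size):
    """Cosine schedule: each cycle starts large (exploration)
    and decays towards zero (sampling)."""
    r = (step % steps_per_cycle) / steps_per_cycle
    return 0.5 * max_step_size * (np.cos(np.pi * r) + 1.0)

def sgld_step(rng, x, step_size):
    """One Langevin step: follow the gradient, inject Gaussian noise."""
    noise = rng.standard_normal(x.shape)
    return x + step_size * grad_logdensity(x) + np.sqrt(2.0 * step_size) * noise

rng = np.random.default_rng(0)
x = np.zeros(2)
samples = []
for step in range(2000):
    eps = cyclical_step_size(step, steps_per_cycle=500, max_step_size=0.05)
    x = sgld_step(rng, x, eps)
    samples.append(x.copy())
```

The restarts at the top of each cycle are what let the chain hop between modes that a single small-step SGLD run would rarely cross.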

@remilouf I haven't used SGLD before. It's attractive that it targets the posterior, but I wonder how that's used in practice. e.g. training a neural net, would one save the weights at regular intervals and then evaluate an ensemble of nets on an input to get a sample of outputs?

@sethaxen That’s my understanding of how people do it. We have an example classifier that we train on MNIST in the Blackjax documentation: https://blackjax-devs.github.io/blackjax/examples/SGMCMC.html

(currently broken because of a timeout in CI, need to fix it)

MNIST Digit Recognition With a 3-Layer Perceptron — Blackjax
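To make the "ensemble of snapshots" idea concrete: a hedged sketch, not the notebook's code. `predict` stands in for any network's forward pass, and the random matrices stand in for weights saved every k SGLD steps.

```python
import numpy as np

def predict(weights, x):
    """Toy 'network': a single linear layer followed by softmax."""
    logits = x @ weights
    e = np.exp(logits - logits.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def ensemble_predict(snapshots, x):
    """Bayesian model average: mean of the per-snapshot predictive
    distributions, one snapshot per saved SGLD iterate."""
    probs = np.stack([predict(w, x) for w in snapshots])
    return probs.mean(axis=0)

rng = np.random.default_rng(0)
snapshots = [rng.normal(size=(3, 2)) for _ in range(10)]  # saved every k steps
x = rng.normal(size=(5, 3))
p = ensemble_predict(snapshots, x)  # rows are proper probability vectors
```

Averaging the probabilities (rather than the weights) is the Monte Carlo estimate of the posterior predictive, which is exactly what the "sample of outputs" in the question gives you.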

@sethaxen One day someone asked me why you needed a Bayesian net to estimate classification uncertainty, so I took some time to think about it: https://github.com/rlouf/ama/discussions/4
Why do I need a bayesian neural net to estimate classification uncertainty? · Discussion #4 · rlouf/ama

I’ve a neural net and want to measure its uncertainty on classification. Currently I just use the probability of the top class as a proxy; how would a Bayesian neural net change that?

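One concrete way to see what the ensemble buys you over the top-class probability: it lets you separate "every member agrees on a soft label" from "the members disagree". A small illustrative example with made-up numbers (five ensemble members, three classes):

```python
import numpy as np

confident = np.array([[0.9, 0.05, 0.05]] * 5)   # members agree
conflicted = np.array([[0.9, 0.05, 0.05],
                       [0.05, 0.9, 0.05],
                       [0.05, 0.05, 0.9],
                       [0.9, 0.05, 0.05],
                       [0.05, 0.9, 0.05]])      # members disagree

def top_class_prob(member_probs):
    """Single-number proxy: max probability of the averaged prediction."""
    return member_probs.mean(axis=0).max()

def mutual_information(member_probs):
    """Epistemic uncertainty: entropy of the mean prediction minus the
    mean entropy of the members. Zero when all members agree, positive
    when they disagree."""
    mean = member_probs.mean(axis=0)
    h_mean = -(mean * np.log(mean)).sum()
    h_members = -(member_probs * np.log(member_probs)).sum(axis=1).mean()
    return h_mean - h_members
```

A single softmax collapses both cases into one probability vector; the mutual information only exists once you have posterior samples to disagree with each other.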

@sethaxen @larryshamalama Unfortunately, in Bayesian NNs uncertainty increases as you move further away from the decision boundary, not from the training data distribution, which is usually the type of uncertainty people are expecting.

@twiecki @remilouf @sethaxen @larryshamalama under what conditions does this happen? Interested to learn about this if you can point me at papers?
@Sdatkinson @remilouf @sethaxen @larryshamalama This happens under all conditions; it's a property of the model. It only knows about the parameters that describe the hyperplane, not the data-generating process. You can see the effect clearly here: https://twiecki.io/blog/2016/06/01/bayesian-deep-learning/#Uncertainty-in-predicted-value Why is there no uncertainty near the edges?
Bayesian Deep Learning — While My MCMC Gently Samples
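The effect being discussed can be reproduced with a toy 1-D Bayesian logistic regression. This is my own illustration, with a made-up posterior over slope and bias, not the model from the linked post:

```python
import numpy as np

rng = np.random.default_rng(0)
w = rng.normal(loc=2.0, scale=0.5, size=1000)  # hypothetical posterior: slope
b = rng.normal(loc=0.0, scale=0.5, size=1000)  # hypothetical posterior: bias

def predictive(x):
    """Mean and std of the class probability over posterior samples."""
    p = 1.0 / (1.0 + np.exp(-(w * x + b)))
    return p.mean(), p.std()

# Near the boundary (x ~ 0) the posterior samples disagree, so the
# predictive std is large. At x = 10, far outside any plausible data,
# every sample saturates at 1, so the model looks (misleadingly) certain.
```

Because every posterior sample of the sigmoid saturates far from the hyperplane, predictive spread shrinks with distance from the boundary regardless of how far the query point is from the training data.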

@twiecki @remilouf @sethaxen @larryshamalama Thanks! I'll check this out.

@twiecki @remilouf @sethaxen @larryshamalama

Hmm, perhaps a typo? Uncertainty seems _highest_ at the decision boundary in the example and _decreases_ as you move away from it.