Things are coming together for #ArviZ's InferenceData (https://github.com/arviz-devs/InferenceObjects.jl) to be a supported output type for #Turing and #JuliaLang's #Stan interface, similar to how it already is for #PyMC.

For details, see https://github.com/TuringLang/MCMCChains.jl/issues/381 and https://github.com/StanJulia/StanSample.jl/issues/60

#statistics #mcmc_stan #bayesian

This has some really nice benefits for #Bayesian folks in #JuliaLang.

First, InferenceData is just more useful than MCMCChains.Chains because it contains more data, preserves the array structure of the draws, and integrates better with the ecosystem thanks to DimensionalData. It also follows a multi-language spec, so it's great for long-term storage and communication of inference results.

@sethaxen honestly this is the big benefit for me -- MCMCChains throws out so much information when you make the chain and it angers me. Of course, it's largely my fault, but I'm glad you're doing something about it.
@cameron One downside to InferenceData is that everything still needs to become a numeric array at some point (whether for plotting, diagnostics, statistics, or serialization). So even once we do have support for sampling Cholesky factors in Turing, they may not fit nicely into an InferenceData-based workflow without some loss of structure. But right now this is the exception, not the norm, and we have ideas we can try, so either way this is progress.
@sethaxen We were dealing with this problem too -- the numeric type thing is a massive pain in the ass. I think we discussed using tuples or vectors-of-vectors, but never really got around to implementing it.
@cameron I think what the TransformVariables family of packages and @cscherrer's SampleChains return (a NamedTuple of vectors, a vector of NamedTuples, or something that behaves like one of those) is the right thing for Julia types and for downstream tasks like conditioning or resampling. But these are not the best formats for other analyses like computing ESS or plotting, where you need real marginals. So I suspect both formats are needed, or ways to trivially interconvert.
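The two formats and the interconversion can be sketched like this (stdlib Python as a language-neutral stand-in, with dicts playing the role of Julia NamedTuples; all names here are illustrative, not any package's actual API):

```python
def to_columns(rows):
    """Vector-of-namespaces -> namespace-of-vectors (the per-marginal view)."""
    keys = rows[0].keys()
    return {k: [row[k] for row in rows] for k in keys}

def to_rows(cols):
    """Namespace-of-vectors -> vector-of-namespaces (the per-draw view,
    handy for conditioning or resampling)."""
    n = len(next(iter(cols.values())))
    return [{k: v[i] for k, v in cols.items()} for i in range(n)]

draws = [{"mu": 0.1, "sigma": 1.2}, {"mu": -0.3, "sigma": 0.9}]
cols = to_columns(draws)   # {"mu": [0.1, -0.3], "sigma": [1.2, 0.9]}
assert to_rows(cols) == draws  # round-trip is lossless
```

The point is that the conversion is mechanical in both directions, so in principle a library could pick one canonical format and derive the other on demand.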

@sethaxen @cameron I still don't understand these points.
- computing ESS
- plotting
- "real marginals"

I don't see obstacles to the first two, and I'm not sure what you mean by the third. Is there an example somewhere of something that's hard to do?

@cscherrer @cameron For ESS computation, you consider a single marginal at a time. So e.g. if your parameter is complex, you need to consider the real and imaginary parts separately. If your parameter is a struct, you need to break it down into all of its real components. Then, for each marginal, you effectively need a matrix with dimensions (draws, chains). This really isn't compatible with a struct representation, so one needs a way to interconvert.
@sethaxen @cameron Since this is recursive and turns each float into another float, couldn't this just use the same approach Zygote does for gradients? So just exactly the same structure as the input?
@cscherrer @cameron Close! It's really similar to what FiniteDifferences.to_vec does, except to_vec will discard anything discrete, and we can stop flattening at an array of reals. But I think basing something off of to_vec could provide a useful default. One catch is that we'd want to preserve some information in the dims/coords of the flattened object so that the user still knows what the marginals represent.
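A toy version of that name-preserving flattener, in stdlib Python (this is a hypothetical sketch, not how to_vec is actually implemented; dicts stand in for structs and the label scheme is made up):

```python
def flatten_param(name, value):
    """Yield (label, float) pairs for one parameter value, recursively,
    so every scalar marginal keeps a human-readable name."""
    if isinstance(value, complex):
        # a complex parameter becomes two real marginals
        yield f"{name}.re", value.real
        yield f"{name}.im", value.imag
    elif isinstance(value, (list, tuple)):
        for i, v in enumerate(value):
            yield from flatten_param(f"{name}[{i}]", v)
    elif isinstance(value, dict):  # stand-in for a struct's fields
        for field, v in value.items():
            yield from flatten_param(f"{name}.{field}", v)
    else:
        yield name, float(value)

draw = {"z": 1 + 2j, "theta": [0.5, -0.5]}
flat = [pair for k, v in draw.items() for pair in flatten_param(k, v)]
# [("z.re", 1.0), ("z.im", 2.0), ("theta[0]", 0.5), ("theta[1]", -0.5)]
```

The labels are exactly the kind of thing that could live in the dims/coords of the flattened object, so a (draws, chains) matrix per marginal still tells you what it represents.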
@sethaxen @cameron No, I mean... why flatten at all? We have an array of structs (maybe represented as a struct of arrays), and we want to end up with a single struct.
@cscherrer @cameron This would be one approach, yes. It's in principle similar to what ChainRulesCore.Tangent is doing. But Tangent works because we **know** that the only supported operations on tangent vectors are addition and scalar multiplication. But ESS may compute an FFT on the marginals. And outputs may not necessarily be a single real value for each marginal. One might want a covariance. So I'm not convinced we can get away with never flattening.

@sethaxen @cameron Ok, something like covariance of the posterior makes sense, as does corner plot. But these still seem like the exception, not really central enough to build an entire ecosystem around.

What we really have in a posterior sample is an array of namespaces. Each single sample represents a collection of assignments of variables to values.

@cscherrer @cameron My point is not that we should build an ecosystem around a single way of constructing a Table. There are cases where the widest possible and tallest possible Table and anything in between are useful. I advocate for having a useful default and utilities for getting different types of Tables.
@cscherrer @cameron Most PPLs limit users to variables that are purely arrays of numbers, even sometimes just real numbers, and most users won't find this limiting, which makes life a lot easier for the PPL devs. In #JuliaLang we are greedy and want arbitrary Julia objects to be supported, which definitely causes interesting design challenges. I maintain though that the most useful format for draws at some point will usually be a collection of numeric arrays.
@cscherrer @cameron So when it comes to tooling for downstream analyses, I prefer to focus on that most common case while trying to provide utilities for those who want to use more complicated data structures.
@cscherrer @cameron I don't know if mastodon is the right venue for these types of conversations, but this has been fun! 😁
@sethaxen @cameron Yeah maybe not, sorry I get carried away with this stuff 😅