Things are coming together for #ArviZ's InferenceData (https://github.com/arviz-devs/InferenceObjects.jl) to become a supported output type for #Turing and #JuliaLang's #Stan interface, just as it is for #PyMC.

For details, see https://github.com/TuringLang/MCMCChains.jl/issues/381 and https://github.com/StanJulia/StanSample.jl/issues/60

#statistics #mcmc_stan #bayesian

This has some really nice benefits for #Bayesian folks in #JuliaLang.

First, InferenceData is just more useful than MCMCChains.Chains because it contains more data, preserves the array structure of the draws, and integrates better with the ecosystem thanks to DimensionalData. It also follows a multi-language spec, so it's great for long-term storage and communication of inference results.
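For instance, constructing one directly might look like this (a sketch using InferenceObjects.jl's from_namedtuple; I'm assuming its default (draws, chains, ...) dimension ordering):

```julia
using InferenceObjects  # https://github.com/arviz-devs/InferenceObjects.jl

# Arrays have shape (ndraws, nchains, param_dims...); the (2, 3) shape of
# `theta` is kept intact instead of being flattened to 6 scalar columns.
ndraws, nchains = 100, 4
idata = from_namedtuple((
    mu = randn(ndraws, nchains),
    theta = randn(ndraws, nchains, 2, 3),
))
idata.posterior  # a Dataset of DimensionalData arrays with named dims/coords
```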

Second, since every PPL has its trade-offs, this lets users pick the best PPL for a given model with no changes to their downstream inference pipeline. While ArviZ.jl already supports this, direct support in these PPLs makes it even easier.

Finally, part of this effort involves moving this integration code out of #ArviZ.jl, which still has #Python dependencies, and into pure Julia packages, so users get all of this with the convenience of #JuliaLang's package management.

While Python interop in Julia usually works quite well, sometimes the Python environment gets into a broken state, which blocks users from using ArviZ.jl 😞 , so moving this code to pure Julia packages supports more users.

@sethaxen honestly this is the big benefit for me -- MCMCChains throws out so much information when you make the chain and it angers me. Of course, it's largely my fault, but I'm glad you're doing something about it.
@cameron One downside to InferenceData is that everything still needs to be a numeric array at some point (whether for plotting, diagnostics, statistics, or serialization), so even when we do have support for sampling Cholesky factors in Turing, they may not fit nicely into an InferenceData-based workflow without some loss of structure. But right now this is the exception, not the norm, and we have ideas we can try, so either way this is progress.
@sethaxen We were dealing with this problem too -- the numeric type thing is a massive pain in the ass. I think we discussed using tuples or vectors-of-vectors, but never really got around to implementing it.
@cameron I think what the TransformVariables family of packages and @cscherrer's SampleChains return (either a NamedTuple of vectors or a vector of NamedTuples, or something that looks like one of those) is the right thing for Julia types and for downstream tasks like conditioning or resampling. But these are not the best formats for other analyses like computing ESS or plotting, where you need real marginals. So I suspect both formats are needed, or ways to trivially interconvert.
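To make the two layouts concrete, here's a base-Julia sketch (invert here is an illustrative helper, not any package's API):

```julia
# Vector-of-NamedTuples: one entry per draw; natural for conditioning/resampling.
draws = [(mu = randn(), sigma = rand()) for _ in 1:100]

# NamedTuple-of-vectors: one array per parameter; natural for ESS, plotting, etc.
function invert(xs::Vector{<:NamedTuple})
    ks = keys(first(xs))
    return NamedTuple{ks}(map(k -> [x[k] for x in xs], ks))
end

byparam = invert(draws)  # (mu = [...], sigma = [...])

# And back again, one NamedTuple per draw:
roundtrip = [NamedTuple{keys(byparam)}(map(v -> v[i], values(byparam)))
             for i in 1:length(draws)]
```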
@cameron @cscherrer That's where utilities like this can help bridge the gap: https://github.com/arviz-devs/InferenceObjects.jl/issues/27
@cameron @cscherrer An example: what if Turing defined a "flattening" operation for Cholesky and its inverse (e.g. Matrix and cholesky)? The flattening is called when constructing the InferenceData, and the inverse is added to the metadata; the inverse can then be used with this utility to get back exactly what Turing needs, e.g. for conditioning. The one caveat is that as soon as you serialize to a format like netCDF, the inverse function is discarded.
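A sketch of that round trip using only the standard library (the metadata plumbing itself is hypothetical):

```julia
using LinearAlgebra

# One posterior draw that is a Cholesky object (as Turing might sample)
A = [4.0 2.0; 2.0 3.0]
C = cholesky(Symmetric(A))

# "Flatten" to a plain numeric array for storage: Matrix(C) rebuilds A = L*L'
flat = Matrix(C)

# The inverse map that would be stashed in metadata: applying it to the stored
# array recovers a Cholesky object that conditioning code could use directly.
unflatten = B -> cholesky(Symmetric(B))
C2 = unflatten(flat)
C2.L ≈ C.L  # round trip preserves the factor (up to floating point)
```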

@sethaxen @cameron I still don't understand these points.
- computing ESS
- plotting
- "real marginals"

I don't see obstacles to the first two, and I'm not sure what you mean by the third. Is there an example somewhere of something that's hard to do?

@cscherrer I think it's hard to do in a "dumb" way because you have to be aware of the underlying data structure or (generally) provide some mapping function. You're right that it's not difficult, but it's an additional layer of complexity that can be hard to do well in general.

@cameron Sorry, I still don't get it. Any variable in the posterior is just a value. It's passed around as a value, and called in a very natural way. How does that make anything hard?

Storing everything as a flattened array seems a lot harder to me.

@cameron The one case I *can* see as being weird is if you have functions that require an array. But this is exactly what comes up in HMC, and TransformVariables seems to solve the problem just fine
@cscherrer @cameron For ESS computation, you consider a single marginal at a time. So e.g. if your parameter is complex, you need to separately consider the real and imaginary parts. If your parameter is a struct, you need to split it into all of its real parts. Then you effectively need a matrix with dimensions (draws, chains). This really isn't compatible with a struct representation, so one needs a way to interconvert.
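For example, a complex-valued parameter becomes two (draws, chains) matrices before any diagnostics run (base-Julia sketch; the actual ESS computation would come from a package like MCMCDiagnosticTools):

```julia
ndraws, nchains = 100, 4
z = randn(ComplexF64, ndraws, nchains)  # draws of one complex parameter

# ESS/R-hat work on real scalar marginals, so split into two (draws, chains)
# matrices; each has exactly the shape diagnostic implementations expect.
marginals = (z_re = real.(z), z_im = imag.(z))
size(marginals.z_re)  # (100, 4)
```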
@sethaxen @cameron Since this is recursive and turns each float into another float, couldn't this just use the same approach Zygote does for gradients? So just exactly the same structure as the input?
@cscherrer @cameron Close! It's really similar to what FiniteDifferences.to_vec does, except to_vec will discard anything discrete, and we can stop flattening at an array of reals. But I think basing something off of to_vec could provide a useful default. One catch is that we would want to preserve in the dims/coords of the flattened object some information so that the user could still know what the marginals represent.
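A minimal sketch of what such a default could look like for NamedTuples of real arrays (illustrative only; FiniteDifferences.to_vec handles far more types):

```julia
# Flatten a NamedTuple of real arrays into one vector, plus a closure to
# rebuild the original structure from any vector of the same length.
function flatten(nt::NamedTuple)
    ks, vals = keys(nt), values(nt)
    sizes = map(size, vals)
    offsets = cumsum([0; collect(map(length, vals))])
    v = reduce(vcat, map(vec, vals))
    unflatten(w) = NamedTuple{ks}(ntuple(
        i -> reshape(w[offsets[i]+1:offsets[i+1]], sizes[i]), length(ks)))
    return v, unflatten
end

nt = (mu = [1.0, 2.0], L = [1.0 0.0; 0.5 2.0])
v, unflatten = flatten(nt)
v             # [1.0, 2.0, 1.0, 0.5, 0.0, 2.0] (column-major)
unflatten(v)  # recovers the original NamedTuple of arrays
```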
@sethaxen @cameron No, I mean... why flatten at all? We have an array of structs (maybe represented as a struct of arrays), and we want to end up with a single struct.
@cscherrer @cameron This would be one approach, yes. It's in principle similar to what ChainRulesCore.Tangent is doing. But Tangent works because we **know** that the only supported operations on tangent vectors are addition and scalar multiplication. But ESS may compute an FFT on the marginals. And outputs may not necessarily be a single real value for each marginal. One might want a covariance. So I'm not convinced we can get away with never flattening.

@sethaxen @cameron Ok, something like covariance of the posterior makes sense, as does corner plot. But these still seem like the exception, not really central enough to build an entire ecosystem around.

What we really have in a posterior sample is an array of namespaces. Each single sample represents a collection of assignments of variables to values.

@cscherrer @cameron My point is not that we should build an ecosystem around a single way of constructing a Table. There are cases where the widest possible and tallest possible Table and anything in between are useful. I advocate for having a useful default and utilities for getting different types of Tables.
@cscherrer @cameron Most PPLs limit users to variables that are purely arrays of numbers, even sometimes just real numbers, and most users won't find this limiting, which makes life a lot easier for the PPL devs. In #JuliaLang we are greedy and want arbitrary Julia objects to be supported, which definitely causes interesting design challenges. I maintain though that the most useful format for draws at some point will usually be a collection of numeric arrays.
@cscherrer @cameron So when it comes to tooling for downstream analyses, I prefer to focus on that most common case while trying to provide utilities for those who want to use more complicated data structures.
@cscherrer @cameron I don't know if mastodon is the right venue for these types of conversations, but this has been fun! 😁
@sethaxen @cameron Yeah maybe not, sorry I get carried away with this stuff 😅

@sethaxen @cscherrer @cameron

One thing I've learned from this is that Mastodon doesn't seem to have any way to filter out replies to a post I don't care about.

E.g. if I want to follow the three of you, but don't want this conversation to dominate my entire timeline, it seems I have no option other than manually selecting every post and filtering it. Definitely not ideal

@mprotter @sethaxen @cameron Ouch, yeah that's annoying
@cscherrer @mprotter @cameron Oof! Sorry about that! 😅

@sethaxen @cscherrer @cameron No, don't apologize! Mastodon *should* be a place where conversations like this can happen IMO. It just seems like a feature is missing here.

I've brought this up on hopefully the right Github issue: https://github.com/mastodon/mastodon/issues/18955#issuecomment-1301228826

@cscherrer @mprotter @cameron I wonder if the first replies were set to "Unlisted" privacy level if that would propagate to all subsequent replies. That would prevent polluting the timeline. I'm trying it with this post.

@mprotter @sethaxen @cscherrer @cameron On the desktop site there is an option to hide replies.

PS: I'm on fosstodon instance

@sethaxen @cameron In most cases, yes. But it's funny, I'd expect that non-arrays are especially important in nonparametric modeling, so it's strange to me that Turing is so array-focused. @cameron_pfiffer do you know how that works?

More generally, we should be able to have posteriors over arrays, but also over trees, functions, or anything else we can represent. No need to restrict to what C-based PPLs can do :)

@cscherrer @cameron @cameron_pfiffer Perhaps my array-focus is because I'm focused on common analyses for the Bayesian workflow, and I'm not certain if/how these can be performed for arbitrary draws, e.g. ESS/R-hat for non-parametric models. Similarly, you could have a sample of tree topologies, but I wouldn't know how to compute many of the useful diagnostics or construct many of the plots common to Bayesian workflow for such draws.
@cscherrer @cameron Plotting is more complicated because what you usually want when interacting with a plotting package is a tabular structure, but there are different ways to tabularize the data depending on the plot. A struct representation may work sometimes here, and so may a flattened representation, but you may want utilities to expressively construct different tabular structures.
@sethaxen @cameron Is this inherent in the problem, or just the way things are currently set up, because that's what a lot of PPLs do?
@cscherrer @sethaxen I think it's more general -- numeric computing writ large is more array-focused because that's how it's always been. Lots of tools/support/intuition for array structures.
@cameron @cscherrer Mm, I think this is more about how we can empower the user to concisely create bespoke plots without a lot of wrangling. Plots generally want real arrays. The Tables interface is very powerful for creating these arrays, so if we have a small set of utilities for transforming the draws into different tabular structures, then it can be very helpful.
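As a sketch of one such utility, here's a (draws, chains) array reshaped into a long row table (names illustrative, not any specific package's API):

```julia
ndraws, nchains = 100, 4
mu = randn(ndraws, nchains)

# Long format: one row per (chain, draw). A Vector of NamedTuples is already
# a valid Tables.jl row table, so it can feed plotting packages directly.
rows = [(chain = c, draw = d, mu = mu[d, c])
        for c in 1:nchains for d in 1:ndraws]
length(rows)  # 400
```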

@cameron @sethaxen Sure, but when we have lots of variables, we usually pass them around as variables, each of which might be array-like in some way. These can be structured, say a lower-triangular matrix, or something like a Dirichlet draw that lives on a lower-dimensional space.

But this "one big array" approach seems a lot more recent, and I still don't see the benefit

@cscherrer @cameron "One big array" has its uses. e.g. for constructing a corner plot with all marginals. But in my experience, it's rarely the approach I need.
@sethaxen great work! I've been thinking of moving to Julia, especially because I like doing simulations with Bayesian inference. I'm following you in case you share some applied examples some day. Thanks for sharing!

@estebanmonte You're welcome! Most of my current focus is on developing the tooling in and around Julia's PPLs rather than building models myself, but yeah, some applied examples will show up here. Happy to answer any questions you have about #JuliaLang's PPLs or the ecosystem in general!

When you say simulations with Bayesian inference, do you mean simulating from Bayesian models (e.g. predictive draws) or inferring parameters of simulators?

@sethaxen No, I create a bunch of statistical models representing different scenarios (e.g. model misfit) and check how the models behave under ideal and not-so-ideal scenarios. I test this using Bayesian inference and sometimes compare it to Maximum Likelihood. Nothing fancy here.

@estebanmonte This is a repo of #JuliaLang code for examples from @rlmcelreath 's Statistical Rethinking textbook

https://shmuma.github.io/rethinking-2ed-julia/

@athulsudheesh @rlmcelreath thanks!! This is very useful. I used that book a long time ago; I didn't know there was Julia code for it.