ICYMI this paper really is a must read: https://doi.org/10.1111/ele.14033

Comparing models and choosing the best one based on AIC etc., then interpreting its coefficients causally (what's the effect of X on Y?), is flawed, yet so common

We must state our causal assumptions first (as a DAG)

"Model selection is not a valid method for inferring causal relationships. It's appropriate for predictive inference (which model best predicts Y?), which is fundamentally distinct from causal inference (what is the effect of X on Y?)"

1/

Imagine we want to assess the effect of 'Forestry' on 'Species Y'. But we know other things may also affect Y

We could put all these variables in a regression model (what R. McElreath calls a causal salad), or build models w/ different subsets of predictors and compare them.

That will lead to biased estimates. The best model based on AIC & BIC includes more predictors and gives a biased estimate of the Forestry effect on Y

The causal model (based on the DAG) has a much larger AIC but gives the correct estimate
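To make that concrete, here's a minimal simulation sketch (in Python, with invented variable names: 'habitat' stands in for a hypothetical mediator on the Forestry → Species Y path — this is an illustration of the general point, not the paper's actual example):

```python
import numpy as np

rng = np.random.default_rng(42)
n = 5000
# Hypothetical DAG: forestry -> habitat -> species_y, plus forestry -> species_y directly
forestry = rng.normal(size=n)
habitat = 0.8 * forestry + rng.normal(size=n)           # mediator
species_y = 0.5 * forestry + 0.7 * habitat + rng.normal(size=n)
# True total causal effect of forestry on species_y = 0.5 + 0.8*0.7 = 1.06

def ols_fit(X, y):
    """Least-squares fit with intercept; returns coefficients and AIC."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    rss = resid @ resid
    k = X1.shape[1] + 1                                 # params incl. error variance
    aic = len(y) * np.log(rss / len(y)) + 2 * k
    return beta, aic

# "Causal" model: forestry only -> recovers the total effect (~1.06)
beta_c, aic_c = ols_fit(forestry[:, None], species_y)
# "Best-by-AIC" model: also conditions on the mediator -> lower AIC, biased total effect (~0.5)
beta_f, aic_f = ols_fit(np.column_stack([forestry, habitat]), species_y)

print(f"forestry-only: effect = {beta_c[1]:.2f}, AIC = {aic_c:.0f}")
print(f"full model:    effect = {beta_f[1]:.2f}, AIC = {aic_f:.0f}")
```

The full model wins on AIC (it predicts Y better), but conditioning on the mediator blocks part of the causal path, so its forestry coefficient no longer answers the causal question.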

This applies to machine learning too (random forests etc.). High variable importance does not mean those predictors are important from a causal point of view, only that they are useful for getting good predictions

Causal inference is rarely taught, yet seems so important. Many papers do not aim to predict but to make inferences about how important different variables are. It seems we're too often using the wrong approach

I'm trying to learn more about this. Next on my reading list: https://doi.org/10.1002/ecm.1554

@frod_san Oh, what a great paper, and a nice companion to this @bbolker piece: https://github.com/bbolker/discretization/blob/master/outputs/discrete.pdf
@frod_san I'm nodding particularly hard at this bit: "ecologists dependent on observational data for understanding causal relationships avoid explicitly acknowledging the causal goal of research projects and instead use coded language that implies causality without explicitly saying so"
@noamross I'm guilty of that for sure 😅 Like many others, we just didn't know how to do better. Now trying to get there...
@frod_san Same! There's a reference in the paper to this issue in literature on "drivers of viral density in bats" and I flashed back to a few of our papers.
@frod_san Let me know if you find anything interesting on causal inference with nonlinear models.

@noamross Sure!

Antonio Canepa (on bird site) recommended this very interesting preprint by Pichler and Hartig:

https://arxiv.org/abs/2306.10551

Can predictive models be used for causal inference?

Supervised machine learning (ML) and deep learning (DL) algorithms excel at predictive tasks, but it is commonly assumed that they often do so by exploiting non-causal correlations, which may limit both interpretability and generalizability. Here, we show that this trade-off between explanation and prediction is not as deep and fundamental as expected. Whereas ML and DL algorithms will indeed tend to use non-causal features for prediction when fed indiscriminately with all data, it is possible to constrain the learning process of any ML and DL algorithm by selecting features according to Pearl's backdoor adjustment criterion. In such a situation, some algorithms, in particular deep neural networks, can provide near unbiased effect estimates under feature collinearity. Remaining biases are explained by the specific algorithmic structures as well as hyperparameter choice. Consequently, optimal hyperparameter settings are different when tuned for prediction or inference, confirming the general expectation of a trade-off between prediction and explanation. However, the effect of this trade-off is small compared to the effect of a causally constrained feature selection. Thus, once the causal relationship between the features is accounted for, the difference between prediction and explanation may be much smaller than commonly assumed. We also show that such causally constrained models generalize better to new data with altered collinearity structures, suggesting generalization failure may often be due to a lack of causal learning. Our results not only provide a perspective for using ML for inference of (causal) effects but also help to improve the generalizability of fitted ML and DL models to new data.

@noamross @frod_san FWIW I may actually get this submitted/posted to a preprint server in the near(ish) future ...
@bbolker 🥳🙏
@noamross Your wish is my command: https://ecoevorxiv.org/repository/view/5722/ Hoping to submit to *Methods in Eco/Evo* v soon, unless someone tells me that Wiley/MEE are on the Naughty List now ... what's *not* in here is a bunch of my own simulations to evaluate MMA coverage in different scenarios - I relied on the large number of existing studies that look at this (interestingly Burnham and Anderson are the *only* ones I can find who report good coverage from MMA CIs ...)
Multimodel approaches are not the best way to understand multifactorial systems