THREAD: Bias and disparity in a causal modeling framework.

1. A few months ago, @vtraag and @LudoWaltman posted a superb paper to the arXiv.

http://arxiv.org/abs/2207.13665

I've been meaning to write about it for a while and finally found the time.

Causal foundations of bias, disparity and fairness

The study of biases, such as gender or racial biases, is an important topic in the social and behavioural sciences. However, the literature does not always clearly define the concept. Definitions of bias are often ambiguous or not provided at all. To study biases in a precise manner, it is important to have a well-defined concept of bias. We propose to define bias as a direct causal effect that is unjustified. We propose to define the closely related concept of disparity as a direct or indirect causal effect that includes a bias. Our proposed definitions can be used to study biases and disparities in a more rigorous and systematic way. We compare our definitions of bias and disparity with various criteria of fairness introduced in the artificial intelligence literature. In addition, we discuss how our definitions relate to discrimination. We illustrate our definitions of bias and disparity in two case studies, focusing on gender bias in science and racial bias in police shootings. Our proposed definitions aim to contribute to a better appreciation of the causal intricacies of studies of biases and disparities. We hope that this will also promote an improved understanding of the policy implications of such studies.

arXiv.org

2. Society is rife with differences that we feel are unjust.

Differences in opportunities and outcomes according to race are ubiquitous in American society.

Or in science, I've written about gender differences in scientific authorship and citation, for example.

We often call these "biases".

3. In this paper, Traag and Waltman propose definitions and methodology for thinking more rigorously about the nature of these differences, in ways that help us identify where we can best intervene to ameliorate them.

To do so, they propose that we distinguish between *biases* and *disparities*.

4. They define biases as differences that (1) we consider unjustified and that are (2) causal and (3) causally direct.

For example, if a hiring manager chose not to hire people of a certain race, the racial differences in employment in that firm would be unjustified, caused by race, and directly so.

5. They define disparities as (1) unjustified (2) causal differences that are (3) causally indirect.

For example, if the education system denies opportunities to people of a certain race and thus they cannot even apply for employment at a firm, racial differences in employment at that firm will be unjustified, caused by race, but (assuming a fair hiring manager) indirectly so.

These are disparities.
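To make the direct versus indirect distinction concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the groups "A" and "B", the `Pipeline` class, the `manager` helper, and all of the probabilities are made up, and none of it comes from the paper. It models a two-stage path (schooling, then hiring) and compares the overall employment gap with the gap at equal qualification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pipeline:
    """Two-stage path from group membership to employment (illustrative only)."""
    p_qualified: dict  # group -> P(qualified | group): the education stage
    p_hired: dict      # (group, qualified) -> P(hired | group, qualified): the hiring stage

    def hire_rate(self, group):
        # Total probability of being hired, averaging over qualification.
        q = self.p_qualified[group]
        return q * self.p_hired[(group, True)] + (1 - q) * self.p_hired[(group, False)]

    def total_gap(self):
        # Overall difference in employment between the two groups.
        return self.hire_rate("A") - self.hire_rate("B")

    def direct_gap(self, qualified=True):
        # Difference in hiring at equal qualification: the manager's own contribution.
        return self.p_hired[("A", qualified)] - self.p_hired[("B", qualified)]


def manager(rate_a, rate_b):
    # Hypothetical helper: only qualified applicants can be hired.
    return {("A", True): rate_a, ("B", True): rate_b,
            ("A", False): 0.0, ("B", False): 0.0}


# Scenario 1: equal schooling, but the manager hires group B at half the rate.
biased_manager = Pipeline(p_qualified={"A": 0.5, "B": 0.5}, p_hired=manager(0.8, 0.4))
# Scenario 2: schooling favors group A, but the manager treats both groups alike.
biased_schooling = Pipeline(p_qualified={"A": 0.5, "B": 0.25}, p_hired=manager(0.8, 0.8))

for name, p in [("biased manager", biased_manager), ("biased schooling", biased_schooling)]:
    print(f"{name}: total gap = {p.total_gap():.2f}, direct gap = {p.direct_gap():.2f}")
```

With these made-up numbers both scenarios show the same overall employment gap, but only the first has a nonzero gap at the hiring stage itself. In the thread's terms, the first firm exhibits a bias and the second a disparity.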

6. Biases upstream in a causal pathway lead to disparities downstream even when subsequent causal processes are fair.

If I'd had access to this paper a few years ago, when I wrote about what I called a gender bias in self-citation rates, I would have written a better paper.

In that paper we identified a difference in self-citation rates by gender.

https://journals.sagepub.com/doi/10.1177/2378023117738903

7. We considered it unjustified, and so we called it a bias. But we didn't know (and stated this upfront) whether the difference was directly caused by men and women making different choices about self-citation, or whether there were other mediating factors such as previous publication record.

8. Subsequent work suggests that much of the difference is due to such mediating factors, most notably gender differences in the number of publications that men and women have available to self-cite. This makes the difference in self-citation rates a disparity, not a bias, using present terminology.

9. Why does it matter?

At a very basic level, it's important to point to the immediate causal sources of problems that one wants to remedy, because those sources are the best targets for interventions.

Telling women to cite themselves more frequently is not necessarily going to be a fruitful remedy to a self-citation disparity if men and women with similar numbers of previous publications cite themselves at similar rates, and the disparity is driven by an upstream cause (publication number).
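A toy sketch of that argument in Python, under loud assumptions: the eight authors, their prior publication counts, and the shared rate `SELF_CITES_PER_PRIOR_PAPER` are all invented for illustration and are not data from our paper or from the subsequent work. By construction everyone self-cites at the same rate per prior paper, so any gender gap can only enter through publication count.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical authors as (gender, number of prior papers). Made-up data.
authors = [("M", 10), ("M", 20), ("M", 20), ("M", 40),
           ("W", 10), ("W", 10), ("W", 20), ("W", 40)]

# Assumed: one self-citation behavior shared by everyone, regardless of gender.
SELF_CITES_PER_PRIOR_PAPER = 0.1


def self_citations(prior_papers):
    # Self-citations in a new paper depend only on how much prior work exists.
    return SELF_CITES_PER_PRIOR_PAPER * prior_papers


def mean_by_gender(rows):
    # Average self-citations per new paper, grouped by gender.
    grouped = defaultdict(list)
    for gender, prior in rows:
        grouped[gender].append(self_citations(prior))
    return {gender: mean(values) for gender, values in grouped.items()}


def gap_within_strata(rows):
    # Men-minus-women gap among authors with the same number of prior papers.
    strata = defaultdict(list)
    for gender, prior in rows:
        strata[prior].append((gender, prior))
    gaps = {}
    for prior, members in strata.items():
        means = mean_by_gender(members)
        if "M" in means and "W" in means:
            gaps[prior] = means["M"] - means["W"]
    return gaps


overall = mean_by_gender(authors)
gaps = gap_within_strata(authors)

print("overall gap:", overall["M"] - overall["W"])
print("gap at equal publication count:", gaps)
```

In this toy data men self-cite more on average, yet the gap is zero within every publication-count stratum. An intervention aimed at citing behavior would have nothing to change here; the difference arises upstream, in who has papers available to cite.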

10. Indeed, as Traag and Waltman illustrate with examples, interventions to correct disparities can backfire when targeted at the wrong stage of a causal pathway. When possible, they are best targeted at the stage where the bias arises.

11. There is a lot of other great stuff in the paper as well.

There's a sophisticated discussion of more subtle issues around confounds, colliders, etc.; a well-articulated critique of some commonly used algorithmic fairness criteria such as independence and separation; and detailed case studies.

I learned a ton from this paper. Tonight I read it for the third time, and I urge anyone interested in fairness, bias, disparity, causal inference, and related issues to take the time to read it as well.

@ct_bergstrom Thanks for this great thread on this inspiring paper. You have made me curious to read it myself!

@ct_bergstrom @vtraag @LudoWaltman thank you for such a great thread! This is a perfect example of why I am on 🦣 Sounds super interesting. Grabbing a ☕ and reading the paper 😊

@ct_bergstrom I sincerely hope this definition, with its focus on causality, allows more effective work to be done, and also I expect a shitstorm when someone in a group affected by bias finds the definition categorizes things differently from how they would prefer.

@ct_bergstrom
I love this proposed terminology. Great thread.

In this particular post, it seems to me there is a second crucial observation: addressing disparities is possible too, and requires actively counteracting upstream biases. The SCOTUS majority cooked up their affirmative action decision by ignoring that fact; their logic tacitly assumes that disparities (in the paper’s sense) do not exist, and any attempt to counteract them •must• therefore be bias.

@ct_bergstrom This is a good thread summarizing an interesting paper. I guess an issue that is recurrent in science/higher ed is the responsibilities of institutions to redress ("de-bias"?) disparities earlier in the pipeline, where it's impossible to target the stage(s) where biases arise.