Notes on causal inference workshop

Yesterday, I gave an introductory workshop on causal inference. The slides and materials are here (the slides are called causality.htm; you will need to download the HTML file, and it is best not viewed in Chrome). Most of the attendees came from finance/BI/predictive fields, so I think the material was new to most of them, which is great!

The central point of the workshop is that causal problems are hard, and the intuition of a predictive modeler can lead them astray. In particular, we need to think about bad controls/mediators and unobserved confounders. If we blindly apply machine learning to large datasets without causal reasoning, we get precise estimates of a meaningless number.
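To make "precise estimates of a meaningless number" concrete, here is a minimal simulation sketch. It is not taken from the workshop materials: the data-generating process, the coefficients, and the function name `simulate_confounded_estimate` are all assumptions chosen for illustration. An unobserved variable `u` drives both the treatment and the outcome, so the naive regression slope has a tiny standard error but sits far from the true effect.

```python
import numpy as np


def simulate_confounded_estimate(n=200_000, true_effect=1.0, seed=0):
    """Return (estimate, std_error) for a naive and an adjusted regression.

    Hypothetical data-generating process (an assumption, for illustration):
        u ~ N(0, 1)                      confounder
        t = 0.8 * u + noise              treatment depends on u
        y = true_effect * t + 2 * u + noise
    """
    rng = np.random.default_rng(seed)
    u = rng.normal(size=n)
    t = 0.8 * u + rng.normal(size=n)
    y = true_effect * t + 2.0 * u + rng.normal(size=n)

    def ols_slope(columns):
        # Ordinary least squares with an intercept; returns the coefficient
        # and classical standard error of the first regressor in `columns`.
        X = np.column_stack([np.ones(n)] + columns)
        coef, *_ = np.linalg.lstsq(X, y, rcond=None)
        resid = y - X @ coef
        sigma2 = resid @ resid / (n - X.shape[1])
        cov = sigma2 * np.linalg.inv(X.T @ X)
        return coef[1], np.sqrt(cov[1, 1])

    naive = ols_slope([t])        # u is "unobserved": left out of the model
    adjusted = ols_slope([t, u])  # only possible if we could measure u
    return naive, adjusted


naive, adjusted = simulate_confounded_estimate()
# In this setup the naive slope converges to 1 + 2 * 0.8 / 1.64 (about 1.98),
# not to the true effect of 1.0, however large n gets.
print("naive:    estimate=%.3f, se=%.4f" % naive)
print("adjusted: estimate=%.3f, se=%.4f" % adjusted)
```

More data only shrinks the standard error around the wrong number; the bias comes from the omitted variable, not from sample size.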

After the workshop, a few of the students requested some readings. So below are some introductory papers/slides that I have found helpful in shaping my own thinking and generating course notes.

The Three Layer Causal Hierarchy. A very short piece that highlights the difference between associative and causal reasoning. A very good entry point.

For Objective Causal Inference, Design Trumps Analysis – Rubin. Donald Rubin has written many, many great pieces saying pretty similar things. This is one! It’s a good review piece that talks through the Rubin Causal Model, including the potential outcomes framework and the treatment assignment mechanism. Very clear.
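For readers new to the notation, here is a minimal sketch of the potential outcomes setup. The symbols are the conventional ones and are assumed here, not quoted from Rubin's paper: $T_i$ is a binary treatment indicator, and $Y_i(1)$, $Y_i(0)$ are unit $i$'s outcomes with and without treatment, only one of which is ever observed.

```latex
\begin{align*}
  Y_i &= T_i\,Y_i(1) + (1 - T_i)\,Y_i(0)
      && \text{(observed outcome)} \\
  \tau &= \mathbb{E}\left[\,Y_i(1) - Y_i(0)\,\right]
      && \text{(average treatment effect)} \\
  \mathbb{E}[Y_i \mid T_i = 1] - \mathbb{E}[Y_i \mid T_i = 0]
      &= \underbrace{\mathbb{E}[\,Y_i(1) - Y_i(0) \mid T_i = 1\,]}_{\text{effect on the treated}}
       + \underbrace{\mathbb{E}[\,Y_i(0) \mid T_i = 1\,] - \mathbb{E}[\,Y_i(0) \mid T_i = 0\,]}_{\text{selection bias}}
\end{align*}
```

The assignment mechanism matters because it governs the second term: if treatment is assigned independently of the potential outcomes, as in a randomized experiment, the selection bias term is zero and the simple difference in means recovers $\tau$.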

Mastering Metrics by Angrist and Pischke is a fantastic introductory book on causal analysis, entertaining enough to be read in bed. They walk through regression, panel data, instrumental variables, regression discontinuity design, and difference-in-differences. Should have a “gateway drug” warning. If you know this stuff already, then check out Mostly Harmless Econometrics, their graduate-level treatment of the subject.

This fantastic conversation between Card and Krueger gives you a bit more of a back-story about how the techniques discussed in Mastering Metrics made their way from medicine into applied economics. It’s very entertaining.

Causal mediation – a few days ago, Andrew Gelman had a fairly short post on causal mediation (a deeper way of thinking about bad controls). One of those few occasions on the internet where the comments are better than the post.
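As a rough illustration of why a mediator can be a bad control, here is a small simulation sketch. It is a hypothetical setup, not taken from Gelman's post or its comments: the coefficients and the names `ols_coefs` and `simulate_bad_control` are assumptions. Treatment `t` is randomized, it affects the outcome both directly and through a mediator `m`, and an unobserved `u` affects both `m` and `y`.

```python
import numpy as np


def ols_coefs(y, *regressors):
    """Least-squares coefficients for y on an intercept plus the regressors."""
    X = np.column_stack([np.ones(len(y)), *regressors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


def simulate_bad_control(n=200_000, seed=1):
    """Compare the treatment coefficient with and without the mediator.

    Hypothetical data-generating process (an assumption, for illustration):
        t ~ Bernoulli(0.5)               randomized treatment
        u ~ N(0, 1)                      unobserved cause of both m and y
        m = 1.5 * t + u + noise          mediator (post-treatment variable)
        y = 0.5 * t + 1.0 * m + 2 * u + noise
    The total effect of t on y is 0.5 + 1.5 * 1.0 = 2.0.
    """
    rng = np.random.default_rng(seed)
    t = rng.integers(0, 2, size=n).astype(float)
    u = rng.normal(size=n)
    m = 1.5 * t + u + rng.normal(size=n)
    y = 0.5 * t + 1.0 * m + 2.0 * u + rng.normal(size=n)

    total = ols_coefs(y, t)[1]             # y ~ t: recovers the total effect
    with_mediator = ols_coefs(y, t, m)[1]  # y ~ t + m: the "bad control" model
    return total, with_mediator


total, with_mediator = simulate_bad_control()
print("y ~ t     : coefficient on t = %.2f" % total)
print("y ~ t + m : coefficient on t = %.2f" % with_mediator)
```

In this setup the regression that omits the mediator is close to the total effect of 2.0, while the one that "controls" for it converges to -1.0. That is neither the total effect nor the direct effect of 0.5, because conditioning on `m` opens a path through the unobserved `u`.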

Chapters 9, 10 and 23 from Gelman and Hill are a superb place to get a practical start in causal inference. Lots of examples, code, etc. If you want to implement the models from chapter 23, you should probably use Stan rather than BUGS. The examples have been translated to Stan here. If you want to implement models from the book, I recommend using the R package rstanarm, and prepending “stan_” to most of the model calls.

There’s an exciting new field that uses machine learning methods (especially BART) to estimate heterogeneous treatment effects. I know the machine-learny people will want to start here, but I will tell you not to. Once you understand confounding, bad controls, fixed effects, etc., then it might be safe to play with this stuff. One great application is in this paper by Don Green and Holger Kern. Jennifer Hill has also been pushing for this sort of analysis recently, for example in this paper and these slides.

Finally, Pearl. I have read bits and bobs of the book, and have found the language and notation off-putting. What I find extremely helpful is how clearly he thinks about causality, and reading his real-life arguments (as in the post on Gelman linked above).



  1. Brian Parbhu said,

    July 17, 2016 @ 6:03 pm

    Awesome, thanks again for this post and for the workshop yesterday!

  2. khakieconomist said,

    July 18, 2016 @ 1:54 pm

    Although it’s not causality, Eric Novik also pointed out the excellent prior choice recommendations on the Stan wiki.

  3. John Hall said,

    July 18, 2016 @ 2:46 pm

    Is it possible to fit BART with Stan?

  4. khakieconomist said,

    July 18, 2016 @ 8:05 pm

    Hi John – I don’t think BART is differentiable, so HMC won’t work. I use the BayesTree library in R, which is very simple to use.

  5. Harsha W said,

    August 11, 2016 @ 6:21 pm

    Hi, what is the fundamental difference between the Rubin Causal Model (RCM) and Pearl’s DAG-based approach? Is it possible to combine both?
