Update from Chicago

Here’s my much-promised, less-delivered update on Chicago: what we’re doing, how it’s going, and whether we’re going to come back.

In reverse order, yes. Sue and I need to leave the country at the end of August, when my visa (but not hers) expires. But gee, it’ll not be without regrets. Escaping an apocalyptic Melbourne winter to come to this great city has been immensely enjoyable. More on that below. So yes, all the rumours are false; we’re not staying on, even if we really want to.

The reason for us making the trip over here is so that I could take up the Eric and Wendy Schmidt Data Science for Social Good fellowship, being run through the University of Chicago. The fellowship itself came about from Eric Schmidt’s involvement in the Obama 2012 campaign, the data-intensive side of which was being run by Rayid Ghani.

So the story goes, Schmidt, who has made a little bit of money running Google, wanted Ghani to put the skills of young data scientists to the public good, and funded the fellowship. In return, Ghani and his team run the fellowship, which brings together some seriously bright folk from around the world (but mainly the US) to Chicago for 12 weeks over summer. These fellows work in groups of four with partner organisations—mainly not-for-profits and government agencies—to lend their skills to help solve some difficult problems.

The team I’ve landed in is fantastic. The members are Diana Palsetia, Pete Lendwehr, and Sam Zhang. Diana is a PhD student at Northwestern who specialises in high-performance parallel computing on large datasets. She’s the sort of person who interjects sparingly, from the back of the room, with extremely useful insights. Pete is a PhD student at Carnegie Mellon University, specialising in ‘advanced computation’, but he seems to spend as much time pondering theatre and hunting down excellent coffee beans. He also turns my ideas, good and bad, into deployable Python code. Sam is a brilliant 21-year-old who just finished up at Swarthmore, an elite liberal arts college. He’s an all-round hacker, writer and statistician who makes me reconsider the wisdom of having wasted so much of my 20s.

Our team has been partnered with two organisations: the first is Enroll America, a well-funded not-for-profit tasked with getting as many people signed up for health insurance under the Affordable Care Act/Obamacare as possible; the second is Get Covered Illinois (GCI), which is run out of the Illinois Governor’s office and is attempting to do the same, though only in Illinois.

Both of these organisations have limited budgets, and the same aim—get uninsured people covered. The big question for them is who should they target with their limited resources? There are some people who will never take out insurance no matter how much they’re pestered, and sad as it is, it’s a waste of money trying to convince them to do so. There are other folk who are far more interested in taking out insurance, but have not because, frankly, the system takes a bit of work. And you can always do it tomorrow, right?

There is no shortage of data here. As many of the people working on this problem (mainly on the Enroll America side) also worked on the Obama 2012 campaign, they use similar datasets to those used to find persuadable voters. That means that some of these datasets are quite big—a row for almost every American, with plenty of details (mostly best guesses) on each.

The approach that our team is taking is to build several statistical models to help Enroll America and GCI work out who they should be spending money contacting. The first model gives, for each person, the probability that that person is uninsured. There is no point contacting someone who is insured. What is surprising is how many people you’d not expect to have health insurance actually do have it, which makes it hard to build a good predictive model that sorts the insured from the uninsured.
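For the curious, here is a minimal sketch of what ‘a probability for each person’ can look like in code. It assumes a logistic model, which may or may not be what we end up using, and the feature names and coefficients are invented purely for illustration; they are not estimates from any real data.

```python
import math

# Hypothetical coefficients, invented for illustration only.
# A real model would estimate these from data.
HYPOTHETICAL_WEIGHTS = {
    "intercept": -1.0,
    "age_under_35": 0.8,
    "low_income": 1.2,
}


def prob_uninsured(person):
    """Return a logistic-model probability that `person` is uninsured.

    `person` maps feature names to 0/1 indicators; missing features count as 0.
    """
    score = HYPOTHETICAL_WEIGHTS["intercept"]
    for name, weight in HYPOTHETICAL_WEIGHTS.items():
        if name != "intercept":
            score += weight * person.get(name, 0)
    # The logistic function squashes any score into the range (0, 1).
    return 1.0 / (1.0 + math.exp(-score))


likely = prob_uninsured({"age_under_35": 1, "low_income": 1})
unlikely = prob_uninsured({})
```

Anyone with a probability near zero can be dropped from the contact list before the persuasion question even comes up.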

The second model tells us how persuadable someone is, given their probability of being uninsured. Thankfully, Enroll America ran a randomised controlled trial in March, in which they randomly selected a ‘control group’, who they’d not pester during their telephone and email campaign. They then matched this group to a ‘treatment group’ of similar folk who were pestered, and compared insurance rates between the two after the enrolment period ended. The result was quite profound: people who were pestered by email and phone were about 6% more likely to have taken up health insurance.
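To make the arithmetic concrete, here is a rough sketch of how an average treatment effect falls out of a trial like this: it is just the difference in enrolment rates between the two groups, with a standard error to judge how noisy that difference is. The counts below are made up so that the gap comes out at 6 percentage points; they are not Enroll America’s numbers, and reading the 6% as percentage points rather than a relative change is itself an assumption.

```python
import math


def average_treatment_effect(treated_enrolled, treated_total,
                             control_enrolled, control_total):
    """Difference in enrolment rates, its standard error, and a z-score."""
    p_treated = treated_enrolled / treated_total
    p_control = control_enrolled / control_total
    effect = p_treated - p_control
    # Standard error of a difference between two independent proportions.
    std_err = math.sqrt(
        p_treated * (1 - p_treated) / treated_total
        + p_control * (1 - p_control) / control_total
    )
    return effect, std_err, effect / std_err


# Made-up counts: 260 of 1,000 contacted people enrolled,
# against 200 of 1,000 in the control group.
effect, std_err, z = average_treatment_effect(260, 1000, 200, 1000)
```

A z-score comfortably above 2 is the usual rough signal that a gap this size is unlikely to be chance alone.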

While the ‘treatment effect’ of being pestered is about 6% on average, the interesting question for our team is working out what the treatment effect is likely to be for an individual person. This is an extremely difficult problem, and one to which we have been devoting most of our time. Our current solution is here.
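One simple way to get closer to an individual-level effect, and not necessarily the approach our team has settled on, is to split people into segments and estimate the treatment effect separately within each one. The sketch below does that with toy records; the segment labels and data are hypothetical.

```python
from collections import defaultdict


def uplift_by_segment(records):
    """Estimate the treatment effect within each segment.

    `records` is an iterable of (segment, treated, enrolled) tuples.
    Returns {segment: treated enrolment rate - control enrolment rate}.
    """
    # For each segment, keep [enrolled, total] counts for both arms.
    counts = defaultdict(lambda: {True: [0, 0], False: [0, 0]})
    for segment, treated, enrolled in records:
        cell = counts[segment][bool(treated)]
        cell[0] += int(enrolled)
        cell[1] += 1

    uplift = {}
    for segment, arms in counts.items():
        (t_yes, t_n), (c_yes, c_n) = arms[True], arms[False]
        if t_n == 0 or c_n == 0:
            continue  # no estimate possible without both arms
        uplift[segment] = t_yes / t_n - c_yes / c_n
    return uplift


# Toy data: contact helps the hypothetical "young" segment, not the "older" one.
records = (
    [("young", True, True)] * 2 + [("young", True, False)] * 2
    + [("young", False, True)] * 1 + [("young", False, False)] * 3
    + [("older", True, True)] * 1 + [("older", True, False)] * 3
    + [("older", False, True)] * 1 + [("older", False, False)] * 3
)
uplift = uplift_by_segment(records)
```

The catch is that the finer the segments, the fewer people sit in each one and the noisier each estimate becomes, which is a big part of why the individual-level problem is hard.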

There are other problems that we’ve not done as much work on. For instance, what is the best contact language? Where should tabling events be held? How can we best guess someone’s income (which will determine how large a subsidy they will receive)? These are for the coming weeks.

Sue and Emi have also been busy, making friends in our neighbourhood–right next to the University of Chicago–and spending long days at the beach. Emi has learned to run, and Sue has learned to spot enclosed playgrounds.

A few words on Chicago. Picture this: one in three days in this city you can freeze to death without trying hard. Yet almost 10 million people decide to live in and around the city. Why would so many people make such a choice, surely crazy to the outsider?

I’m not 100 per cent sure, but it must have something to do with the fact that it somehow combines being an extremely large city with a small-town feel. Traffic is no worse (probably better) than Melbourne. Public transit certainly isn’t Singapore, but is cheap and effective (especially during the rush). The music, theatre and intellectual scenes are full and exciting. The beaches are fun, the food great, and the people are extremely friendly; some combination of northern-Midwestern Nice and Southern hospitality, a remnant of the Great Migrations. Finally, and this was unexpected, the summer is delicious: 29°C every day, often cooled by large storms at night. Splendid.

The 10 million people who live in Chicagoland aren’t mad. They could live in the Eternal Spring that is Southern California, but don’t. If Southern California had Chicago’s winter, nobody would live there.


  1. Joshua Pooley said,

    July 7, 2014 @ 11:42 pm

    Nice write-up. So, as with any big data project, I have a couple of questions/comments. First, can I assume the 6% is statistically significant? Second, do you do a further trial where the control group is randomly selected and your targeted method is used, and you compare how many of the people you reach are uninsured and then also compare how many sign up? Finally, do you do a cost-benefit analysis as to whether your modelling is cost effective?

  2. khakieconomist said,

    July 8, 2014 @ 4:35 am

    Josh – thanks for the comment.

    First, yes you can assume that it’s ‘statistically significant’, though I don’t give huge weight to that phrasing.

    Second, Enroll’s control group was randomly drawn from people who had expressed some interest in ‘knowing more’, possibly from a past phone call or other canvassing event. The treatment group was matched to the control group, but from the same base population. Given there may be non-randomness in the initial population (although the initial canvassing events/calls were pretty random), both treatment and control populations may be slightly more likely to take up insurance. But both are essentially drawn from a population of uninsured or marginally insured people (who are the target).

    On Cost Benefit, we’ll probably not get onto that. A few thoughts though. Enroll’s campaigns mainly run with the help of thousands of volunteers, who go door-to-door in communities with low insurance rates, or make hundreds of phone calls. So the costs here are not typical. Valuing the benefits of insurance is also tough. We can be a bit more lax on this as Enroll’s sole purpose is to enroll people, and we’re simply trying to work out who they should convince first.

  3. Jaron said,

    July 9, 2014 @ 7:49 am

    Jim, the only statistic from this entry I understood was that Chicago has 10 million people. 🙂 Keep enjoying yourself and say hello to Sue and Emi.

  4. Perry Jackson said,

    July 16, 2014 @ 8:37 am

    Hey Savaaage!

    Wow – interesting problem with some cool outcomes when you make it work.

    And what a cool fellowship!

    Great post, enjoy,

