Thursday, September 11, 2008

Simpson's Paradox

Simpson's Paradox is a wonderful little statistical phenomenon which is counterintuitive to most people. Here is a simple, concrete example.

Consider two cancer drugs, A and B. We do a study, Study #1, where we give some cancer patients drug A and some patients drug B. Let's say we give 11 people drug A and 7 people drug B. Of the 11 people given drug A, 5 die and 6 survive. Of the 7 people given drug B, 3 die and 4 survive. So it seems that drug B is better than drug A, since 4/7 is greater than 6/11.

Just to be sure, we do another study, Study #2. Again we give some patients drug A and some patients drug B. Of those given drug A, 6 die and 3 survive. Of those given drug B, 9 die and 5 survive. Again it seems like drug B is better than drug A, since 5/14 is larger than 3/9.

But wait! What happens if we look at all the data together? Now, of those given drug A, 11 died in total and 9 survived. For drug B, 12 died in total and 9 survived. So when we look at the combined data, drug A is better than drug B, since 9/20 is greater than 9/21.
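The arithmetic above is easy to check by hand, but a short script makes the reversal explicit. This is a minimal sketch in Python using only the counts given in the two studies; the names `studies`, `survival_rate`, `better_drug`, and `combined` are made up for this illustration.

```python
from fractions import Fraction

# (died, survived) counts taken directly from the two studies described above.
studies = {
    "Study #1": {"A": (5, 6), "B": (3, 4)},
    "Study #2": {"A": (6, 3), "B": (9, 5)},
}


def survival_rate(died, survived):
    """Exact fraction of patients who survived."""
    return Fraction(survived, died + survived)


def better_drug(counts):
    """Return the drug with the higher survival rate (no ties occur in this data)."""
    rate_a = survival_rate(*counts["A"])
    rate_b = survival_rate(*counts["B"])
    return "A" if rate_a > rate_b else "B"


# Pool the two studies by adding up deaths and survivals for each drug.
combined = {
    drug: tuple(sum(study[drug][i] for study in studies.values()) for i in range(2))
    for drug in ("A", "B")
}

for name, counts in studies.items():
    print(name, "favors drug", better_drug(counts))
print("Combined data favors drug", better_drug(combined))
```

Running it reports drug B for each study taken separately and drug A for the pooled counts. Exact fractions are used rather than floating point so the comparisons cannot be affected by rounding.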

This phenomenon is known as Simpson's paradox. While one might think that this sort of thing only comes up with cleverly picked numbers, there are actually many examples of real-world data that exhibit this behavior.

Aside from being extremely counterintuitive, this result also plays havoc with our naive notions of what constitutes confirmation of a hypothesis. In particular, the fact that we can have two separate pieces of evidence which alone constitute confirming evidence but together constitute disconfirming evidence is jarring. Results like this one undermine naive Bayesian views of how science should function.


Dan said...

Excuse me for the lack of citations. This is a fun, true example I heard in a lecture on Simpson's paradox.

In a popular travel magazine, data for two airlines was printed. A was more likely to be on time than B in cities X and Y. However, Y was snowy, and B's hub was in Y. The paradox did occur. A was better on the condition that you cared where you're going.

SO in THIS case, the local data is more useful than the global data. This is contrary to the impression you gave in the cancer drug example, where the distinction between the two trials was completely artificial.


Joshua said...

Yes, that's a good example. Generally, Simpson's paradox occurs when there is some hidden variable.

One fascinating example of this is from the early 1970s, when Berkeley was sued for admissions bias against women in its graduate school. Men were consistently more likely to be admitted than women, and the difference was too large to be reasonably explained away by random chance. However, when one broke the numbers down by department, most departments seemed, if anything, to have a bias against admitting men. Further investigation showed that women were, in general, more likely than men to apply to the more competitive departments. In particular, the humanities had much lower admissions rates, with many qualified candidates being turned down. Thus highly qualified female applicants were much more likely to be turned down than highly qualified male applicants.

This led to a series of court cases and scholarly articles on what exactly constituted valid statistical evidence for gender bias.

Wikipedia's article on the topic discusses the Berkeley example as well as a variety of other real life examples.

Dan said...

Actually, I remember that example from the talk too.

Here's a question--in general one learns to be wary of statistics through such rules as don't conflate causation with correlation, etc. However it seems to me like I might not notice an instance of Simpson's paradox until it's pointed out. Might you offer any general guideline to know when a statistic might be susceptible to the paradox?

Golis said...

Here's a way of explaining the paradox that indicates its resolution.

During a baseball season, Team A can have a higher win percentage for away games than Team B and a higher win percentage for home games than Team B, but still have a worse total win percentage than Team B.

Of course, the resolution is that Team A and B both do worse when they're away than when they're home, and Team A has played more away games than Team B.
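To make this concrete, here is a small Python sketch with win-loss records invented purely for illustration (they are not real baseball data, and the split between home and away games is deliberately exaggerated). The names `records`, `win_pct`, and `overall` are likewise made up for this example.

```python
from fractions import Fraction

# Hypothetical (wins, games played) records, invented for illustration only.
# Both teams do worse away than at home, and Team A plays far more away games.
records = {
    "Team A": {"home": (8, 10), "away": (36, 90)},
    "Team B": {"home": (63, 90), "away": (3, 10)},
}


def win_pct(wins, games):
    """Exact win percentage as a fraction."""
    return Fraction(wins, games)


def overall(team):
    """Win percentage over all games, home and away combined."""
    wins = sum(w for w, _ in records[team].values())
    games = sum(g for _, g in records[team].values())
    return win_pct(wins, games)


for venue in ("home", "away"):
    a = win_pct(*records["Team A"][venue])
    b = win_pct(*records["Team B"][venue])
    print(f"{venue}: Team A {float(a):.2f}, Team B {float(b):.2f}")

print(f"overall: Team A {float(overall('Team A')):.2f}, "
      f"Team B {float(overall('Team B')):.2f}")
```

With these made-up numbers, Team A wins 80% at home and 40% away against Team B's 70% and 30%, yet Team A's overall percentage is 44% against Team B's 66%. Each overall figure is a weighted average of the home and away figures, and the two teams' weights are very different.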
