Simpson's Paradox is a wonderful little statistical phenomenon which is counterintuitive to most people. Here is a simple, concrete example.
Consider two cancer drugs, A and B. We run a study, Study #1, in which we give some cancer patients drug A and others drug B. Let's say we give 11 people drug A and 7 people drug B. Of the 11 people given drug A, 5 die and 6 survive. Of the 7 people given drug B, 3 die and 4 survive. So it seems that drug B is better than drug A, since 4/7 is greater than 6/11.
Just to be sure we do another study, Study #2. Again we give some patients drug A and some patients drug B. Of those given drug A, 6 die and 3 survive. Of those given drug B, 9 die and 5 survive. Again it seems like drug B is better than drug A since 5/14 is larger than 3/9.
But wait! What happens if we look at all the data together? Now, of those given drug A, 11 died in total and 9 survived. For drug B, 12 died in total and 9 survived. So when we look at the combined data, drug A is better than drug B, since 9/20 is greater than 9/21.
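The arithmetic above is easy to check by machine. Here is a minimal Python sketch that compares survival rates within each study and then in the pooled data. The counts come straight from the two studies; the names `studies`, `survival_rate`, and `better_drug` are just illustrative choices, and exact fractions are used so that rounding cannot affect the comparison.

```python
from fractions import Fraction

# (survived, died) counts per drug, taken from the two studies in the text.
studies = {
    "Study #1": {"A": (6, 5), "B": (4, 3)},
    "Study #2": {"A": (3, 6), "B": (5, 9)},
}


def survival_rate(survived, died):
    """Exact fraction of patients who survived."""
    return Fraction(survived, survived + died)


def better_drug(counts):
    """Return the drug with the higher survival rate (B if the rates tie)."""
    rate_a = survival_rate(*counts["A"])
    rate_b = survival_rate(*counts["B"])
    return "A" if rate_a > rate_b else "B"


# Pool the two studies by adding up survivors and deaths for each drug.
combined = {
    drug: tuple(sum(study[drug][i] for study in studies.values()) for i in range(2))
    for drug in ("A", "B")
}

for name, counts in studies.items():
    print(name, "favors drug", better_drug(counts))
print("Combined data favors drug", better_drug(combined))
```

Each study on its own favors drug B, while the pooled counts favor drug A. Notice that the two drugs were given to different numbers of patients in each study, so pooling weights the studies differently for each drug.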
This phenomenon is known as Simpson's paradox. While one might think that this sort of thing only comes up with cleverly picked numbers, there are actually many examples of real-world data that exhibit this behavior.
Aside from being extremely counterintuitive, this result also plays havoc with our naive notions of what constitutes confirmation of a hypothesis. In particular, the fact that we can have two separate pieces of evidence which alone constitute confirming evidence but together constitute disconfirming evidence is jarring. Results like this one undermine naive Bayesian views of how science should function.