Thursday, June 11, 2009

Awkward Parties and Ramsey Theory

Suppose you are throwing a party. Suppose further that you decide that you wish to make sure that, at the party, there is a minimum of awkwardness in the social interaction. To minimize awkwardness, you want to be sure either that there are at least three people at the party all of whom know each other or that there are at least three people who all don't know each other. Is there some number of people we can invite to guarantee that one of these two conditions will occur?

The answer is yes. In particular, if there are at least six people at your party, you will have three mutual strangers or three mutual acquaintances. Let's convince ourselves this is true: Assume there are at least six people at your party. Pick one of them, say Alice. (If you do have Alice, I incidentally recommend that you don't invite Eve since they don't get along well.) Now, of the at least five other people at the party, assume that Alice knows at least three (the argument is identical if she doesn't know at least three). Now, if any of those three people know each other, then that pair form a set with Alice of three people all of whom know each other. Then we are done. But if all three of those do not know each other, then they form a set of three which all don't know each other. So we either way, we must have our group of three people.

This problem, although it seems trivial, is an example of a problem from Ramsey Theory. Ramsey Theory is, roughly speaking, the study of inevitable structure, i.e., if we make certain types of objects large, what will we necessarily find in the objects? In this case, the objects in questions are examples of what are called graphs. By a graph, we mean a set of points (which we call vertices) and a set of edges connecting some of the points. In the situation above, we draw an edge between two points if and only if two people are friends.

Thus, the above result says that, given a graph with at least six vertices, there are either vertices forming a triangle or there are three vertices with no edges connecting them. There are two minor helpful pieces of notation: We say a graph is complete if, for any pair of vertices, there is an edge connecting those two vertices. We say a graph X is a subgraph of another graph Y if one can obtain X by cutting out some number of vertices from Y and then throwing out any edges in the remainder that do not connect on both ends to the remaining vertices.

With this notation, we have an easier way to visualize this problem: Is there some n such that a complete graph of n vertices with all edges colored red or blue must contain a red or blue triangle?

One might wonder about our original problem whether six is the minimum number of people? As it happens there is a pretty diagram we can draw to show that five is insufficient:

(Creative Commons 2.5 license. Original at )

Given this framework, it is also natural to generalize to the following: Given some k, is there an n such that a party with at least n people must have a collection of either k mutual acquaintances or k mutual strangers? Or, framed in terms of red and blue edges, given some k, is there an n such that any complete graph of n vertices must contain a complete k subgraph with all blue edges or a complete k subgraph with all red edges?

Surprisingly, the answer turns out to be yes. However, to prove it we need to consider a slightly more general question. We define R(r,b) to be the minimal n (if it exists) such that a complete graph of n vertices with edges either red or blue must contain either a complete subgraph of r vertices with all red edges or a complete subgraph with b vertices with all blue edges. In our earlier party notation, R(r,b) counts the minimum number of people needed at a party such that no matter who is at the party there will be r people who are all mutual strangers or b people who are all mutual acquaintances.

We shall show that R(r,b) exists and is bounded for all r and b. Since our original problem is essentially R(k,k), this will answer that question. We first note that we easily have a few small values of R. For example, R(1,1)=2, R(1,n)=R(n,1)=1, and R(2,n)=R(n,2)=n. By our earlier remarks, R(3,3)=6. We need, however, a more general statement.

Lemma: For all positive integers r,b > 1, if R(r-1,b) and R(r, b-1) exist, then
R(r, b) ≤ R(r - 1, b) + R(r, b - 1) + 1.

Proof: Assume as given and consider a complete graph Z with R(r - 1, b) + R(r, b - 1) + 1 with all edges painted either red or blue. Pick a vertex v, and partition the remaining R(r - 1, b) + R(r, b - 1) -1 vertices into two sets X and Y. A vertex m goes into X if v has a blue edge to m and otherwise goes into Y. (For those still using the party analogy, we have picked a specific individual at the party, say Veronica, and now have split the entire party into people who either are friends or are not friends with Veronica).

Now, let us denote how many vertices are in X by |X| and how many vertices are in Y by |Y|. It is clear then that |X|+|Y| ≥ R(r - 1, b) + R(r, b - 1). Thus either |X| has at least R(r, b-1) elements or |Y| has at least R(r-1, b) elements. Assume we are in the first situation (the case in which |Y| is at least R(r,b-1) is essentially identical). Now, since X has at least R(r,b-1) vertices, we know that X either contains a complete subgraph of r vertices with all red edges or contains a complete subgraph of at least b-1 vertices with all blue edges. Now, if X contains a complete subgraph of all red vertices, then we are done (since X is a subgraph of our original graph Z). If X instead contains a complete subgraph of b-1 vertices with blue edges, then that subgraph with v thrown in now has all blue edges (since everything in X connects to v by blue edges) and has b elements. So in either situation we have a graph of the desired size.

From this bound it easily follows that R is always defined and bounded since we can always break R(r,b) into being bounded by values of R for smaller arguments. It isn't that hard to make this bound slightly stronger. It is more difficult to find lower bounds of R(r,b). The first such non-trivial lower bounds (i.e. bounds that were not just linear) were found by Erdos who used a then very clever method that is now a standard technique for these sorts of problems. Erdos constructed an estimate for how likely a random graph was to contain a complete subgraph of all red or all blue vertices. That is, Erdos estimated the probability that a given graph with n edges (with the edges randomly colored blue or red) contained a complete graph with k blue edges or k red edges. Erdos was able to show that if the graph was small, then the probability that a graph did not contain such an object was less than 1. If the probability was less than 1, there had to be some graph with that many edges which did not have the complete graph. Using this, he was able to show that R(k,k) > 2^(k/2).

Note that this procedure is completely non-constructive. It doesn't tell you how to even make such a graph for a given k. It simply says that if you keep randomly picking colors for a graph of that size then you will eventually hit a graph that shows that R(k,k) < n.

R(k,k) grows very quickly with respect to k. Moreover, R(k,k) is not at all easy to calculate. We know for example that R(4,4) =18. But even R(5,5) is unknown. (We know it lies somewhere between 43 and 49). The range is even worse for R(6,6) which we know must lie between 102 and 165. The problem looks at a naive glance like it can be brute forced. However, in order to show that R(k,k) = n, one needs to show that any graph with n vertices has the desired property. It is not hard to show that complete graph with n vertices has n(n-1)/2 edges. Thus, if any edge is colored either red or blue, we would have 2^(n(n-1)/2) different cases to check. Thus, to verify R(4,4)≤18 by brute force one would need to check 2^(14*(13)/2) = 2^91. That's about 10^27 cases. That just isn't doable. And the numbers get even more daunting for larger k.

There is a story that Erdos used to illustrate the comparative difficulty of this problem by constructing the hypothetical of an alien species landing on Earth and threatening to destroy us if we cannot give them the value of R(5,5). In that case, Erdos thought it would make sense to get all the world's mathematicians to drop what they are doing and work on the problem. However, if the aliens instead asked for R(6,6), we would be better served trying to fight the aliens. Calculating the exact values of R is really hard.

Now, one might not like the exact parameters of the problem. One might object that people don't simply know other people or not. They might have never met them before but have heard of them. Or they might recognize their faces but not have talked. Or they may have only talked briefly. Or they may be good friends. Or they may have known each other in the Biblical sense. So what happens if instead of just two possible colors between edges, we generalize to c colors? It turns out that the argument used above can be generalized to any finite number of colors. But the values get bad much faster. Generalizing the notation in the obvious sense, it is possible to show through not too difficult arguments that R(3,3,3) ≤17 and in fact it is known that R(3,3,3)=17. However, R(4,4,4) and R(3,3,3,3) are both unknown.

Some readers may also object to the party statement of the original problem. First, presumably for most parties the host knows everyone present. Does that simplify the problem? While one might guess that assuming that there is a vertex that has all blue edges would simplify things a lot, it turns out to be not substantially different than the original problem.

Another objection is that having a large number of people who don't know each other at a party doesn't make it less awkward, but rather more so. Consider a party with n people, n-1 of whom all know each other and the nth person who doesn't know anyone else. This could be very awkward for the nth person. Alternatively, the nth person could be some sketchy dude who is much older than everyone else and wandered in from the street (he probably has a mustache too). In that case, everything is much more awkward for the n-1.

There is however a practical solution: Every person knows at least one other person. And as long as you have at least one other person to talk to it substantially reduces awkwardness (even if it doesn't minimize it). So if we invite every single human being to the party everyone will have at least one friend to talk to. Therefore, I'm inviting the entire human race to a party next Saturday evening at 7 PM Eastern Standard Time. The venue is planet Earth. It is the third planet in the Sol system. If you have trouble finding it, just look for the big source of radio emissions that isn't the local star.

Please show up. If you don't, your absence might make it very awkward for someone who does come.


Anonymous said...
This comment has been removed by a blog administrator.
Anonymous said...

I don't even know what the first poster said, but i think I agree with them

Joshua said...

It was a spam link from a bot.