• Music:

# Applying the Birthday Paradox to World Cup Group Scores

First, I should apologize. The seventh MLP puzzle I was promising hasn't been finished, and it won't be finished for some time yet. But I do have something to show today.

Alexandre Muñiz posted some combinatorial musings on the World Cup recently, and I had a stab at the main posed problem. Figured out some shortcuts, but it's quite combinatorially tricky.

The question is this: given that the soccer teams in each of several groups are evenly matched, and there is a certain likelihood of each game resulting in a tie, how likely is it that the groups will all have different scores?

The background for this question is on Alex's post, but I'll try to summarize here. Thirty-two teams compete in the FIFA World Cup, and for whatever reason (perhaps to reduce the impact of a single bad day on a team), the first round consists of eight round-robin matches among groups of four teams each. This means each group will have six games among themselves.

Now, after each game, each of the two teams receives 0, 1, or 3 points for its own group score, depending whether they lost, tied, or won against the other team. So after all six games within a group, 40 different group scores (such as 7-5-2-1) are possible, although not all equally likely. Alexandre asks how likely it is that all eight groups have different scores.

Of course, he makes the simplifying assumption that a team in any game will win, lose, or tie with probability 1/3 each, then calculates the answer with a computer program. This gives about 2/5 probability that all group scores will be different. But with that solved, he still has some follow-up questions, one of which is this:

What proportion of ties would maximize the probability that all group scores are different?

(I've been skirting the assumption that every game is a priori identical and symmetrical, but it's something I should point out. Otherwise, the 48 games might have elements of unidentifiable predetermination, whether that's because one team is so much better than the other, or because the teams cooperate to achieve a desired outcome.)

So if each game has an equal chance of going one way or the other, dependent only on how likely a tie is, how does one determine the likelihood of all group scores being different? There are 3^48 possible outcomes if all 48 games are treated as distinct, and I suspect Alexandre's program went through all such outcomes one by one. Pass.

I can save some work by noting that under the assumptions, every outcome with the same number of ties is equally likely. Which isn't to say that every group score with the same number of ties is equally likely; for example, 9-4-4-0, 6-6-4-1, and 6-4-4-3 require one tie each, but they should occur in a ratio of 1:2:3 with each other. So here is how in a given group, the 729 possible outcomes break down into the 40 possible group scores:

9630 24
9333 8
6660 8
6633 24
0 ties: 64

9611 12
9440 12
9431 24
7730 12
7640 24
7631 24
7433 24
6641 24
6443 36
1 tie: 192

9421 24
7711 6
7621 24
7540 24
7531 24
7441 36
7432 24
6541 24
6442 24
5443 24
4444 6
2 ties: 240

9222 4
7521 24
7431 24
7422 24
6522 12
5550 4
5541 24
5532 12
5442 24
4443 8
3 ties: 160

7322 12
5531 12
5522 12
5432 24
4 ties: 60

5332 12
5 ties: 12

3333 1
6 ties: 1

Total: 729

Now, if ties happen with probability either 0 or 1, there will be repeats among eight groups, but otherwise, there are 3003 ways the eight groups could be categorized by number of ties. I think the probabilities of each case occurring have to be calculated individually. But for any given case, the probability of having no repeats is fixed; just multiply the probabilities for how many groups have each number of ties.

Actually, that second part is also very tricky. One or zero groups having a given number of ties won't cause a repeat, but two groups both with 5 or 6 ties will always cause a repeat. If two groups have the same number of ties, the repeat probability is the sum of the squares of the relative individual probabilities; if it's three groups, the probability is three times the sum of the squares, minus two times the sum of the cubes. There are eleven group scores that involve two ties each; if all eight groups managed two ties each, the probability of a repeat would be huge, but very complicated to figure.

Well, that's about it for the main problem. I don't know the answer. But to make sure my methods work, I tried them in a spreadsheet on three groups of three teams each. The probability of no repeats when ties are 1/3 probability is about .53315 or exactly 1166/2187. This reaches a maximum of about .533468755 when ties have a probability of about .34308.
Tags:
• Post a new comment

#### Error

default userpic