ExtraTricky

For many competitive games, the standard competitive format is a series of swiss rounds with a single elimination bracket for the top performers from the swiss stage. Magic and Pokemon both primarily use this format for their tournaments, but there are some differences in execution. In Magic, the single elimination bracket is always exactly 8 players, while in Pokemon it can depend on the size of the tournament, and a cut to top 16 is most common.

Since I started to play competitive Pokemon, I've seen several people express frustrations at cutting to top 16, because this cut is usually in the middle of the X-2s (the competitors with two losses), and which X-2s make it in is then dependent on their tiebreakers (also known as "resistances"). The first tiebreaker is opponent match win percentage: if your opponents did better, then you are ranked higher. On paper, this sounds perfectly fine. If two people go X-2, then the person with the harder path to that X-2 should have priority in moving to the single elimination phase (although when people can earn multiple byes this is often the opposite of what happens).

However, even if this system is fair from an external viewpoint, it produces a pretty poor player experience. Players don't control who they play against, and in games like Magic and Pokemon you can lose even when you play perfectly, so your tiebreakers feel very random. A very reliable first order approximation is that your tiebreakers are best if your losses happen late in the tournament. So what if you happen to play against the eventual winner of the tournament in round 1? Take that one loss and then you more or less need to win every other game to make the top cut.

With a cut to top 16 like in Pokemon, it is usually the case that about half of the top cut went X-2 in swiss, and that those are about one third of the X-2s in the tournament. I'll present more precise numbers later on, but I want to give a sense of why this leads to a bad experience for the competitors. The fact that half of the top cut is X-2 makes it very difficult to seriously say "If you go X-2 and didn't make top cut then you didn't deserve it." There's very little difference between one person's X-2 record and another's, so it feels like if you didn't deserve to make top cut, then that's also true for half of the people who did. On the other hand, if you think of X-2 as being "good enough for top cut", then a huge number of players who are good enough aren't given the chance to compete, due to more or less random effects.

This situation is very different from the situation in Magic, where in a typical top 8, at least 6 of the competitors will be X-1 or better (actually, it's a bit different for in-person tournaments due to the option to intentionally draw so that a pair of opponents can both guarantee that they make top 8, but we'll focus on tournaments where such intentional draws are not allowed, such as the online tournaments). The X-2 slots are pretty much reserved for the people that were undefeated up until the last two rounds and then lost both of those, so it feels much less like a coin flip. With Magic's system, it feels very reasonable to say that you need to go X-1 to make top 8, and that X-2 isn't good enough.

That covers the situation as it is. Now I want to talk about what you should be considering in order to make a swiss tournament with a good player experience. It is very easy to get this wrong, and so it's important to be careful when choosing the number of rounds and size of the top cut for such a tournament.

Cutting to Top 8

Let's start with a tournament system like Magic's where we always cut to top 8. In this case, the only things we need to consider are the number of players and the number of rounds in the tournament. We want to make sure that the tournament has enough rounds that we have a good sense of who should make top 8, but not so many that the tournament takes too long.

About four years ago, Magic's round cutoffs looked something like this:

17-32 players: 5 rounds
33-64 players: 6 rounds
65-128 players: 7 rounds
129-226 players: 8 rounds
227-409 players: 9 rounds
410-744 players: 10 rounds
745-1365 players: 11 rounds

In practice, the 10 and 11 round thresholds mainly got met for online events, and I was unable to find archives for the webpage where these round thresholds were published. Here's a sample tournament page for an in-person tournament that has approximately these thresholds up to 10 rounds, though. How did these numbers get chosen? There are two goals in mind:

At most one player should be undefeated. This explains the power-of-two progression up to 7 rounds.
There should be at most 8 X-1s or better (so they all make top 8). This determines the numbers for 8+ rounds.

For example, let's look at an 8 round tournament. The tournament structure is independent of what game is actually being played, so let's imagine we're playing a tournament of coin tosses. Then to be undefeated, we'd need to win 8 coin tosses in a row, which is a feat that \( \frac{1}{2^8} \) of players will accomplish. To go X-1, we need to win 7 of the 8 coin tosses. There are 8 possible coin tosses we could lose, and each of those sequences has a \( \frac{1}{2^8} \) chance of happening. Overall, \( \frac{9}{2^8} \) of the players should be X-0 or X-1. In a tournament of 226 players, that means we expect \( \frac{9}{2^8} \cdot 226 = 7.945 \) players to be X-1 or better.
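The arithmetic above can be checked with a minimal Python sketch of the coin-toss model. The function name is made up for illustration, and it assumes every match is a fair coin toss with no draws, byes, or drops.

```python
from math import comb

def expected_with_at_most(players, rounds, max_losses):
    """Expected number of players finishing with at most `max_losses` losses
    when every match is a fair coin toss (no draws, byes, or drops)."""
    # Count the win/loss sequences that contain 0..max_losses losses.
    sequences = sum(comb(rounds, losses) for losses in range(max_losses + 1))
    # Each specific sequence has probability 1 / 2**rounds.
    return players * sequences / 2 ** rounds

# 226 players, 8 rounds: (1 + 8) / 2**8 * 226
print(round(expected_with_at_most(226, 8, 1), 3))  # 7.945
```

Passing `max_losses=0` gives the expected number of undefeated players, which is exactly 1 for a 256 player, 8 round tournament.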

Similarly, the expected number of X-1 or betters for a 9 round 409 player tournament is 7.988, for a 744 player 10 round tournament it will be 7.992, and for a 1365 player 11 round tournament it will be 7.998. So far, sounds good. And usually these numbers will be fine. However, then there was a very large online tournament (I believe a MOCS) where there were 9 people at 9-1 or better, so one person got left out of top 8 due to their tiebreakers.

This sparked a decent amount of outrage in the Magic Online community. Unfortunately I haven't been able to find discussions of the event since it was several years ago, but I think you can imagine how poor it looks for a tournament to have that happen. What do you say to that person that got left out? "X-1 isn't good enough" is an absurd statement, as only one person did better than that. The only thing that you can really say is that the tournament structure screwed them out of top 8, and there should have been more rounds.

When I heard about this, I ran some simulations of swiss tournaments to see how likely it was that a 9-1 got left out of top 8. Here's what the chances look like for a 744 player 10 round tournament: (Due to there sometimes being an odd number of players at a particular record, there will be people "paired up" and "paired down" so depending on who wins, you can get different distributions at the end of the tournament. I've assumed here that in such a situation both players are equally likely to win).

41.94%: 1 10-0, 7 9-1s, no 9-1s miss cut.
18.26%: 1 10-0, 6 9-1s, 1 8-2, with 32 8-2s missing cut.
18.26%: 7 9-1s, 1 8-2, with 31 8-2s missing cut.
5.27%: 1 10-0, 6 9-1s, 1 8-2, with 31 8-2s missing cut.
2.78%: 7 9-1s, 1 8-2, with 30 8-2s missing cut.
2.56%: 1 10-0, 6 9-1s, 1 8-2, with 30 8-2s missing cut.
2.56%: 7 9-1s, 1 8-2, with 29 8-2s missing cut.
2.56%: 1 10-0, 5 9-1s, 2 8-2s, with 30 8-2s missing cut.
2.56%: 6 9-1s, 2 8-2s, with 29 8-2s missing cut.
1.76%: 1 10-0, 7 9-1s, with 1 9-1 missing cut.
0.88%: 8 9-1s, no 9-1s miss cut.
0.29%: 1 10-0, 5 9-1s, 2 8-2s, with 31 8-2s missing cut.
0.29%: 6 9-1s, 2 8-2s, with 30 8-2s missing cut.

The 1.76% is the killer. In roughly 1 out of 57 tournaments of that size, someone who is 9-1 will miss out on the top 8, meaning that these round thresholds can very realistically cause a tournament to miss the goals above.

Magic Online changed these round thresholds in 2013. I do not know the exact date of the change because it was put into place quietly, but according to my emails with my friends I noticed it on August 25, 2013. The round thresholds from then are still in effect, and they successfully guarantee that no X-1s miss top 8 (up through several thousand players). Here are the new round thresholds:

17-32 players: 5 rounds
33-64 players: 6 rounds
65-128 players: 7 rounds
129-212 players: 8 rounds
213-384 players: 9 rounds
385-672 players: 10 rounds
673-1248 players: 11 rounds
1249-2272 players: 12 rounds
2273+ players: 13 rounds

When we're closer to the lower end of these brackets, however, we do run into the luck-feeling tiebreakers. In a 385 player tournament with 10 rounds, in more than 90% of cases we will have between 15 and 18 X-2s, with 4 or 5 of them making top 8. So again we run into the situation where half of the top 8 has the same record as two to three times as many people that did not make it. These brackets were very reasonably constructed based on the more important requirement that no X-1 fail to make top 8, so it serves as a good illustration of how hard it is to make a top cut feel non-random at all tournament sizes.

Different top cut sizes

Now let's take a look at how Pokemon does top cuts. Here's how the number of rounds and top cut sizes were determined for the recent Sheffield regionals:

33-64 players: 6 rounds, cut to top 8
65-128 players: 7 rounds, cut to top 8
129-226 players: 8 rounds, cut to top 8
227-256 players: 8 rounds, cut to top 16
257-409 players: 9 rounds, cut to top 16
410-512 players: 9 rounds, cut to top 32
513+ players: 10 rounds, cut to top 32

Well, right off the bat there's one big cause for alarm. That 226 person tournament that cuts to top 8 after 8 rounds? That's exactly what Magic used to do, until the threshold got changed to 212. In fact, in a 226 person tournament with 8 rounds, there will be 9 people with a 7-1 or better record 2.39% of the time. So there's definitely room for improvement. We can see that the number of rounds is always set so that there is 0 or 1 undefeated player, and then the size of top cut is expanded so that it will cut in the middle of X-2. Let's take a look to see how this pans out.

Other than the snafu at 8 rounds, X-1s are always guaranteed to make top cut. The most interesting thing is the number of X-2s that make top cut out of the total number of X-2s. Here's a brief summary:

For a 227 player tournament, there are 23-25 X-2s, and 7-10 of them make top 16.
For a 256 player tournament, there are 28 X-2s, and 7 of them make top 16.
For a 257 player tournament, there are 17-19 X-2s, and 10-12 of them make top 16.
For a 409 player tournament, there are 26-30 X-2s, and 7-10 of them make top 16.
For a 410 player tournament, there are 26-30 X-2s, and 23-26 of them make top 32 (but there is always an X-2 that misses).
For a 512 player tournament, there are 36 X-2s, and 22 of them make top 32.

These round and top cut determinations almost seem like they're conspiring to make the X-2 situation as random as possible. In most of the tournament sizes, you have somewhere between a \( \frac{1}{4} \) and \( \frac{2}{3} \) chance to make top cut with X-2, though notably in a 410 player tournament almost every X-2 makes it with a few that get screwed.
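For a rough feel of where those fractions come from, here is a small Python sketch that uses expected counts from the same coin-toss model. It is an approximation under stated assumptions (no draws, byes, or drops, and a ratio of expected values rather than a true probability), and the function name is hypothetical. It is not the simulation used for the numbers above.

```python
from math import comb

def x2_cut_odds(players, rounds, cut):
    """Rough share of X-2 players who make a top-`cut`, using expected counts
    from a fair-coin model (no draws, byes, or drops)."""
    scale = players / 2 ** rounds
    x1_or_better = (1 + rounds) * scale   # expected X-0s plus X-1s
    x2 = comb(rounds, 2) * scale          # expected players with exactly two losses
    # Cut spots left over for X-2s, clamped to the range [0, x2].
    slots = min(max(cut - x1_or_better, 0.0), x2)
    return x2, slots, slots / x2

# (players, rounds, cut) rows taken from the Sheffield table above
for players, rounds, cut in [(227, 8, 16), (256, 8, 16), (257, 9, 16),
                             (409, 9, 16), (410, 9, 32), (512, 9, 32)]:
    x2, slots, share = x2_cut_odds(players, rounds, cut)
    print(f"{players:>3} players: {x2:5.1f} X-2s, {slots:4.1f} slots, share {share:.2f}")
```

At 256 and 512 players, where every record group pairs off evenly, the expected counts are whole numbers: 7 of 28 X-2s and 22 of 36 X-2s make the cut.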

Contrast this to Magic. In Magic, we saw that near the bottom of the range of players for a given number of rounds, you'd have about a \( \frac{1}{3} \) chance to make top 8 being X-2, while near the top of the range you'd have more like a \( \frac{1}{15} \) chance. That situation encourages the idea that making the top cut with an X-2 record means that you got lucky, and that you were expected to not make it.

On the other hand, the Pokemon system strongly suggests that if you go X-2 you should have to roughly win a coin flip to make top cut. What kind of nonsense is that? It just leads to tournaments feeling arbitrary and even more luck based than the game they're based around. The idea of increasing top cut instead of the number of rounds in order to ensure that all X-1s make top cut is interesting, but in my opinion the current implementation is extremely poor, and the number of rounds for top 16 cuts needs to be much more finely tuned.

Here are my suggestions for how to properly run swiss. First, decide on what sort of record you want to expect from the majority of your top cut. For events that run over two days, like Magic Grands Prix, you can run many rounds and have most X-3s, and maybe some X-4s, make it. For events that are a single day, you're probably looking at the majority of top cut either being X-1s or X-2s, depending on how you structure things.

If you want top cut to be mostly X-1s, then I believe that Magic's system is quite good. It ensures that every X-1 player will make cut, with a few select X-2s to fill out the single elimination bracket. You could cut directly to X-1s, but then some players will get byes in the first round of single elimination more or less at random, which I believe would feel worse than having a lucky X-2 player play against them for the spot.

If you want top cut to be mostly X-2s, however, then unless you're ready to increase the number of rounds to get a good cut near a power of two, I would say cut directly to X-2s. Seed the single elimination bracket by tiebreakers. Byes will still get distributed somewhat randomly, but in this case all of the X-1s are guaranteed a bye (the "play in" round is entirely X-2s), and every X-2 has a chance to fight for the championship, even if they have a slightly tougher bracket.
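As a sketch of what cutting directly to X-2s could look like, the hypothetical helper below sizes the bracket and the play-in round. It assumes seeds are ordered by record and then tiebreakers, that byes go to the top seeds, and that the bracket is the smallest power of two that fits everyone. Whether every X-1 actually gets a bye depends on the counts, so the helper reports that too.

```python
def play_in_plan(x1_or_better, x2s):
    """Plan a bracket that admits every X-2 or better (hypothetical helper).

    Seeds are assumed ordered by record, then tiebreakers, so X-1s and better
    occupy the top seeds. Byes go to the top seeds; everyone else plays in.
    """
    qualifiers = x1_or_better + x2s
    bracket = 1
    while bracket < qualifiers:
        bracket *= 2                      # smallest power of two that fits everyone
    byes = bracket - qualifiers           # top seeds skip the play-in round
    play_in_matches = (qualifiers - byes) // 2
    return {
        "bracket_size": bracket,
        "byes": byes,
        "play_in_matches": play_in_matches,
        "all_x1_get_byes": byes >= x1_or_better,
    }

# 256 players after 8 rounds in the coin-toss model: 9 X-1 or better, 28 X-2s
print(play_in_plan(9, 28))
```

In that example 37 players qualify, the top 27 seeds get byes, and the bottom 10 seeds (all X-2s) play 5 play-in matches to fill out a round of 32.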

This becomes more complicated when draws are commonplace, such as in paper Magic, but the general ideas still hold, and the principles I've outlined here should extend to those sorts of tournaments as well. I've put the code to compute these probabilities here on GitHub if you are interested in looking into this yourself.