11 June 2014

Imagine you're working on a project and there are many things that can go wrong, and if any of them goes wrong then the project fails. How can you maximize the chance of success? If there are \( n \) possible failure modes each with probabilty \( p \), then the overall probability of failure is somewhere between \( p \) and \( np \). If all the failures were independent, then the total probability of failure would be \( 1 - (1-p)^n \), which for small \( p \) is on the \( np \) side of the spectrum. To maximize the chance of success, we'd want to be on the \( p \) end of the spectrum, which means that when failure happens, everything that could go wrong, does. In other words, we want the failure modes to be correlated with each other.

A great illustration of this idea is this reasonably well-known puzzle.

There are 100 prisoners numbered 1 through 100. They are led to a room with 100 boxes, each containing a slip of paper with one of their numbers on it. Each number is in exactly one box. Each prisoner goes into the room and opens 50 boxes, then returns the room to its original state and leaves. The prisoners cannot communicate during this process but can devise a strategy beforehand. They are set free if every prisoner opens the box with his or her number in it. How can they maximize their chance of going free?

The solution is quite beautiful. Prisoner \( i \) starts by opening the \( i \)th box, and then when they find the number \( j \), they move to the \( j \)th box and repeat. In this way, prisoner \( i \) will find their number exactly if \( i \) is in a cycle of length at most 50. Notice that this means that if one prisoner fails to find their number, that means that every other prisoner in the same cycle will also fail, so when the prisoners lose, it is because at least 51 of them failed to find their number! By correlating their failure modes in this way, even though each of the 100 prisoners only has a 50% chance of succeeding individually, their collective chance of success is still over 30%!

I used this idea in a "real world" problem a few months ago. The problem arises in the context of speedrunning Pokemon Red. Very early on, we want to catch a level 3 or 4 NidoranM. How can we find one as fast as possible? There are two things that need to happen for us to succeed. First, we need to find a wild pokemon. Second, that wild pokemon needs to be a NidoranM. If these two events were independent, then there wouldn't be anything we can do, except to find the fastest way to encounter wild pokemon. However, that ends up not being the case.

The game maintains two random bytes H_RAND1 and H_RAND2. They are updated with the following routine:

GenRandom_: ; 13a8f (4:7a8f)
; generate a random 16-bit integer and store it at $FFD3,$FFD4
    ld a,[rDIV]
    ld b,a
    ld a,[H_RAND1]
    adc b
    ld [H_RAND1],a
    ld a,[rDIV]
    ld b,a
    ld a,[H_RAND2]
    sbc b
    ld [H_RAND2],a
    ret

rDIV is a hardware register that increases by one every 256 clock cycles. Fewer than 256 clock cycles pass between the two times that rDIV is loaded in this segment, so the value of H_RAND1 + H_RAND2 stays almost constant, only changing if rDIV changed in that region or if the addition sets the carry flag. In the pokemon world, H_RAND1 + H_RAND2 is known as the DSum.

Now let's look at the code for generating an encounter

    ; b contains the current encounter rate
    ld a, [H_RAND1] ; $FF00+$d3
    cp b
    jr nc, .asm_13912
    ld a, [H_RAND2] ; $FF00+$d4
    ld b, a
    ld hl, $7918
.asm_138d0
    ld a, [hli]
    cp b
    jr nc, .asm_138d7
    inc hl
    jr .asm_138d0

This code is less self-explanatory. First, the game loads H_RAND1 and decides to give you an encounter if it is below the encounter rate (so an encounter rate of 25 corresponds to a 25/256 chance that you get an encounter, for example). Then it chooses what type of encounter you get with the value of H_RAND2. The bytes at $7918 are alternately a threshold for the slot and an offset into the list of species and level that you can find in the grass. The end result is that H_RAND2 maps to one of ten encounter slots in the following way:

Slot H_RAND2
1 0-50
2 51-101
3 102-140
4 141-165
5 166-190
6 191-215
7 216-228
8 229-241
9 242-252
10 253-255

In our case, the level 3 NidoranM is encounter slot 2, the level 4 is slot 4, and the encounter rate is 25. Now consider what happens if the DSum is 220. If we get an encounter, then H_RAND1 is between 0 and 24, so H_RAND2 is therefore between 196 and 220. That means that we can only possibly get encounter slots 6 or 7, neither of which is the NidoranM that we want!

Of course, when playing the game we don't know what the DSum is exactly, but we can turn this idea on its head to figure it out quite closely. First, we walk in the grass until we get an encounter. Then since we know H_RAND1 is between 0 and 24 and the type of pokemon we find tells us a range for H_RAND2, that gives us a range for the DSum.

After measuring how much the DSum changes over the course of running from an encounter and how quickly it changes over time, to find the desired NidoranM, we do the following procedure.

  1. Get an encounter
  2. If it's a level 3 or 4 NidoranM, we're done.
  3. Otherwise, run from it and start alternately walking in and out of the grass, in a pattern that depends on what encounter we got.
  4. Repeat

Alternatively, rather than walking out of the grass you can simply stop moving, but since each step takes the same amount of time, continuously walking makes for a convenient timer. The result is that since getting an encounter and running from it is significantly longer than the time it takes for the DSum to cycle back around, by using this idea the time it takes to find an appropriate NidoranM was drastically reduced.