31 August 2016

Last week I wrote a bit about how I structured the game logic for the board game Splendor. I left off before talking about how to represent getting input from a player. Today I'll be talking about two possible approaches and why you might prefer one over the other.

Explicit State Machine

In addition to having fairly obvious game state (what's on the board, players' hands, etc), board games tend to have fairly explicit decision points. For example, Dominion has three phases to a turn: the Action phase, the Buy phase, and the Clean-up phase.

With a detailed enough description of phases, you can think of them as representing the game as a state machine. Each state is a time where the game needs input from a player, and then the transition function uses that input to advance the state until the next input is required, or the end of the game.

For Splendor specifically, there are only a few types of actions that you can ever take, and these fall into three classes based on when you take them:

Actions that you choose as your turn

  • You can take one chip of three distinct non-gold types (or fewer if there aren't three colors available).
  • You can take two chips of the same color.
  • You can buy a card that's either on the board or in your reserve.
  • You can reserve a card from the board.
  • You can reserve the top card of one of the decks.

When you have too many chips

  • You select chips to discard so that you have 10 chips remaining.

When you qualify for more than one noble at the same time

  • You select one of the nobles to gain.

So we could represent the player actions and possible wait-points as a couple of types:

data Action
    = Take3 (Maybe Color) (Maybe Color) (Maybe Color)
    | Take2 Color
    | Reserve CardId
    | ReserveTop Int
    | Buy CardId
    | Discard (Map ChipType Int)
    | SelectNoble NobleId

data ActionRequestType
    = TurnRequest
    | DiscardChipRequest Int
    | SelectNobleRequest

data ActionRequest
    = ActionRequest
        { _requestPlayer :: Int
        , _requestType :: ActionRequestType
        }

Then by adding an ActionRequest field to GameState, we have essentially written the state for our state machine. What's left is the transition function, which would be a function that looks like:

runAction :: Int -> Action -> GameState -> Maybe (GameResult, GameState)
runAction idx a =
    runStateT $ do
        curTurn <- use (currentRequest . requestPlayer)
        when (idx /= curTurn) illegal
        curReqType <- use (currentRequest . requestType)
        case curReqType of
            TurnRequest -> do
                case a of
                    Take3 c1 c2 c3 -> do
                        ...
                    Take2 c -> do
                        ...
                    Reserve cId -> do
                        ...
                    ReserveTop tier -> do
                        ...
                    Buy cId -> do
                        ...
                    _ -> illegal
            DiscardChipRequest n -> do
                case a of
                    Discard chips -> do
                        ...
                    _ -> illegal
            SelectNobleRequest -> do
                case a of
                    SelectNoble nId -> do
                        ...
                    _ -> illegal

This works, but it doesn't look at all like how the rules are usually presented. The TurnRequest case isn't too far, but generally we don't think of discarding chips or selecting nobles as distinct phases of a turn, but as things you do conditionally at the end of your turn.

Thinking about Dominion again, I mentioned that the turn is composed of three phases, but there's a plenty of game actions that feel similar to discarding chips and selecting nobles. For example, how would you implement the reaction part of Moat? The choice to reveal a reaction card isn't like the normal choices that you take during a turn. Instead, it's a choice you have when a different player plays an attack action. It's definitely possible to introduce a reaction phase and to pingpong back and forth between the action phase and the reaction phase, but it feels a little bit weird.

Again, we might think about the simple case of writing a local game where we can just suspend the program to wait for user input. We probably wouldn't be inclined to write code that looks like the mass of case statements above, and instead something like:

runTurn = do
    turnAction <- getTurnAction curPlayer
    case turnAction of
        Take3 c1 c2 c3 -> do
            ...
        Take2 c -> do
            ...
        Reserve cId -> do
            ...
        ReserveTop tier -> do
            ...
        Buy cId -> do
            ...
    chips <- use (playerStates . ix curPlayer . heldChips)
    when (sum chips > 10) $ do
        discards <- getDiscards (sum chips - 10) curPlayer
        ...
    matchingNobles <- computeMatchingNobles
    when (length matchingNobles > 1) $ do
        selectedNoble <- getSelectedNoble curPlayer
        ...

If you were writing code this way, the line of code you're on effectively functions as the state in the state machine representation. So while it makes the structure less explicit, it presents the structure in a way that is more along the lines of how we describe board game rules to other humans.

The Free Monad Approach

We can represent this kind of computation that might suspend to wait for a certain input as a data type.

data SplendorM r
    = Finished r
    | WaitingForTurn Int (TurnAction -> SplendorM r)
    | WaiitngForDiscard Int (Map ChipType Int -> SplendorM r)
    | WaitingForSelectNoble Int (NobleId -> SplendorM r)

instance Functor SplendorM where
    fmap f m =
        case m of
            Finished r -> Finished (f r)
            WaitingForTurn p cont -> WaitingForTurn p (fmap f . cont)
            WaitingForDiscard p cont -> WaitingForDiscard p (fmap f . cont)
            WaitingForSelectNoble p cont -> WaitingForSelectNoble p (fmap f . cont)

instance Applicative SplendorM where
    pure r = Finished r
    mf <*> mx = mf >>= (\f -> mx >>= f)

instance Monad SplendorM where
    mx >>= f =
        case mx of
            Finished x -> f x
            WaitingForTurn p cont -> WaitingForTurn p (\a -> cont a >>= f)
            WaitingForDiscard p cont -> WaitingForDiscard p (\d -> cont d >>= f)
            WaiitngForSelectNoble p cont -> WaitingForSelectNoble p (\n -> cont n >>= f)

The data type SplendorM r represents a computation within the game of Splendor that eventually yields a value of type r. Since we're operating within the game, it supports the ability to ask a player to take their turn, discard chips, or select a noble. It encodes this by including the rest of the computation as a continuation every time it pauses for input. So for example, the value WaitingForTurn p cont is asking for player p to take their turn, and then once they've provided a TurnAction, which we'll name a, the rest of the computation is given by cont a.

This allows us to write game logic code in more or less the sequential way that we thought about earlier:

getTurnAction :: Int -> SplendorM TurnAction
getTurnAction player = WaitingForTurn player Finished

getDiscards :: Int -> SplendorM (Map ChipType Int)
getDiscards player = WaitingForDiscard player Finished

getSelectedNoble :: Int -> SplendorM NobleId
getSelectedNoble player = WaitingForSelectNoble player Finished

runTurn player = do
    turnAction <- getTurnAction player
    case turnAction of
        Take3 c1 c2 c3 -> do
            ...
        Take2 c -> do
            ...
        Reserve cId -> do
            ...
        ReserveTop tier -> do
            ...
        Buy cId -> do
            ...
    chips <- use (playerStates . ix curPlayer . heldChips)
    when (sum chips > 10) $ do
        discards <- getDiscards player
        ...
    matchingNobles <- do
        ...
    when (length matchingNobles > 1) $ do
        selectedNoble <- getSelectedNoble curPlayer
        ...

This is really cool, and you could generalize the same construction for more classes of requests and their corresponding return types. This style of code feels very natural to write and to read because it generally follows the temporal progression of the game.

Which one to use?

In the end, I decided that I preferred the explicit state machine model in this case. While the free monad approach allows you to write more sequential code, using the state machine approach means that the statefulness of your game is all explicit in your data types. Additionally, the free monad approach has the downside that your intermediate states have functions in them. Function types don't serialize nicely, so if later on I decide that I want to be able to write down all the games currently in progress, restart the server, and then have the server load them all back up, that will be much easier with the explicit approach.

On the other hand, this feels a little bit disappointing for developing new games. When writing the Splendor server, I had the advantage of knowing exactly what the rules of the game are ahead of time. When you're developing games, you don't know exactly what the rules are going to be, and adding or removing various action steps should be quick.

So the ideal solution, to me, would be one where you can write code without needing to explicitly write down all of your state, but then there code builds up that explicit state machine for you so that you retain those benefits like being able to serialize the state and read it back. You could get halfway with a typeclass approach, where both a free monad type and a type including explicit states have matching instances. In that case, you could do your development with the free monad implementation, and then once the rules are solidified, swap it out for the explicit type.

On the other hand, that's still a lot of manual work to enumerate all of the possible actions that you might be taking, and the complexity will ramp up quickly, especially if the game responds differently in different cases. Sometime in the future, if I'm designing a board game or any other type of game that follows this discrete state machine pattern, I will consider the problem more and see if there's a way to get the best of both world.