## How to win at the game Pickomino

Pickomino (known as
*Regenwormen* in Dutch, *Heckmeck* in German) is a dice game in which players
try to get as many worms as possible. It is largely a game of chance, but there
are some tactics involved, which always leaves me wondering: did I make the
optimal choice? Only one way to find out: write an AI player that knows how to
play optimally.

The code is up on GitHub. I wrote this in C++ because I wanted to get some practice with C++11, and also because I once tried to do a Python version that turned out to be too slow. Getting back into C++ after years of neglect turned out to be interesting, but that’s a topic for another day.

### The rules

The game contains a set of 16 tiles, each with a point value and up to 4 worms drawn on them. They all start out up for grabs in the middle of the table:

The player who gathers the most worms wins.

The point value on the tile shows how many points you need to roll with the dice in order to take that tile. There are 8 dice, and instead of a 6 they have a picture of a worm on them, which is worth 5:

You repeatedly roll the dice, initially all 8 of them. After a roll, you can choose which “side” of the dice you want to keep, for example all fives, all threes or all worms. You set all these aside, and keep a running count of the number of points they’re worth, where a worm counts for 5 points. But that sum is only worth something if it contains at least one worm. You cannot take a side that you’ve already taken before.

You then roll the remaining dice again, and continue in this way until you decide to quit. You cannot take a side which you already set aside before, so it might be wise to pass on a 5 in case you roll two 5s later.

After taking some dice, you can decide whether to quit. When you quit, you can take a tile from the table worth at most the point value of the dice you have collected (again, as long as you have at least one worm). You put that tile on top of your stack; ordering is important here.

You cannot quit after you’ve rolled, so if you end up unable to take any dice (e.g. you rolled 3 and 4, but you have already taken 3s and 4s), you lose that turn and have to return the top tile of your stack back to the table. (Additionally, the highest tile on the table gets turned over and is out of the game. This makes sure that the game will end at some point, but isn’t very important for the rest of this article.)

Instead of taking a tile from the centre of the table, it’s also possible to steal other people’s tiles; more about that later.

### Breaking up the problem

There are some interesting decisions to be made during your turn. For example, when your initial roll contains three 5s and two worms, which do you take? The 5s are worth more, but you eventually need the worm in order to be allowed to take any tile. Or, suppose you’ve ended up with a point value of 23, enough to take a tile, and three dice remain, do you roll again to try for a tile with more worms on it, or do you quit?

We can decompose the problem as follows. For each possible point value (from 0 all the way to 8x5 = 40 points, we can calculate how many worms you’d get. In the first turn, the table looks like this:

```
points 0 .. 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40
#worms 0 .. 0 1 1 1 1 2 2 2 2 3 3 3 3 4 4 4 4 4 4 4 4
```

However, the table changes when tiles have been taken from the table. And, more
interestingly, also when you have some tiles: when you can’t take anything
(i.e. 0 worms), you actually lose your topmost tile, so you get *negative*
worms. For example, once the tiles marked `*`

have been taken, and your topmost
tile has 2 worms on it, the table would look like this:

```
taken * * * * * * * *
points 0 .. 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40
#worms -2 .. -2 -2 -2 -2 1 1 1 1 2 3 3 3 3 3 4 4 4 4 4 4 4
```

Note how some values have been reduced, e.g. 27, because this tile is no longer available and the next lowest tile has only 1 worm on it.

This table provides us with a mapping from the number of dice points to the number of worms we get if we land exactly on that number of dice points. So from here on out, we can stop worrying about the existence and location of tiles, and just try to aim for the number of dice points that optimizes the number of worms as listed in this table.

### Recursion

My initial approach to come up was naive recursion. We enumerate all possible dice rolls along with their probability (calculated using the multinomial coefficient). For each possible roll, we consider all possible choices we can legally make: take all 1s, take all 2s, and so on. For each of these choices, we add up our points and look at the table above; this is the expected value if we were to quit now. We also consider the possibility of rolling again, by entering the recursive calculation from the top with the reduced set of dice.

How fast would this be? These are the possible rolls with 1 through 8 dice:

```
#dice 1 2 3 4 5 6 7 8
#rolls 6 21 56 126 252 462 792 1287
```

That doesn’t look so bad! However, consider that in the worst case, we remove
only one die each roll; so we need to multiply all these numbers together to
produce an upper bound on how many cases we need to crunch through. That number
is 105505563669682176, or about 10^{17}; far too large to crunch
through for a desktop computer in reasonable time.

But there is some hope still. There’s no way you can remove only one die each time, because there are only 6 sides and so you can remove dice up to 6 times, not 8. Also, especially in the first roll, you will often remove more than one die at once. Furthermore, many branches of the tree get pruned because you lose and there’s no way to continue.

My hope was that this would make the probability tree small enough to be computable in a second or two, but after coding it up it quickly became clear that this wasn’t the case. With 6 dice, the answer took about a second to appear, with 7 it took longer than I cared to wait for, and I didn’t even try with all 8 of them. Back to the drawing board.

### Dynamic programming

When confronted with a recursive problem that is too slow, many a computer scientist will say “I know, I’ll use dynamic programming!” I believe this is technique is quite mis-named; the easiest way to implement it is just as the same recursive algorithm, but with memoization. For each recursive call, we store the eventual answer, and when we get that call again with the same arguments, we just return the answer from the cache.

In this case, that works well, because the only thing that represents our “state” is the (multi)set of dice we have already taken. From this, we know both our current score and the number of remaining dice, so there’s no need to keep track of those explicitly. Moreover, it doesn’t matter in which order those dice were taken, so we cut out a lot of repeated calculations. Our “cache” becomes a mapping from the dice taken to the expected number of worms in that situation.

With this approach, the program became fast enough that two AI players could finish five entire games against each other per second. Here’s a typical turn that the bot would take:

```
Alice's turn
"When rolling, I expect 1.13115 worms."
"When quitting, I expect -3 worms."
Alice rolled 11112345
"When taking 1, I expect -2.77323 worms."
"When taking 2, I expect 0.119708 worms."
"When taking 3, I expect 0.214761 worms."
"When taking 4, I expect 0.0548401 worms."
"When taking 5, I expect -0.0356373 worms."
Alice took 3, now has 3
"When rolling, I expect 0.214761 worms."
"When quitting, I expect -3 worms."
Alice rolled 11335ww
"When taking 1, I expect -1.90285 worms."
"When taking 5, I expect -1.05044 worms."
"When taking w, I expect 0.246571 worms."
Alice took w, now has 3ww
"When rolling, I expect 0.246571 worms."
"When quitting, I expect -3 worms."
Alice rolled 12234
"When taking 1, I expect -0.632553 worms."
"When taking 2, I expect -0.691101 worms."
"When taking 4, I expect -0.537772 worms."
Alice took 4, now has 34ww
"When rolling, I expect -0.537772 worms."
"When quitting, I expect -3 worms."
Alice rolled 145w
"When taking 1, I expect -0.503086 worms."
"When taking 5, I expect -0.532793 worms."
Alice took 1, now has 134ww
"When rolling, I expect -0.503086 worms."
"When quitting, I expect -3 worms."
Alice rolled 23w
"When taking 2, I expect -1.47222 worms."
Alice took 2, now has 1234ww
"When rolling, I expect -1.47222 worms."
"When quitting, I expect -3 worms."
Alice rolled 34
Alice died
Alice scored 0 but had no worms
Alice returned 30|WWW to the table
Turning over highest tile, 32|WWW
Remaining tiles: 25|WW 30|WWW 31|WWW
```

And how does that look against a mere human? Well, here are the scores of one fairly representative game:

```
Alice 10 26|WW 25|WW 21|W 27|WW 22|W 23|W 24|W
Thomas 3 29|WWW
```

I’m not willing to play for long enough to get a statistically significant result, but it certainly looks like the bot can beat me.

### What about stealing?

As alluded to, you can also steal worms from other players. You can only steal
the top tile of their stack, and only if you land *exactly* on that number. But
there’s a problem: if our AI just tries to maximize the number of worms it
gets, how do we account for the number of worms that the other player loses?
Surely that must count for something?

In theory, it should be preferred to steal from players who are ahead of you, so as to overtake them and increase your chance of winning. Stealing from players who are behind is less useful. Modelling this is not trivial, and would probably require simulating an entire game instead of just one turn.

So I chose to make a simplification, based on the following observation. Suppose you’re playing a two-player game against Alice. Stealing 2 worms from Alice increases your score by 2, while decreasing Alice’s score by 2. The difference between Alice and you is increased by 4, so this is essentially the same as taking a tile from the table worth 4 worms. If we have a chance to steal from Alice, we should consider that worth double the points.

When Bob is also joining the game, stealing 2 worms from Alice still reduces
Alice’s score, but it doesn’t reduce Bob’s score. On average, their scores are
reduced by only 1, so we should consider this move worth 3 points. In general,
in a `p`

-player game, stealing `w`

worms from another player is worth
`w + w / (p - 1)`

points to us.

### Observations

I made the bot print out its “thoughts” while it ran, and it gave me some interesting insights into the game. I wonder if the game designers were aware of all the probabilities and expected values. Probably.

The first thing to note is that the expected value at the start of the game, before the first roll, is about 1.6 worms. This matches my observations in real games. The tiles with 3 and 4 worms are there to lure you, but they are not often taken.

When you have a high-value tile at the top of your stack, it is quite common for the expected value to go negative. In effect, the only winning move is not to play. Sounds familiar?

A similar thing happens when the lower-valued tiles have disappeared from the table. You frequently see this in real games when people play cautiously: first everyone happily gathers up the low tiles, then they all lose them because only high tiles are left.

The effect of the way I modelled stealing appears to be a bot that plays extremely aggressively, especially in a two-player game. More often than not, whenever I managed to get a tile, the bot would steal it from me in its next turn. That’s what I get for making it too clever!

### A heuristic

After making a bot that plays a perfect game, I wondered if it could help me improve my strategy as a human. What I wanted was a simple heuristic, no more than a few rules of thumb, that would result in near-optimal play. Would such a thing be possible?

I called this second bot Bob. As a first attempt, I gave him the following strategy:

- Take the highest die side that you can (favour worms over 5s, 5s over 4s, and so on).
- Quit as soon as you can take a tile.

Out of 100 games:

```
Alice wins 79 times
Bob wins 18 times
There were 3 ties
```

You don’t need a statistical analysis to see that Bob isn’t quite so good. One of his problems is that when he rolls three 5s and one worm, he takes the worm. So let’s try this instead:

- Take the die side that contributes the most points (favour two 4s over one worm, two 3s over one 5, and so on).
- Quit as soon as you can take a tile.

Now Alice’s victory looks slightly less crushing already:

```
Alice wins 71 times
Bob wins 24 times
There were 5 ties
```

But while in the previous version Bob would greedily take worms, now he might end up without them, so unable to take any tile. Let’s try to fix that:

- Take worms if you can.
- Otherwise, take the die side that contributes the most points.
- Quit as soon as you can take a tile.

This actually didn’t make much of a difference:

```
Alice wins 70 times
Bob wins 26 times
There were 4 ties
```

The cause is that Bob will now take worms too greedily. Maybe he should try to push his luck while he still has many dice, but start aiming for worms when the situation becomes dire:

- On or after the third roll, take worms if you can.
- Otherwise, take the die side that contributes the most points.
- Quit as soon as you can take a tile.

This seems to help a bit:

```
Alice wins 62 times
Bob wins 34 times
There were 4 ties
```

That’s how good I managed to get it with a heuristic of this shape. The obvious remaining problem is the last rule, which makes Bob far too conservative, especially in the case he has nothing to lose. To factor this in, we need to look at the number of worms we stand to lose. Here’s what I tried:

- On or after the third roll, take worms if you can.
- Otherwise, take the die side that contributes the most points.
- Quit as soon as the number of worms on top of your stack plus the number of worms you’d get from quitting is at least 2.

This will make Bob more risk-seeking when he has less to lose. Does it work?

```
Alice wins 73 times
Bob wins 22 times
There were 5 ties
```

Not quite.

While I myself play the game, I sometimes calculate the probability of losing on the next roll, based on the number of dice sides taken and the number of remaining dice. But I decided that goes too far for a simple heuristic. There may be some simple way to estimate it, but I didn’t look into that.

I’d love to try some kind of genetic algorithm to evolve small programs that become increasingly better, and can still be written down in a sentence or two; unfortunately, I don’t have time for that either.

If you have ideas for heuristics to try, post them in the comments!

### Does it matter who starts?

In the above simulations, I took care to alternate the starting turn between the two bots, in case the starting player has an advantage. But does it really matter? Here, I had two identical bots play 1000 games against each other, both with the optimal strategy, but Alice always plays first:

```
Alice wins 501 times
Bob wins 456 times
There were 43 ties
```

It seems that Alice might have a slight advantage, but I’m not sure that this result is statistically significant. However, it would make sense: Alice gets either the same number of turns, or one more than Bob, so she should have a better chance at winning.

### A game of chance?

While trying to find such a heuristic, I also tried to another question: is Pickomino essentially a game of chance, or of skill? In other words, to what extent do your choices (as long as they are reasonable) affect the eventual outcome? If the outcome depended little on the chosen strategy, I could classify it as a game of chance. But if a better player could consistently beat a worse one, skill would have been proven to make a difference.

Given that a very simple heuristic managed to beat a perfect AI a third of the time, I’m willing to bet four worms on the hypothesis that a slightly better tweaked, but still very simple, heuristic would beat it nearly half of the time. Any player can be told that heuristic in ten seconds and will be able to play a near-perfect game. Therefore, I would like to conclude that Pickomino is probably a game of chance.