## Bayesian Dice

This article is available in a Jupyter notebook: click here to run it on Colab.

I’ve been enjoying Aubrey Clayton’s new book *Bernoulli’s Fallacy*. The first chapter, which is about the historical development of competing definitions of probability, is worth the price of admission alone.

One of the examples in Chapter 1 is a simplified version of a problem posed by Thomas Bayes. The original version, which I wrote about here, involves a billiards (pool) table; Clayton’s version uses dice:

Your friend rolls a six-sided die and secretly records the outcome; this number becomes the target

T. You then put on a blindfold and roll the same six-sided die over and over. You’re unable to see how it lands, so each time your friend […] tells youonlywhether the number you just rolled was greater than, equal to, or less thanT.Suppose in one round of the game we had this sequence of outcomes, with G representing a greater roll, L a lesser roll, and E an equal roll:

G, G, L, E, L, L, L, E, G, L

Clayton,Bernoulli’s Fallacy, pg 36.

Based on this data, what is the posterior distribution of *T*?

## Computing likelihoods

There are two parts of my solution; computing the likelihood of the data under each hypothesis and then using those likelihoods to compute the posterior distribution of *T*.

To compute the likelihoods, I’ll demonstrate one of my favorite idioms, using a meshgrid to apply an operation, like `>`

, to all pairs of values from two sequences.

In this case, the sequences are

`hypos`

: The hypothetical values of*T*, and`outcomes`

: possible outcomes each time we roll the dice

hypos = [1,2,3,4,5,6] outcomes = [1,2,3,4,5,6]

If we compute a meshgrid of `outcomes`

and `hypos`

, the result is two arrays.

import numpy as np O, H = np.meshgrid(outcomes, hypos)

The first contains the possible outcomes repeated down the columns.

O

array([[1, 2, 3, 4, 5, 6], [1, 2, 3, 4, 5, 6], [1, 2, 3, 4, 5, 6], [1, 2, 3, 4, 5, 6], [1, 2, 3, 4, 5, 6], [1, 2, 3, 4, 5, 6]])

The second contains the hypotheses repeated across the rows.

H

array([[1, 1, 1, 1, 1, 1], [2, 2, 2, 2, 2, 2], [3, 3, 3, 3, 3, 3], [4, 4, 4, 4, 4, 4], [5, 5, 5, 5, 5, 5], [6, 6, 6, 6, 6, 6]])

If we apply an operator like `>`

, the result is a Boolean array.

O > H

array([[False, True, True, True, True, True], [False, False, True, True, True, True], [False, False, False, True, True, True], [False, False, False, False, True, True], [False, False, False, False, False, True], [False, False, False, False, False, False]])

Now we can use `mean`

with `axis=1`

to compute the fraction of `True`

values in each row.

(O > H).mean(axis=1)

array([0.83333333, 0.66666667, 0.5 , 0.33333333, 0.16666667, 0. ])

The result is the probability that the outcome is greater than *T*, for each hypothetical value of *T*. I’ll name this array `gt`

:

gt = (O > H).mean(axis=1)

The first element of the array is 5/6, which indicates that if *T* is 1, the probability of exceeding it is 5/6. The second element is 2/3, which indicates that if *T* is 2, the probability of exceeding it is 2/3. And do on.

Now we can compute the corresponding arrays for less than and equal.

lt = (O < H).mean(axis=1)

array([0. , 0.16666667, 0.33333333, 0.5 , 0.66666667, 0.83333333])

eq = (O == H).mean(axis=1)

array([0.16666667, 0.16666667, 0.16666667, 0.16666667, 0.16666667, 0.16666667])

In the next section, we’ll use these arrays to do a Bayesian update.

## The Update

In this example, computing the likelihoods was the hard part. The Bayesian update is easy. Since *T* was chosen by rolling a fair die, the prior distribution for *T* is uniform. I’ll use a Pandas `Series`

to represent it.

import pandas as pd pmf = pd.Series(1/6, hypos) pmf

1 0.166667 2 0.166667 3 0.166667 4 0.166667 5 0.166667 6 0.166667

Now here’s the sequence of data, encoded using the likelihoods we computed in the previous section.

data = [gt, gt, lt, eq, lt, lt, lt, eq, gt, lt]

The following loop updates the prior distribution by multiplying by each of the likelihoods.

for datum in data: pmf *= datum

Finally, we normalize the posterior.

pmf /= pmf.sum()

1 0.000000 2 0.016427 3 0.221766 4 0.498973 5 0.262834 6 0.000000

Here’s what it looks like.

As an aside, you might have noticed that the values in `eq`

are all the same. So when the value we roll is equal to *T*, we don’t get any new information about *T*. We could leave the instances of `eq`

out of the data, and we would get the same answer.