Predictable Poverty in Sequential Decision Problems

Wells (forthcoming) has a really nice example of a sequential decision problem in which an evidential decision theorist will end up predictably poorer than a causal decision theorist. Wells thinks that this case shows that we should reject evidential decision theory (EDT). I agree that we should reject EDT, but I don’t think that a proponent of causal decision theory (CDT) should use Wells’s case to argue for this conclusion.

The reason is that there are sequential decision problems in which a causal decision theorist will end up predictably poorer than an evidential decision theorist, even when both the causal decision theorist and the evidential decision theorist face this decision problem in the same circumstances. If predictable relative poverty like this gives a sufficient reason to reject EDT, then it likewise gives a sufficient reason to reject CDT. (I think we should tollens—though I won’t be making that case here.)

Consider the following decision problem, adapted from Hunter & Richter (1978):


Hunter Richter
You are given the opportunity to play a game. You can, if you wish, take either box $A$, box $B$, or box $C$. Or you can decide to not play and not take any box ($N$). A reliable predictor made a prediction about how you would choose, and allocated prizes in the boxes according to the following rules: if they predicted that you would take $A$, then they put 100 dollars in box $A$ and left a bill for 100 dollars in boxes $B$ and $C$ (so that, if you were to pick either boxes $B$ or $C$, then you'd lose 100 dollars). If they predicted that you would take $B$, then they put 100 dollars in $B$ and left a bill for 100 dollars in boxes $A$ and $C$. If they predicted that you would take $C$, then they put 100 dollars in box $C$ and left a bill for 100 dollars in boxes $A$ and $B$. If they predicted that you wouldn't play, then they left all boxes empty.

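Using ‘$K_A$’ to stand for the proposition that the predictor predicted you would take box $A$, and likewise for ‘$K_B$’, ‘$K_C$’, and ‘$K_N$’, we have the following decision matrix (payoffs in dollars, as fixed by the rules above):

$$ \begin{array}{c|cccc} & K_A & K_B & K_C & K_N \\ \hline
A & 100 & -100 & -100 & 0 \\
B & -100 & 100 & -100 & 0 \\
C & -100 & -100 & 100 & 0 \\
N & 0 & 0 & 0 & 0 \end{array} $$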

For simplicity, suppose that your utilities are linear in dollars, and suppose that the predictor is 100% reliable: that is, conditional on you selecting act $X$, the probability that the predictor predicted you would select act $X$ is 100%. (This is a harmless simplification. We could run the same case with 70% reliability, but that would make the math more complicated than it needs to be.) Then, the evidential decision theorist tells you to select either $A$, $B$, or $C$ (it doesn’t matter which), since $A$, $B$, and $C$ each have an expected value of 100, and $N$ has an expected value of 0. (Because the predictor is perfectly reliable, the evidential decision theorist is only interested in the diagonal entries in the decision matrix.)
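The evidential expected values can be checked with a short Python sketch. The names `payoff` and `edt_value` are illustrative, and for the imperfect-predictor case the sketch assumes that prediction errors are spread evenly over the other three predictions, which the text does not specify.

```python
ACTS = ["A", "B", "C", "N"]

def payoff(act, prediction):
    """Dollar payoff of `act` when the predictor predicted `prediction`."""
    if act == "N" or prediction == "N":
        return 0  # you did not play, or the boxes were left empty
    return 100 if act == prediction else -100

def edt_value(act, reliability=1.0):
    """Expected payoff of `act`, conditional on performing it.

    Assumption (not from the text): when the predictor errs, the error
    is equally likely to be any of the other three predictions.
    """
    total = 0.0
    for prediction in ACTS:
        if prediction == act:
            p = reliability
        else:
            p = (1 - reliability) / 3
        total += p * payoff(act, prediction)
    return total

edt_values = {act: edt_value(act) for act in ACTS}
print(edt_values)
```

With a perfectly reliable predictor, only the diagonal entries of the matrix contribute. Under the even-error assumption, lowering `reliability` shrinks the value of taking a box, but at 70% reliability it still exceeds the value of not playing.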

On the other hand, what the causal decision theorist tells you to do will depend upon how confident you are that you will end up selecting $A$, $B$, or $C$ (since this makes a difference to how confident you are that the predictor predicted you’d choose $A$, $B$, $C$, or $N$). If your credence that you take $A$ is $a$, your credence that you take $B$ is $b$, and your credence that you take $C$ is $c$, then the causal decision theorist’s $U$-values for the acts $A$, $B$, $C$, and $N$ are:

$$ \begin{aligned} U(A) &= 100 (a-b-c) \\
U(B) &= 100 (b-a-c) \\
U(C) &= 100 (c-a-b) \\
U(N) &= 0 \end{aligned} $$

Suppose that $a=b=c=25\%$; that is, you think you’re equally likely to select any of the available acts. Then $U(A) = U(B) = U(C) = -25$, while $U(N) = 0$, so CDT will advise you to not play the game.
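As a rough numerical check of these $U$-values, here is a minimal Python sketch. The function names are illustrative. It assumes, as in the text, a perfectly reliable predictor, so that your credence in each prediction equals your credence that you perform the corresponding act.

```python
ACTS = ["A", "B", "C", "N"]

def payoff(act, prediction):
    """Dollar payoff of `act` when the predictor predicted `prediction`."""
    if act == "N" or prediction == "N":
        return 0
    return 100 if act == prediction else -100

def cdt_values(a, b, c):
    """U-value of each act, given credences a, b, c that you take A, B, C.

    The remaining credence, 1 - a - b - c, goes to not playing.
    The prediction is held fixed: it is not conditioned on the act.
    """
    pr_prediction = {"A": a, "B": b, "C": c, "N": 1 - a - b - c}
    return {
        act: sum(pr_prediction[k] * payoff(act, k) for k in ACTS)
        for act in ACTS
    }

values = cdt_values(0.25, 0.25, 0.25)
best_act = max(values, key=values.get)
print(values, best_act)
```

Taking a box only comes out ahead of $N$ when your credence in that box exceeds your combined credence in the other two, for example `cdt_values(0.8, 0.1, 0.1)`.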

Consider now a slightly different decision problem.


Two Stage Hunter Richter
At stage one, you are given a choice: either play, $\sim N$, or don't. If you choose to not play, then you walk away without gaining or losing any money. If you choose to play, then at stage two you must select either box $A$, box $B$, or box $C$. As in *Hunter Richter*, if the predictor predicted you would not play, the boxes were left empty. If they predicted you would take box $X$, then 100 dollars were left in box $X$ and bills for 100 dollars were left in the other boxes.

In Two Stage Hunter Richter, causal decision theory will tell you to play, $\sim N$, no matter how likely you think you are to play or not, and no matter how likely you think you are to pick $A$, $B$, or $C$, given that you play.

By way of explanation: label the factors which you are not in a position to affect “$K$”, and those which you are in a position to affect “$C$”. Then, the causal decision theorist says to maximize the $U$-value of your act, where $$ \begin{aligned} U(A) = \sum_K \Pr(K) \cdot \sum_C \Pr(C \mid KA) \cdot V(KCA) \end{aligned} $$ (Here $A$ stands for an arbitrary act and $C$ for an arbitrary downstream factor, not for the boxes of those names. $V$ is your value function. This is Skyrms’s formulation of CDT, but the same point would go through with other formulations.) Since it conditions the probability of each downstream factor $C$ on the performance of your act, $\Pr(C \mid KA)$, $U$-value is sensitive to correlations between your acts and the goods that they cause. Since it additionally conditions the probability of each $C$ on each $K$, $U$-value is additionally sensitive to correlations between factors out of your control and factors causally downstream of your act. So if one of the downstream causal consequences of your acts is a subsequent act of yours, as in Two Stage Hunter Richter, then $U$-value will be sensitive to correlations between subsequent acts of yours and factors over which you have no control.

Applied to Two Stage Hunter Richter: deciding to play causes you to take either box $A$, $B$, or $C$. So, when you are deciding whether to play or not, causal decision theory tells you to take into consideration correlations between your subsequent decision (take $A$, $B$, or $C$), and the predictor’s prediction about which box you would take. Since these correlations are perfect, $$ \Pr(A \mid K_A \sim N) = \Pr(B \mid K_B \sim N) = \Pr(C \mid K_C \sim N) = 1, $$ the $U$-value of playing will be $$ \begin{aligned} U(\sim N) &= \Pr(K_A) \cdot 100 + \Pr(K_B) \cdot 100 + \Pr(K_C) \cdot 100 + \Pr(K_N) \cdot 0\\
&= 100 (a + b + c) \end{aligned} $$ (Here $\Pr(K_A) = a$, $\Pr(K_B) = b$, and $\Pr(K_C) = c$, since the predictor is perfectly reliable.) So: as long as you aren’t certain that you won’t play the game, CDT advises you to play.
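The calculation of $U(\sim N)$ can be sketched in Python as a direct instance of the double sum. All names are illustrative. The sketch also assumes that if the predictor predicted $N$ and you play anyway, you pick a box uniformly at random; nothing hangs on this, since every box is then empty.

```python
BOXES = ["A", "B", "C"]
PREDICTIONS = ["A", "B", "C", "N"]

def payoff(box, prediction):
    """Dollar payoff of taking `box` when the predictor predicted `prediction`."""
    if prediction == "N":
        return 0  # all boxes were left empty
    return 100 if box == prediction else -100

def pr_box_given(prediction):
    """Pr(box | K_prediction, ~N) for each box, assuming perfect correlation."""
    if prediction == "N":
        # Assumption: any distribution works here, since every payoff is 0.
        return {box: 1 / 3 for box in BOXES}
    return {box: 1.0 if box == prediction else 0.0 for box in BOXES}

def u_play(a, b, c):
    """U-value of playing at stage one: sum over K of Pr(K) * sum over boxes."""
    pr_k = {"A": a, "B": b, "C": c, "N": 1 - a - b - c}
    total = 0.0
    for k in PREDICTIONS:
        conditional = pr_box_given(k)
        total += pr_k[k] * sum(conditional[x] * payoff(x, k) for x in BOXES)
    return total

print(u_play(0.25, 0.25, 0.25))
```

With $a = b = c = 25\%$ this gives 75, which agrees with $100(a + b + c)$ and exceeds the $U$-value of not playing, which is 0.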

Let’s alter the case once more. Here’s a three-stage version of Hunter Richter:


Three Stage Hunter Richter

At stage one, you are given a choice: you may either pay 90 dollars, $P$, or pay nothing, $\sim P$. If you pay nothing, then you go on to play the original Hunter Richter game. If you pay 90 dollars, then you go on to play Two Stage Hunter Richter.


What does causal decision theory say to do in Three Stage Hunter Richter? Again, it depends upon how likely you think you are to end up selecting $A$, $B$, $C$, or $N$ in the final stage. Let’s suppose, as before, that you think each outcome is equally likely. Now: suppose you don’t pay the 90 dollars, $\sim P$. Then, you’ll face the original Hunter Richter game. And we know what CDT will advise you to do there: it will tell you to not play, $N$. So you’ll walk away with nothing. Suppose that you do pay the 90 dollars, $P$. Then, you’ll face the Two Stage Hunter Richter game. And we know what CDT will advise you to do there: it will tell you to play, and you’re certain that you’ll end up making 100 dollars. Minus the 90 you paid up front, you’ll end up with a net 10 dollars. 10 dollars is better than 0 dollars, so CDT tells you to pay the 90 up front.

So: if a causal decision theorist plays this game, they’ll walk away with 10 dollars.

What about the evidential decision theorist? If they play this three-stage game, then they know that they’ll choose to take either $A$, $B$, or $C$ in the original Hunter Richter game, and so expect to make 100 dollars without paying anything. Paying would leave them with only 10 dollars, so they will see no reason to pay 90 dollars at stage one.

So: if an evidential decision theorist plays this game, they’ll walk away with 100 dollars.

Notice that the evidential decision theorist didn’t have more money provided to them by the predictor. Both the causalist and the evidentialist had 100 dollars placed in one box and bills in the other two—we can even suppose that it was the very same box for each. And both had the choice to pay or not. But the evidentialist ended up 90 dollars richer than the causalist.
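The comparison can be summarized in a small Python sketch. It does not derive the two theories’ recommendations; it simply encodes the conclusions reached above as assumptions (equal credences for the causalist, a perfectly reliable predictor) and tallies the resulting wealth. The function and constant names are illustrative.

```python
FEE = 90     # price of moving to the two-stage game
PRIZE = 100  # contents of the correctly predicted box

def three_stage_outcome(theory):
    """Return (stage-one choice, final wealth) for "CDT" or "EDT"."""
    # Paying leads to the two-stage game, where both theories play
    # and end up with the prize, minus the fee.
    value_if_pay = PRIZE - FEE
    if theory == "CDT":
        # Assumed from the text: with equal credences, CDT declines
        # to play the original game, so not paying yields nothing.
        value_if_not_pay = 0
    elif theory == "EDT":
        # Assumed from the text: EDT takes a box in the original game
        # and the perfectly reliable predictor foresaw which one.
        value_if_not_pay = PRIZE
    else:
        raise ValueError(f"unknown theory: {theory}")
    if value_if_pay > value_if_not_pay:
        return ("pay", value_if_pay)
    return ("don't pay", value_if_not_pay)

cdt_result = three_stage_outcome("CDT")
edt_result = three_stage_outcome("EDT")
print(cdt_result, edt_result, edt_result[1] - cdt_result[1])
```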