Local & Global Experts

Contemporary epistemology is replete with principles of expert deference. Epistemologists have claimed that you should treat the chances, your future selves, your rational self, and your epistemic peers as experts. What this means is that you should try to align your credences with theirs.

There are lots of ways you might try to align your credences with those of some expert function. (That expert function could be the chances, or it could be your future credences, or something else altogether. The particular function won’t matter, so I’ll just call the expert function, whatever it is, ‘$\mathscr{E}$’.) My focus here will be on just two ways of aligning your credences with $\mathscr{E}$’s: 1) by treating it as a local expert; and 2) by treating it as a global expert.

Local Expert
You treat $\mathscr{E}$ as a local expert iff, for all propositions $a$, and all numbers $n \in [0, 1]$, $$ C(a \mid \langle \mathscr{E}(a) = n \rangle) = n, \,\, \text{if defined} $$
Global Expert
You treat $\mathscr{E}$ as a global expert iff, for all propositions $a$, and all potential credence functions $E$, $$ C(a \mid \langle \mathscr{E} = E \rangle) = E(a), \,\, \text{ if defined} $$

In these definitions, $C$ is your own credence function. You should read ‘$\mathscr{E}$’ as a definite description, along the lines of ‘the credence function of the expert’. This definite description may refer to different credence functions at different worlds. And I am using the brackets ‘$\langle \,\, \rangle$’ to denote propositions. Thus, ‘$\langle \mathscr{E}(a) = n \rangle$’ is the proposition that the expert’s credence that $a$ is $n$. It is true at those worlds where $\mathscr{E}$’s credence in the proposition $a$ is $n$. And $\langle \mathscr{E} = E \rangle$ is the proposition that $E$ is the expert’s entire credence function, true at those worlds $w$ such that $\mathscr{E}_w = E$ (‘$\mathscr{E}_w$’ is $\mathscr{E}$’s credence function at world $w$).
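Since everything here is finite, the two definitions can be checked mechanically. The following is a minimal sketch, assuming worlds are dictionary keys, propositions are sets of worlds, and an expert is a dictionary from each world to that world's credence distribution. The function names and the `omniscient` example are my own illustrations, not part of the definitions.

```python
from fractions import Fraction
from itertools import combinations


def propositions(worlds):
    """Every proposition, i.e. every subset of the finite set of worlds."""
    ws = list(worlds)
    return [frozenset(c) for r in range(len(ws) + 1) for c in combinations(ws, r)]


def prob(dist, a):
    """Probability that `dist` (a dict world -> Fraction) assigns to proposition `a`."""
    return sum((dist[w] for w in a), Fraction(0))


def treats_as_local_expert(C, expert):
    """True iff C(a | <E(a) = n>) = n wherever that conditional credence is defined."""
    worlds = list(expert)
    for a in propositions(worlds):
        # Only values n the expert takes at some world make <E(a) = n> nonempty.
        for n in {prob(expert[w], a) for w in worlds}:
            s = frozenset(w for w in worlds if prob(expert[w], a) == n)
            if prob(C, s) > 0 and prob(C, a & s) != n * prob(C, s):
                return False
    return True


def treats_as_global_expert(C, expert):
    """True iff C(a | <E = E>) = E(a) wherever that conditional credence is defined."""
    worlds = list(expert)
    for w in worlds:
        E = expert[w]
        s = frozenset(v for v in worlds if expert[v] == E)  # the proposition <E = E_w>
        if prob(C, s) == 0:
            continue  # conditional credence undefined, so nothing to check
        for a in propositions(worlds):
            if prob(C, a & s) != prob(E, a) * prob(C, s):
                return False
    return True


# Usage: a hypothetical expert that is certain which world is actual.
omniscient = {
    "w1": {"w1": Fraction(1), "w2": Fraction(0)},
    "w2": {"w1": Fraction(0), "w2": Fraction(1)},
}
C = {"w1": Fraction(1, 2), "w2": Fraction(1, 2)}
print(treats_as_local_expert(C, omniscient), treats_as_global_expert(C, omniscient))
# True True
```

Exact `Fraction` arithmetic is used so that the equalities in the definitions are tested without floating-point error.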

It’s not immediately obvious what the relationship is between these two different ways of treating a function as an expert. You might think that they are equivalent, in the sense that you will treat $\mathscr{E}$ as a local expert if and only if you treat them as a global expert. In fact, they are not equivalent. Treating $\mathscr{E}$ as a global expert entails treating $\mathscr{E}$ as a local expert, but the converse is not true. (Throughout, by the way, I’m assuming probabilism and I’m assuming that your credences are defined over a finite number of worlds).

Proposition 1
If you treat $\mathscr{E}$ as a global expert, then you treat them as a local expert as well. However, you may treat $\mathscr{E}$ as a local expert without treating them as a global expert.

Proof. Note that $\{ \langle \mathscr{E} = E \rangle \mid E(a) = n \}$ is a partition of $\langle \mathscr{E}(a) = n \rangle$. If you treat $\mathscr{E}$ as a global expert, then for each $E$ such that $E(a) = n$, $C(a \mid \langle \mathscr{E} = E \rangle) = n$. It then follows from conglomerability (which follows from the probability axioms when the number of worlds is finite) that $C(a \mid \langle \mathscr{E}(a) = n \rangle) = n$.
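Spelled out, the conglomerability step is just the law of total probability applied to that partition. As a sketch, with the sums ranging over those $E$ such that $E(a) = n$ and $C(\langle \mathscr{E} = E \rangle) > 0$:

$ \begin{align} C(a \mid \langle \mathscr{E}(a) = n \rangle) &= \sum_{E : E(a) = n} C(a \mid \langle \mathscr{E} = E \rangle) \cdot C(\langle \mathscr{E} = E \rangle \mid \langle \mathscr{E}(a) = n \rangle) \\ &= n \cdot \sum_{E : E(a) = n} C(\langle \mathscr{E} = E \rangle \mid \langle \mathscr{E}(a) = n \rangle) \\ &= n \end{align} $

The second line uses Global Expert to replace each $C(a \mid \langle \mathscr{E} = E \rangle)$ with $E(a) = n$, and the third uses the fact that the cells of the partition have conditional credences summing to 1.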

To see that you may treat $\mathscr{E}$ as a local expert without treating them as a global expert, suppose that there are three possible worlds, $w_1$, $w_2$, and $w_3$, and that the expert’s credence function at each of those worlds is as shown in figure 1 (the example originates from Gaifman’s 1988 article “A Theory of Higher-Order Probabilities”). (In figure 1, by the way, the $i$th row gives $\mathscr{E}$’s credence distribution over $w_1, w_2$ and $w_3$ at the world $w_i$.)

Figure 1: A function which may be treated as a local expert, but not a global expert.


And suppose that your own credence distribution over $w_1, w_2,$ and $w_3$ is such that $C(w_i) = 1/3$, for $i = 1, 2, 3$. Then, for every proposition $a$ and every number $n$, $C(a \mid \langle \mathscr{E}(a) = n \rangle) = n$. For instance, if $a = \{ w_1, w_2 \}$ and $n = 0.5$, then

$ \begin{align} C(\{ w_1, w_2 \} \mid \langle \mathscr{E}(\{ w_1, w_2 \}) = 0.5 \rangle) &= C(\{ w_1, w_2 \} \mid \{ w \mid \mathscr{E}_w(\{ w_1, w_2 \})=0.5 \} ) \\ &= C(\{ w_1, w_2 \} \mid \{ w_2, w_3 \}) \\ &= 0.5 \end{align} $

And the same is true for every other choice of $a$ and $n$, as you may check for yourself. Nevertheless, it is impossible to treat $\mathscr{E}$ as a global expert, since, so long as $C$ is a probability function,

$ C(\{ w_1 \} \mid \langle \mathscr{E} = \mathscr{E}_{w_1} \rangle) =C(\{ w_1 \} \mid \{ w_1 \}) = 1 $

But $\mathscr{E}_{w_1}(\{ w_1 \}) = 0.5 \neq 1$. QED.
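The "check for yourself" step can be done by brute force. Here is a sketch under one loud assumption: the rows of `EXPERT` are my reconstruction of the function in figure 1 from the constraints stated in the text (each world gives 1/2 to itself, and $\langle \mathscr{E}(\{ w_1, w_2 \}) = 0.5 \rangle = \{ w_2, w_3 \}$), not a copy of the figure itself.

```python
from fractions import Fraction
from itertools import combinations

HALF, ZERO = Fraction(1, 2), Fraction(0)
WORLDS = ("w1", "w2", "w3")

# Assumed reconstruction of figure 1: row w_i is the expert's distribution at w_i.
EXPERT = {
    "w1": {"w1": HALF, "w2": HALF, "w3": ZERO},
    "w2": {"w1": ZERO, "w2": HALF, "w3": HALF},
    "w3": {"w1": HALF, "w2": ZERO, "w3": HALF},
}
C = {w: Fraction(1, 3) for w in WORLDS}  # your uniform credences
PROPS = [frozenset(c) for r in range(len(WORLDS) + 1) for c in combinations(WORLDS, r)]


def prob(dist, a):
    """Probability that `dist` assigns to the proposition `a` (a set of worlds)."""
    return sum((dist[w] for w in a), Fraction(0))


def cond(dist, a, b):
    """dist(a | b), or None when dist(b) = 0 and the conditional is undefined."""
    pb = prob(dist, b)
    return None if pb == 0 else prob(dist, a & b) / pb


def local_violations():
    """All pairs (a, n) with C(a | <E(a) = n>) defined but different from n."""
    bad = []
    for a in PROPS:
        for n in {prob(EXPERT[w], a) for w in WORLDS}:
            s = frozenset(w for w in WORLDS if prob(EXPERT[w], a) == n)
            c = cond(C, a, s)
            if c is not None and c != n:
                bad.append((a, n))
    return bad


def global_violations():
    """All pairs (w, a) with C(a | <E = E_w>) defined but different from E_w(a)."""
    bad = []
    for w in WORLDS:
        s = frozenset(v for v in WORLDS if EXPERT[v] == EXPERT[w])
        for a in PROPS:
            c = cond(C, a, s)
            if c is not None and c != prob(EXPERT[w], a):
                bad.append((w, a))
    return bad


print(local_violations())             # [] : Local Expert holds for every a and n
print(len(global_violations()) > 0)   # True : Global Expert fails
```

On this reconstruction, the first global violation is exactly the one used in the proof: $C(\{ w_1 \} \mid \{ w_1 \}) = 1$, while $\mathscr{E}_{w_1}(\{ w_1 \}) = 1/2$.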

So a principle of local deference is strictly weaker than a principle of global deference. Or, perhaps a better way of thinking about things: there are strictly more functions which can be treated as local experts than there are functions which can be treated as global experts.

This is a prima facie exciting observation, since a common objection to principles of global deference is that it is possible to treat $\mathscr{E}$ as a global expert if and only if $\mathscr{E}$ is certain of what their own credences are (because the focus is usually on certain ideal credence functions, certainty about your own credences is generally called immodesty). That is to say, it is possible to treat $\mathscr{E}$ as a global expert if and only if they are immodest—if and only if, for every world $w$, $\mathscr{E}_w(\langle \mathscr{E} = \mathscr{E}_w \rangle) = 1$. For suppose that $\mathscr{E}$ were modest—that is, suppose that, for some world $w$, $\mathscr{E}_w(\langle \mathscr{E} = \mathscr{E}_w \rangle) \neq 1$. And suppose that you treat $\mathscr{E}$ as a global expert. Then, substituting $\langle \mathscr{E} = \mathscr{E}_w \rangle$ in for $a$ and $\mathscr{E}_w$ in for $E$ in the definition of Global Expert, we have

$ C(\langle \mathscr{E} = \mathscr{E}_w \rangle \mid \langle \mathscr{E} = \mathscr{E}_w \rangle ) = \mathscr{E}_w(\langle \mathscr{E} = \mathscr{E}_w \rangle ) \neq 1 $

But the probability axioms require $C(a \mid a)$ to be 1 (or undefined) for all $a$.
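Immodesty is easy to test directly. The sketch below computes $\mathscr{E}_w(\langle \mathscr{E} = \mathscr{E}_w \rangle)$ at each world. Both example experts are assumptions of mine: `gaifman` is a reconstruction of figure 1 from the constraints given in the text, and `partitional` is a hypothetical immodest expert added for contrast.

```python
from fractions import Fraction

HALF, ZERO, ONE = Fraction(1, 2), Fraction(0), Fraction(1)


def self_knowledge(expert, w):
    """E_w(<E = E_w>): the expert's credence, at w, that its credences are what they are."""
    same = [v for v in expert if expert[v] == expert[w]]  # worlds where E = E_w
    return sum((expert[w][v] for v in same), Fraction(0))


def is_immodest(expert):
    """True iff the expert is certain of its own credence function at every world."""
    return all(self_knowledge(expert, w) == 1 for w in expert)


# Assumed reconstruction of the function in figure 1.
gaifman = {
    "w1": {"w1": HALF, "w2": HALF, "w3": ZERO},
    "w2": {"w1": ZERO, "w2": HALF, "w3": HALF},
    "w3": {"w1": HALF, "w2": ZERO, "w3": HALF},
}

# Hypothetical immodest expert: it has the same credences at w1 and w2.
partitional = {
    "w1": {"w1": HALF, "w2": HALF, "w3": ZERO},
    "w2": {"w1": HALF, "w2": HALF, "w3": ZERO},
    "w3": {"w1": ZERO, "w2": ZERO, "w3": ONE},
}

print(is_immodest(gaifman), is_immodest(partitional))  # False True
```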

So: if you think functions which aren’t certain of their own values should nevertheless be treated as experts, then you will think that we need a characterization of “treating a function as an expert” which goes beyond Global Expert. A common suggestion is to treat $\mathscr{E}$ as a modest expert.

Modest Expert
You treat $\mathscr{E}$ as a modest expert if and only if, for all propositions $a$ and all potential credence functions $E$,

$ C(a \mid \langle \mathscr{E} = E \rangle) = E(a \mid \langle \mathscr{E} = E \rangle) $
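For comparison, here is a sketch of a check for Modest Expert, skipping any case where either conditional probability is undefined. The example expert and the credence functions `C_good` and `C_bad` are hypothetical, chosen only to show that a modest function (one giving credence 3/4 to its own credences at $w_1$) can satisfy this condition.

```python
from fractions import Fraction
from itertools import combinations


def prob(dist, a):
    """Probability that `dist` assigns to the proposition `a` (a set of worlds)."""
    return sum((dist[w] for w in a), Fraction(0))


def cond(dist, a, b):
    """dist(a | b), or None when dist(b) = 0 and the conditional is undefined."""
    pb = prob(dist, b)
    return None if pb == 0 else prob(dist, a & b) / pb


def treats_as_modest_expert(C, expert):
    """True iff C(a | <E = E>) = E(a | <E = E>) wherever both sides are defined."""
    worlds = list(expert)
    props = [frozenset(c) for r in range(len(worlds) + 1) for c in combinations(worlds, r)]
    for w in worlds:
        E = expert[w]
        cell = frozenset(v for v in worlds if expert[v] == E)  # the proposition <E = E_w>
        for a in props:
            lhs, rhs = cond(C, a, cell), cond(E, a, cell)
            if lhs is not None and rhs is not None and lhs != rhs:
                return False
    return True


# Hypothetical modest expert: at w1 and w2 it gives only 3/4 to <E = E_w1> = {w1, w2}.
shared = {"w1": Fraction(1, 2), "w2": Fraction(1, 4), "w3": Fraction(1, 4)}
expert = {
    "w1": dict(shared),
    "w2": dict(shared),
    "w3": {"w1": Fraction(0), "w2": Fraction(0), "w3": Fraction(1)},
}
C_good = {"w1": Fraction(2, 5), "w2": Fraction(1, 5), "w3": Fraction(2, 5)}
C_bad = {w: Fraction(1, 3) for w in expert}

print(treats_as_modest_expert(C_good, expert), treats_as_modest_expert(C_bad, expert))
# True False
```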

But perhaps the move to such principles is too hasty. Perhaps we can get by just with principles of local deference. For note that the expert shown in figure 1 is modest; yet they can be treated as a local expert. So there are at least some modest functions which can be treated as local experts. And perhaps these are all the modest experts we need.

For this reason, the relationship between local and global experts is dialectically important to some debates in epistemology. For instance, Christensen endorses the claim that you should treat your currently rational self as a local expert. Elga criticizes this position on the grounds that it requires certainty that you are rational—however, in order to argue for this conclusion, he must first re-present Christensen’s principle as the claim that you should treat your rational self as a global expert (note: Elga recognizes that the second principle is stronger than the first). Perhaps, in the face of these criticisms, Christensen should hold tight to his original principle; perhaps it affords all the modesty we need.

No such luck, I’m afraid. Although there are some functions which can be treated as local experts but not global experts, these functions are incredibly singular. In fact, there is a good sense in which the function shown in figure 1 is the only kind of function which can be treated as a local, but not global, expert.

Given a function $\mathscr{E}$, from worlds to probability distributions over those same worlds, we can generate a Kripke frame $< \mathscr{W}, R >$ from $\mathscr{E}$ as follows: $\mathscr{E}_w(\{ x \}) \neq 0$ if and only if $w$ bears the relation $R$ to $x$ (or, as I shall say, if and only if $w$ sees $x$).

Let’s say that a Kripke frame $< \mathscr{W}, R >$ is cyclic iff

  1. Every world $w \in \mathscr{W}$ sees itself and exactly one other world.
  2. Every world $w \in \mathscr{W}$ is seen by exactly one distinct world.
  3. There are no two distinct worlds $w, x$ such that $w$ sees $x$ and $x$ sees $w$.

A sample cyclic frame is shown in figure 2.

Figure 2: A cyclic frame (reflexive arrows have been suppressed).


Note that the function from figure 1 will generate a cyclic frame in which, for each $w \in \mathscr{W}$, $\mathscr{E}_w(\{ w \}) = 1/2$. Let’s call any function like this a uniform cyclic function (‘uniform’ because at every world $\mathscr{E}$ gives equal probability to its actual world and the one other possible world it sees).

Uniform Cyclicity
A function $\mathscr{E}$ is uniform cyclic if and only if $\mathscr{E}$ generates a cyclic frame and, for every $w \in \mathscr{W}$, $\mathscr{E}_w(\{ w \}) = 1/2$.
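The frame construction and the two definitions translate fairly directly into code. This is a sketch, assuming that the third cyclicity condition concerns distinct worlds (every world sees itself, so it could not hold otherwise). The function names are mine, and `gaifman` is a reconstruction of figure 1 from the constraints in the text rather than a copy of the figure.

```python
from fractions import Fraction

HALF, ZERO = Fraction(1, 2), Fraction(0)


def frame(expert):
    """The generated Kripke frame: w sees x iff E_w({x}) != 0."""
    return {w: {x for x, p in dist.items() if p != 0} for w, dist in expert.items()}


def is_cyclic(sees):
    """Check the three cyclicity conditions on a frame given as world -> set of seen worlds."""
    for w, seen in sees.items():
        others = seen - {w}
        # 1. w sees itself and exactly one other world.
        if w not in seen or len(others) != 1:
            return False
        # 2. w is seen by exactly one world distinct from itself.
        if sum(1 for v in sees if v != w and w in sees[v]) != 1:
            return False
        # 3. The one other world that w sees does not see w back.
        (x,) = others
        if w in sees[x]:
            return False
    return True


def is_uniform_cyclic(expert):
    """Cyclic generated frame, and E_w({w}) = 1/2 at every world w."""
    return is_cyclic(frame(expert)) and all(expert[w][w] == HALF for w in expert)


# Assumed reconstruction of the function in figure 1.
gaifman = {
    "w1": {"w1": HALF, "w2": HALF, "w3": ZERO},
    "w2": {"w1": ZERO, "w2": HALF, "w3": HALF},
    "w3": {"w1": HALF, "w2": ZERO, "w3": HALF},
}

# Hypothetical variant: same frame, but w1 no longer splits its credence evenly.
skewed = dict(gaifman, w1={"w1": Fraction(1, 3), "w2": Fraction(2, 3), "w3": ZERO})

print(is_uniform_cyclic(gaifman))                            # True
print(is_cyclic(frame(skewed)), is_uniform_cyclic(skewed))   # True False
```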

Now, it turns out that the functions which may be treated as local experts, but which may not be treated as global experts, are precisely the uniform cyclic ones. If a function is uniform cyclic, then you may treat it as a local expert, but not as a global expert. And if a function $\mathscr{E}$ is not uniform cyclic, then you can treat $\mathscr{E}$ as a local expert if and only if you can treat it as a global expert.

Proposition 2
It is possible to treat a function $\mathscr{E}$ as a local expert but not possible to treat them as a global expert when and only when $\mathscr{E}$ is uniform cyclic. The only credences which treat such a function as a local expert are those which are uniform over the worlds in each cycle.
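A brute-force check can illustrate one instance of Proposition 2, though of course it proves nothing in general. The sketch below builds a uniform cyclic function on four worlds (the helper `uniform_cycle` and the credence functions `uniform` and `lopsided` are hypothetical names of mine) and tests the two deference conditions against a uniform and a non-uniform credence function.

```python
from fractions import Fraction
from itertools import combinations


def uniform_cycle(k):
    """Uniform cyclic function on worlds 0..k-1 (k >= 3): world i gives 1/2 to i and to i+1 mod k."""
    half = Fraction(1, 2)
    return {i: {j: (half if j in (i, (i + 1) % k) else Fraction(0)) for j in range(k)}
            for i in range(k)}


def prob(dist, a):
    """Probability that `dist` assigns to the proposition `a` (a set of worlds)."""
    return sum((dist[w] for w in a), Fraction(0))


def props(worlds):
    """Every subset of the set of worlds."""
    ws = list(worlds)
    return [frozenset(c) for r in range(len(ws) + 1) for c in combinations(ws, r)]


def local_ok(C, expert):
    """C(a | <E(a) = n>) = n wherever defined."""
    for a in props(expert):
        for n in {prob(expert[w], a) for w in expert}:
            s = frozenset(w for w in expert if prob(expert[w], a) == n)
            if prob(C, s) > 0 and prob(C, a & s) != n * prob(C, s):
                return False
    return True


def global_ok(C, expert):
    """C(a | <E = E>) = E(a) wherever defined."""
    for w in expert:
        s = frozenset(v for v in expert if expert[v] == expert[w])
        if prob(C, s) == 0:
            continue
        if any(prob(C, a & s) != prob(expert[w], a) * prob(C, s) for a in props(expert)):
            return False
    return True


expert = uniform_cycle(4)
uniform = {i: Fraction(1, 4) for i in range(4)}
lopsided = {0: Fraction(2, 5), 1: Fraction(1, 5), 2: Fraction(1, 5), 3: Fraction(1, 5)}

print(local_ok(uniform, expert), global_ok(uniform, expert))  # True False
print(local_ok(lopsided, expert))                             # False
```

The non-uniform credences fail already at $a = \{ 0 \}$ and $n = 1/2$: the conditioning proposition is $\{ 0, 3 \}$, and `lopsided` gives $\{ 0 \}$ a conditional credence of 2/3.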

The proof of this proposition is quite long and tedious, so I’m putting it in a separate document here.

What Proposition 2 means, I think, is that we don’t have to fret about the difference between the local and global formulations of various principles of expert deference. For what the proposition tells us is that nobody should endorse a principle of local deference without thereby endorsing a principle of global deference. To endorse a principle of local deference without endorsing a principle of global deference is to say that uniform cyclic functions are deserving of epistemic deference, but no other modest function is. This strikes me as entirely unmotivated.

If we think that you should treat the probability function which generates the cyclic frame in figure 2 as an expert, then we should also think that you should treat the probability function which generates the frame shown in figure 3 as an expert.

Figure 3: No probability function generating this frame may be treated as a local expert.


After all, the only difference between the frame in figure 2 and the frame in figure 3 is that we have taken the single possibility $w_1$ in figure 2 and divided it into two sub-possibilities $w_1$ and $w_1'$ in figure 3. We could suppose that, at all worlds in figure 3, $\mathscr{E}$ gives the proposition $\{ w_1, w_1' \}$ precisely the same probability it gave the singleton proposition $\{ w_1 \}$ in figure 2. If that’s so, then say that $\mathscr{E}$ reduces to uniform cyclicity. After all, if we just collapse the possibilities $w_1$ and $w_1'$, then we get back a uniform cyclic function. The difference between a uniform cyclic function and a function which merely reduces to uniform cyclicity ought not make any difference with respect to whether some supposed expert is deserving of epistemic deference, nor how that deference ought to be shown. However, Proposition 2 assures us that such minor changes in representation do make a difference with respect to whether we can treat the function as a local expert. So, if we’re in for treating some function as a local expert, we shouldn’t demur from treating them as a global expert as well.

So I think that Christensen, e.g., effectively has committed himself to the view that your rational self must be immodest. While his claim that you should treat your rational self as a local expert does not on its own entail this conclusion, it follows with the rather weak assumption that, if a uniform cyclic function is deserving of epistemic deference, then so too is a function which merely reduces to uniform cyclicity. Unless Christensen believes that 1) our rational selves could be uniform cyclic, but 2) they could not merely reduce to uniform cyclicity, he should also think that you should treat your rational self as a global expert. And this entails that your rational self is immodest.

TL;DR: you might have thought that principles of local deference are equivalent to principles of global deference. They’re not. Principles of local deference are weaker than principles of global deference. But they’re really not much weaker—just slightly. And there’s really no good reason to treat any function as a local expert but not a global expert. So, while they’re ever-so-slightly different, really, you shouldn’t ever worry about the differences.