2017, Oct 2

Singular Causation and Model Reduction

In my previous post I tried to get clear about when variables could be safely removed from a causal model without affecting what the model is capable of telling us about singular causal relations. There, I endorsed two principles stating when causal models may be reduced by excising variables in a particular way. If we endorse these principles, and we want to give a theory of singular causation formulated in terms of correct causal models, then we should want that theory to give the very same verdicts before and after model reduction. The point of today’s post is that there is a wide family of theories of causation which run afoul of this constraint. Those theories will say that two variable values are causally related in one model, but reverse this judgment when the model is reduced.

1. Counterfactual Counterfactual Theories of Singular Causation

1.1 Counterfactuals in Causal Models

Causal models allow us to evaluate certain causal counterfactual conditionals. For instance, recall the causal model describing the relations of causal determination between whether the switch is up, whether the power is on, and whether the light is illuminated.
$$ \begin{aligned} L &:= S \wedge P \\\
P &:= S \end{aligned} $$ Suppose that, actually, the switch is up, $S=1$, so that the power is on and the light is illuminated. If we want to evaluate the counterfactual conditional $P = 0 \hspace{4pt}\Box\hspace{-4pt}\to L = 1$ (were the power off, the light would be illuminated), we mutilate the model $\mathbb{M}$ by removing $P$'s equation, severing $P$'s dependence upon $S$, and setting its value to $0$ directly. That is, we exogenize the variable $P$, and add the assignment $P=0$ to the context $\vec{u}$. Graphically, we cut the arrow going into $P$, but leave all other arrows intact.

Call the resulting mutilated model “$\mathbb{M}[P \to 0]$”. The semantics for counterfactuals tells us that $P = 0 \hspace{4pt}\Box\hspace{-4pt}\to L = 1$ is true in the model $\mathbb{M}$ iff $L=1$ is true in the mutilated model $\mathbb{M}[P \to 0]$. $$ \mathbb{M} \models P = 0 \hspace{4pt}\Box\hspace{-4pt}\to L = 1 \quad \iff \quad \mathbb{M}[P \to 0] \models L =1 $$ Since $L=0$ in the mutilated model $\mathbb{M}[P \to 0]$, this tells us that the counterfactual $P = 0 \hspace{4pt}\Box\hspace{-4pt}\to L = 1$ is false in the original model $\mathbb{M}$.

More generally, if $\vec{X}$ is a vector of variables in $\mathbb{U} \cup \mathbb{V}$ and $\vec{x}$ is some assignment of values to those variables, then we may define $\mathbb{M}[\vec{X} \to \vec{x}]$ to be the mutilated model that you get by going through each variable $ X \in \vec{X}$ and, if $X$ is endogenous, removing $X$'s structural equation $\phi_X$ from $\mathbb{E}$, moving $X$ from $\mathbb{V}$ to $\mathbb{U}$, and adding the assignment $\vec{x}(X)$ to the context $\vec{u}$. (By the way, “$\vec{x}(X)$” is the value which $\vec{x}$ assigns to the variable $X$.) On the other hand, if $X \in \vec{X}$ is exogenous, then you simply change the context so that $\vec{u}(X) = \vec{x}(X)$. Then, for any $\phi$, we have that $$ \mathbb{M} \models \vec{X} = \vec{x} \hspace{4pt}\Box\hspace{-4pt}\to \phi \quad\iff \quad \mathbb{M}[\vec{X} \to \vec{x}] \models \phi $$
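The mutilation semantics above can be sketched in a few lines of Python. This is only an illustrative sketch, not a canonical implementation: it assumes binary variables, represents each structural equation as a function from a dictionary of values to a value, and the names `solve`, `intervene`, and `counterfactual` are my own labels rather than anything from the literature.

```python
def solve(equations, context):
    """Evaluate an acyclic model: keep applying equations whose parents are known."""
    values = dict(context)
    pending = dict(equations)
    while pending:
        ready = {}
        for var, phi in pending.items():
            try:
                ready[var] = int(phi(values))
            except KeyError:
                pass  # some parent of `var` has no value yet
        if not ready:
            raise ValueError("model is cyclic or a parent is missing")
        values.update(ready)
        for var in ready:
            del pending[var]
    return values


def intervene(equations, context, setting):
    """Build M[X -> x]: drop the equations of the intervened variables
    and write their new values into the context."""
    new_equations = {v: phi for v, phi in equations.items() if v not in setting}
    new_context = {**context, **setting}
    return new_equations, new_context


def counterfactual(equations, context, antecedent, consequent):
    """M |= (antecedent []-> consequent) iff the consequent holds in M[antecedent]."""
    eqs, ctx = intervene(equations, context, antecedent)
    values = solve(eqs, ctx)
    return all(values[v] == x for v, x in consequent.items())


# The light model: L := S & P, P := S, in the context S = 1.
equations = {
    "L": lambda v: v["S"] and v["P"],
    "P": lambda v: v["S"],
}
context = {"S": 1}

print(counterfactual(equations, context, {"P": 0}, {"L": 1}))  # False
print(counterfactual(equations, context, {"P": 0}, {"L": 0}))  # True
```

Exogenous and endogenous antecedents are handled uniformly here: in both cases the variable ends up in the context with its new value and has no equation.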

1.2 Counterfactual Counterfactual Dependence

Many contemporary theories of causation fit into the following general schema, which we can call “Counterfactual Counterfactual”:


Counterfactual Counterfactual. $C=c$ caused $E=e$ in causal model $\mathbb{M}$ iff there is some value of $C$, $c'$, such that $$ \mathbb{M}[\vec{G}\to\vec{g}] \models C = c' \hspace{4pt}\Box\hspace{-4pt}\to E \neq e $$ for some suitable vector of variables $\vec{G}$ and a suitable assignment of values $\vec{g}$.


According to Counterfactual Counterfactual, causation is not counterfactual dependence; rather, it is counterfactual dependence in some counterfactual scenario, $\vec{G} = \vec{g}$. Assuming that the empty vector of variables counts as suitable, Counterfactual Counterfactual will entail that counterfactual dependence is sufficient for causation.

We will get different theories of causation depending upon which vectors of variables, and which assignments of values, we take to be suitable. For instance, the account of Hitchcock (2001) tells us that $\vec{G}$ and $\vec{g}$ are suitable iff, in $\mathbb{M}$, there is some directed path leading from the variable $C$ to the variable $E$, $C \to V_1 \to V_2 \to \dots \to V_N \to E$, such that, in the counterfactual model $\mathbb{M}[\vec{G} \to \vec{g}]$, every variable $V$ along this path retains its actual value in the original model, $\vec{u}(V)$.


Hitchcock (2001). $\vec{G}$ and $\vec{g}$ are suitable iff there is some path from $C$ to $E$ such that, for every variable $V$ along this path, $$ \mathbb{M}[\vec{G} \to \vec{g}] \models V = \vec{u}(V) $$


There are some cases in which Hitchcock looks too strong (e.g., the Voting Machine case from appendix A.2 of Halpern & Pearl (2005)). These and other cases were taken to motivate a move to the following weakening.


Halpern and Pearl (2005). $\vec{G}$ and $\vec{g}$ are suitable iff, for all vectors of variables $\vec{P}$ not in $\vec{G}$, and any subvector $\vec{H}$ of $\vec{G}$, $$ \mathbb{M}[\vec{H} \to \vec{g}(\vec{H}), \vec{P} \to \vec{u}(\vec{P}), C \to c] \models E=e $$


Notice that, if a counterfactual setting $\vec{G} = \vec{g}$ is suitable according to Hitchcock (2001), then it will automatically be suitable according to Halpern and Pearl (2005). So, if $C=c$ caused $E=e$ according to Hitchcock (2001), then $C=c$ caused $E=e$ according to Halpern and Pearl (2005).

Both of these accounts of causation face a problem with cases of what’s come to be known as bogus prevention, illustrated by the neuron diagram below.

In this neuron diagram, $C$'s firing does not prevent $E$ from firing (that is: $C$'s firing did not cause $E$ to not fire). However, both Hitchcock (2001) and Halpern and Pearl (2005) get the verdict that $C$'s firing prevented $E$ from firing. That’s because both of them rule the singleton vector of variables $\vec{G} = (A)$, with the assignment $\vec{g}=1$, suitable. And, in this counterfactual setting, whether $E=0$ counterfactually depends upon whether $C = 1$.

In response to cases like these, there has been further emendation of the Halpern and Pearl account to incorporate standards of normality, or typicality. Halpern (2008) emends the Halpern and Pearl (2005) account like so:


Halpern (2008). $\vec{G}$ and $\vec{g}$ are suitable iff, for all vectors of variables $\vec{P}$ not in $\vec{G}$, and any subvector $\vec{H}$ of $\vec{G}$, $$ \mathbb{M}[\vec{H} \to \vec{g}(\vec{H}), \vec{P} \to \vec{u}(\vec{P}), C \to c] \models E=e $$ and, in addition, there is some assignment of values to the variables in the model such that, in that assignment, $\vec{G} = \vec{g}$ and $C = c'$, and that assignment is at least as normal, or typical as the variable assignment of the original model $\mathbb{M}$.


This definition requires us to outfit our causal models with a ranking over assignments of values to all of the variables in $\mathbb{U} \cup \mathbb{V}$. There will be complicated questions about which variable values are more normal than which others; however, if we restrict our attention to simple neuron diagrams, we can at least rest assured that everybody seems to agree that it is more normal or typical for a neuron to not fire than it is for it to fire. If we assume that $A$'s not firing is more normal than $A$'s firing, then Halpern (2008) tells us that the counterfactual setting $A=1$ in Bogus Prevention is not suitable; and, therefore, that $C=1$ did not cause $E=0$.

Notice that, if a counterfactual setting $\vec{G} = \vec{g}$ is suitable according to Halpern (2008), then it will automatically be suitable according to Halpern and Pearl (2005). So, if $C=c$ caused $E=e$ according to Halpern (2008), then $C=c$ caused $E=e$ according to Halpern and Pearl (2005).

For further discussion of these accounts, see chapters 7 and 8 of my Seminar Notes for Causality.

2. Counterfactual Counterfactual Accounts Reverse Causal Judgments in Model Reductions

Recall the Lewisian neuron diagram of a case of preemption.

We may model this neuron diagram with the following system of structural equations (where the variables have the natural interpretation, with $1$ corresponding to firing and $0$ corresponding to not firing):

$$ \begin{aligned} E &:= B \vee D \\\
D &:= C \\\
B &:= A \wedge \neg C \end{aligned} $$

(The context is just $C=1$ and $A=1$.) Let’s call this model “$\mathbb{M}$”. In the original neuron diagram, $C$'s firing is a cause of $E$'s firing. So we should want our theory of singular causation to tell us that, in this causal model, $C=1$ is a cause of $E=1$. Getting cases like this right is non-negotiable for a theory of singular causation. And, fortunately, counterfactual counterfactual accounts like Hitchcock’s (2001) and Halpern and Pearl’s (2005) are capable of saying that $C=1$ is a cause of $E=1$ in the causal model above. To deliver this verdict, those theories let $\vec{G} = (B)$, with $\vec{g} = (0)$. As you may verify for yourself, Hitchcock (2001), Halpern and Pearl (2005), and Halpern (2008) all deem this choice suitable. But then, $$ \mathbb{M}[B \to 0] \models C = 0 \hspace{4pt}\Box\hspace{-4pt}\to E = 0 $$ That is: in the counterfactual scenario where $B$'s value is held fixed at $0$, had $C$ not fired, $E$ would not have fired either. So, Counterfactual Counterfactual deems $C=1$ a cause of $E=1$.

Note that the following is an exogenous reduction of this model in which we have excised the exogenous variable $A$ by substituting $1$ for $A$ in $B$'s structural equation. $$ \begin{aligned} E &:= B \vee D \\\
D &:= C \\\
B &:= \neg C \end{aligned} $$ Call the resulting model “$\mathbb{M}_A$”. The endogenous variable set of $\mathbb{M}_A$ is non-empty and the equation for $B$ is still surjective, so this is a valid exogenous reduction. By our principle Valid Exogenous Reduction Preserves Correctness (see the previous post), $\mathbb{M}_A$ is correct if the original model $\mathbb{M}$ was.

Given $\mathbb{M}_A$, we may excise the endogenous variable $B$ by removing $B$'s structural equation and substituting $\neg C$ for $B$ in $E$'s structural equation. $$ \begin{aligned} E &:= \neg C \vee D \\\
D &:= C
\end{aligned} $$ Call the resulting model “${\mathbb{M}_A}_B$”. $B$ is not a collider in $\mathbb{M}_A$, so this is a valid endogenous reduction of $\mathbb{M}_A$. By our principle Valid Endogenous Reduction Preserves Correctness (see the previous post), ${\mathbb{M}_A}_B$ is correct if $\mathbb{M}_A$ was.

However, just considering this model, Counterfactual Counterfactual tells us that $C=1$ is not a cause of $E=1$. For the only possible choices of $\vec{G}$ are the empty vector and the singleton vector $(D)$. Since $E=1$ does not counterfactually depend upon $C=1$, the empty vector does not witness $C=1$'s causing $E=1$. And $$ {\mathbb{M}_A}_B[D \to 1] \models C=0 \hspace{4pt}\Box\hspace{-4pt}\to E=1 $$ So there is no counterfactual dependence between $E=1$ and $C=1$ in the counterfactual scenario in which $D$ is held fixed at $1$. And $$ {\mathbb{M}_A}_B[D \to 0] \models C=0 \hspace{4pt}\Box\hspace{-4pt}\to E=1 $$ So there is no counterfactual dependence between $E=1$ and $C=1$ in the counterfactual scenario in which $D$ is held fixed at $0$. So there is no counterfactual dependence between $E=1$ and $C=1$ period. So they are not causally related, according to Counterfactual Counterfactual.
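The contrast between the original preemption model and its reduction can be checked by brute force. The sketch below assumes binary variables and deliberately ignores every suitability condition: it simply lists all settings $\vec{G} = \vec{g}$ of the other variables under which flipping $C$ changes $E$ from its actual value. So it only bounds what any instance of the schema could say. The helper names (`solve`, `dependence_settings`) are mine.

```python
from itertools import combinations, product


def solve(equations, context):
    """Evaluate an acyclic model: keep applying equations whose parents are known."""
    values = dict(context)
    pending = dict(equations)
    while pending:
        ready = {}
        for var, phi in pending.items():
            try:
                ready[var] = int(phi(values))
            except KeyError:
                pass  # some parent of `var` has no value yet
        if not ready:
            raise ValueError("model is cyclic or a parent is missing")
        values.update(ready)
        for var in ready:
            del pending[var]
    return values


def dependence_settings(equations, context, cause, effect):
    """All settings G = g (suitable or not) in which flipping `cause`
    moves `effect` away from its actual value."""
    actual = solve(equations, context)
    others = [v for v in actual if v not in (cause, effect)]
    found = []
    for k in range(len(others) + 1):
        for G in combinations(others, k):
            for g in product((0, 1), repeat=k):
                setting = dict(zip(G, g))
                flipped = {**setting, cause: 1 - actual[cause]}
                eqs = {v: f for v, f in equations.items() if v not in flipped}
                ctx = {**context, **flipped}
                if solve(eqs, ctx)[effect] != actual[effect]:
                    found.append(setting)
    return found


# Original preemption model, context A = C = 1.
full = {
    "E": lambda v: v["B"] or v["D"],
    "D": lambda v: v["C"],
    "B": lambda v: v["A"] and not v["C"],
}
# The model after excising A and then B, context C = 1.
reduced = {
    "E": lambda v: (not v["C"]) or v["D"],
    "D": lambda v: v["C"],
}

print({"B": 0} in dependence_settings(full, {"A": 1, "C": 1}, "C", "E"))  # True
print(dependence_settings(reduced, {"C": 1}, "C", "E"))                   # []
```

Because the reduced model yields no setting at all, no choice of suitability conditions could recover the verdict that $C=1$ caused $E=1$ there.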

What we’ve just seen is that, if we accept the principles on valid model reduction from the previous post, then the verdicts of a theory like Counterfactual Counterfactual vary from correct model to correct model. Above, we relied upon both the principle Valid Exogenous Reduction Preserves Correctness and Valid Endogenous Reduction Preserves Correctness. However, we can get Halpern (2008) to flip its verdict by just excising an exogenous variable from a correct causal model.

Consider the following neuron diagram.

Here’s how to read this diagram: if $B$ fires at $t_1$, then it will cancel out any one signal sent from $A$ or $C$. So, if $B$ fires and exactly one of $A$ and $C$ fires, then $E$ will not fire. If $B$ fires and both $A$ and $C$ fire, then $E$ will fire. And, if $B$ doesn’t fire, then $E$ will fire iff at least one of $A$ and $C$ fires.

We can represent this neuron diagram with the following structural equation. $$ E := (\neg B \wedge (A \vee C)) \vee (B \wedge (A \wedge C)) $$ (The context is $A=C=1$ and $B=0$.) Call the causal model containing these variables, this context, and this equation “$\mathbb{M}$”. Given the natural assumption that not firing is more normal, or typical, than firing, Halpern (2008) tells us that $C$'s firing ($C=1$) is a cause of $E$'s firing ($E=1$) in $\mathbb{M}$. That’s because the variable assignment in which none of the neurons fire is more normal than the actual variable assignment, and this is an assignment in which $A=C=0$. So, Halpern (2008) tells us that the counterfactual setting $A=0$ is suitable; and, in this counterfactual setting, whether $E=1$ counterfactually depends upon whether $C=1$. $$ \mathbb{M}[A \to 0] \models C = 0 \hspace{4pt}\Box\hspace{-4pt}\to E = 0 $$

So, $C=1$ caused $E=1$. Now, I don’t think that this verdict is a desideratum of a theory of causation. I, like Lewis and Mackie, am content with an account which says that, while neither $A=1$ nor $C=1$ individually caused $E=1$, the disjunction (or the fusion, or what-have-you) of $A=1$ and $C=1$ did. However, I am also content with an account according to which $C=1$ was a cause of $E=1$. And that is how Halpern (2008), like Hitchcock (2001) and Halpern and Pearl (2005), comes down on this case.

Suppose that we excise the exogenous variable $A$ from this model. This gives us the new model $\mathbb{M}_A$, which contains the variables $C, B,$ and $E$, and the structural equation $$ E := \neg B \vee C $$ The resulting endogenous variable set is non-empty, and the resulting structural equation is surjective, so this exogenous reduction is valid. By our principle Valid Exogenous Reduction Preserves Correctness, $\mathbb{M}_A$ is correct if $\mathbb{M}$ was.

Now, while Halpern (2008) said that $C=1$ caused $E=1$ in $\mathbb{M}$, it reverses this judgment in $\mathbb{M}_A$. That’s because, in the actual context $B=0$, whether $E=1$ does not counterfactually depend upon whether $C=1$. And while, in the counterfactual setting $B=1$, whether $E=1$ does counterfactually depend upon whether $C=1$, $$ \mathbb{M}_A[B \to 1] \models C=0 \hspace{4pt}\Box\hspace{-4pt}\to E=0 $$ this counterfactual setting is not suitable, according to Halpern (2008). For having $B$ fire is less normal than having $B$ not fire. (Or, if we reject this normality ranking, for whatever reason, this calls into question whether the account is capable of getting the right verdict in Bogus Prevention.)
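The algebra behind this example is easy to check by enumeration. The following sketch assumes binary variables; the function names `e_full` and `e_reduced` are just labels for the two structural equations for $E$, before and after substituting $A=1$. It confirms the reduced equation and the pattern of dependence on $C$ in each setting discussed above. It says nothing about normality rankings, which have to be supplied separately.

```python
from itertools import product


def e_full(a, b, c):
    """E := (~B & (A v C)) v (B & (A & C)), the equation in the original model."""
    return int((not b and (a or c)) or (b and (a and c)))


def e_reduced(b, c):
    """E := ~B v C, the equation after excising A in the context A = 1."""
    return int((not b) or c)


# Substituting A = 1 into the original equation gives the reduced one.
same = all(e_full(1, b, c) == e_reduced(b, c) for b, c in product((0, 1), repeat=2))
print(same)  # True

# Does E's value change when C is flipped, holding the other inputs fixed?
print(e_full(0, 0, 0) != e_full(0, 0, 1))    # True:  original model, A -> 0, B = 0
print(e_reduced(0, 0) != e_reduced(0, 1))    # False: reduced model, actual B = 0
print(e_reduced(1, 0) != e_reduced(1, 1))    # True:  reduced model, B -> 1
```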

So, again, if we accept the principle that Valid Exogenous Reduction Preserves Correctness, then the verdicts of Halpern (2008) vary from correct model to correct model.

2017, Oct 1

When Can Variables Be Safely Removed from a Causal Model?

Much of our causal talk consists of sentences of the form “c caused e”, where both c and e are token, non-repeatable events or facts or what-have-you (there will be disagreement about what kinds of things ‘c’ and ‘e’ denote, but for now, I’ll just call them ‘events’). Let’s call the kinds of causal relations we’re talking about with sentences like those ‘singular causal relations’. The topic of causation is not exhausted by singular causal relations. There are other interesting causal notions which are clearly distinct from (though they may bear interesting relations to) singular causation. For instance, “Smoking causes cancer” is not a singular causal claim, but rather a general causal claim, relating not token events but rather general types of events.

Many have become convinced that the best way to theorize about singular causation is by understanding it within the context of some explicitly represented system of causal determination. Causal determination is a third causal notion, distinct from both singular and general causation. Even though it is incorrect to say that the power being on caused the light to not be illuminated, it is nevertheless true that whether the light is illuminated is causally determined by whether the power is on and the switch is up. And, though one could infer that the power is on from the fact that the light is illuminated, it would be incorrect to say that whether the power is on is causally determined by whether the light is illuminated. To say this would be to get the direction of causal determination the wrong way ‘round.

Those who think that systems of causal determination have an important role to play in a theory of singular causation typically think that these systems of causal determination may be represented with systems of structural equations. A system of structural equations is a particular kind of model of a network of causal determination. The model consists of a vector of variables (see section 1 of this post for more on how to think about variables), together with a vector of structural equations. For instance, we may introduce a variable $P$ for whether the power is on (at the relevant place and the relevant time). This variable takes on the value $1$ if the power is on, takes on the value $0$ if the power is off, and is undefined otherwise. We may similarly introduce a variable $L$ for whether the light is illuminated, a variable which takes on the value $1$ if the light is illuminated, takes on the value $0$ if the light is not illuminated, and is undefined otherwise (if, e.g., the light doesn’t exist). And we may introduce a variable $S$ for whether the switch is up or not (again, $1$ if it’s up, $0$ if it’s down, undefined otherwise). A structural equation then tells us how the value of $L$ is causally determined by the values of $P$ and $S$. In particular, it tells us that \begin{equation}
L := P \wedge S \label{1}\tag{1} \end{equation} (Here, “$\wedge$” is just the truth-function ‘and’.) So, $L$ will take on the value $1$ iff both $P$ and $S$ take on the value $1$. If either $P$ or $S$ is $0$, then $L$ will take on the value $0$ as well. When we combine multiple structural equations, we can get a system of structural equations. These systems of equations represent networks of causal determination out in the world. For instance, suppose that whether the power is on is structurally determined by whether the light switch is up. If the light switch is up, then the power is on, and if the light switch is down, then the power is off.
\begin{equation} P := S \tag{2}\label{2} \end{equation} Combining the structural equations \eqref{1} and \eqref{2} gives us a system of structural equations \begin{aligned} L &:= P \wedge S \\\
P &:= S
\end{aligned}

What makes this system of equations structural is that we are interpreting them causally. The equations don’t just say that there is a certain relationship between the values of $L, P,$ and $S$. They additionally say that the value of $L$ is causally determined by the values of $P$ and $S$; and that the value of $P$ is causally determined by the value of $S$. It is for this reason that I use the asymmetric relation “$:=$”, and not the symmetric relation “$=$”. For instance, it follows from the system of equations consisting of \eqref{1} and \eqref{2} that $S=1$ iff $L=1$; so the equation $$ S = L $$ will be true if the system of structural equations $($\eqref{1}, \eqref{2}$)$ is. However, it will be false that $$ S := L $$ For, even though the value of $S$ must match the value of $L$, it is not the case that the value of $S$ is causally determined by the value of $L$. It is this additional information which is conveyed by the symbol $:=$.

In a structural equation, there is exactly one, dependent variable on the left-hand-side of the equation, and at least one independent variable on the right-hand-side. I’ll use “$\mathbf{PA}(V)$” to represent a vector of the independent variables on the right-hand-side of $V$'s structural equation. (It is common to refer to these variables as $V$'s causal parents.) Then, a structural equation is of the form $$ V := \phi_V(\mathbf{PA}(V)) $$ where $\phi_V$ is some function from the values of the variables in $\mathbf{PA}(V)$ to the values of $V$. I will insist, by the way, that $\phi_V$ be surjective if we are to interpret it causally. I will use “$\phi_V$” to represent the entire structural equation $V := \phi_V(\mathbf{PA}(V))$. If a variable appears on the left-hand-side of a structural equation, then that variable is endogenous. Otherwise, it is exogenous.

What I will call a causal model, $\mathbb{M}$, consists of a vector of exogenous variables $\mathbb{U}$, a vector of endogenous variables $\mathbb{V}$, a vector of structural equations $\mathbb{E}$, and a context, $\vec{u}$, which is an assignment of values to the exogenous variables in $\mathbb{U}$.


Causal Model A causal model $\mathbb{M}$ is a 4-tuple $$ \mathbb{M} = \langle \mathbb{U}, \mathbb{V}, \mathbb{E}, \vec{u} \rangle $$ of

  1. A (non-empty) vector $\mathbb{U}$ of exogenous variables, $( U_1, U_2, \dots, U_M )$.
  2. A (non-empty) vector $\mathbb{V}$ of endogenous variables, $ ( V_1, V_2, \dots, V_N)$.
  3. A vector $\mathbb{E}$ of structural equations, $ ( \phi_1, \phi_2, \dots, \phi_N) $, one for each endogenous variable $V_i \in \mathbb{V}$.
  4. A context $\vec{u} = ( u_1, u_2, \dots, u_M )$, which assigns a value to each exogenous variable $U_i \in \mathbb{U}$.

(This is a slightly non-standard presentation. Normally, the context is not taken to be a part of the causal model.)

Given a causal model, we may generate a causal graph by creating a node for every variable and placing an arrow (a directed edge) between two variables $U$ and $V$, with its tail at $U$ and its head at $V$, $U \to V$, iff $U$ appears on the right-hand-side of $V$'s structural equation. For instance, the causal model of the light, the power, and the switch, determines this causal graph:

For a more careful and thorough introduction to causal models, and a theory of when they are correct—that is, when they correctly represent relations of causal determination out in the world—see section 2 of this paper.

When Removing Exogenous Variables Preserves Correctness

Suppose that we have the causal model introduced above, with the context $S=1$ (the switch is actually up). It appears that we can excise the exogenous variable $S$ from this model entirely. We may simply take $S$'s actual value $1$ and plug it into all the structural equations in which the variable $S$ appeared. When we do this, the structural equation associated with $P$ no longer depends upon any variables, and simply says that $P := 1$. That is: the effect of removing the exogenous variable $S$ has been to render $P$ exogenous. And, when we remove the exogenous $S$, the structural equation associated with $L$ becomes $L := P \wedge 1$, or just $L := P$.

We therefore get the causal model $\mathbb{M}$ with the exogenous variable $S$ excised. This is $$ \mathbb{M}_{S} = \langle (P), (L), (L := P), (1) \rangle. $$ That is: $\mathbb{M}_S$ consists of the vector of exogenous variables $\mathbb{U} = (P)$, the vector of endogenous variables $\mathbb{V}=(L)$, the vector of structural equations $\mathbb{E} = (L := P)$, and the exogenous assignment $\vec{u} = (1)$ to $P$.

I think that, if the original model $\mathbb{M}$ was correct, then so too is $\mathbb{M}_S$. This follows from a counterfactual understanding of what makes a causal model correct, since the counterfactuals entailed by the new model $\mathbb{M}_S$ are a proper subset of the counterfactuals entailed by the old model $\mathbb{M}$. Given some plausible assumptions, it also follows from my own preferred way of understanding what makes a causal model correct.

No causal model represents all of the features of reality which could potentially make a difference with respect to the values of the variables in the model. In every causal model, we will be taking for granted certain features of, or causal precursors to, the system being modeled. If I want to model the causal determinants of the forest fire, I needn’t explicitly include a variable for the presence of oxygen. So long as there is plenty of oxygen in the atmosphere, it may be true that whether there is a fire is causally determined by whether the lightning struck. Similarly, so long as the light switch is actually up, whether the light is illuminated is causally determined by whether the power is on or off.

In general, if $U$ is an exogenous variable in the causal model $\mathbb{M}$, we can define the $U$-reduction of $\mathbb{M}$ to be what you get when you remove $U$ from $\mathbb{U}$, put into $\mathbb{U}$ any variables in the model which were causally determined by $U$ alone (and remove those variables from $\mathbb{V}$), replace $U$ with its value in the context $\vec{u}$ within every equation in $\mathbb{E}$ (except, of course, for those endogenous variables $V$ which were causally determined by $U$ alone), and update the context $\vec{u}$ appropriately.


Exogenous $U$-Reduction. Given a causal model $\mathbb{M} = \langle \mathbb{U, V, E}, \vec{u} \rangle$, and some $U \in \mathbb{U}$, the $U$-reduction of $\mathbb{M}$, $\mathbb{M}_U$, is $\langle \mathbb{U}_U, \mathbb{V}_U, \mathbb{E}_U, \vec{u}_U \rangle$, where

  1. $\mathbb{U}_U$ is the vector of previously exogenous variables, minus $U$, and plus any endogenous variables whose values were determined by $U$ alone.
  2. $\mathbb{V}_U$ is the vector of previously endogenous variables, minus any whose values were determined by $U$ alone.
  3. $\mathbb{E}_U$ is a vector of structural equations. For each endogenous variable $V$ in $\mathbb{V}_U$, there is exactly one structural equation, which is the result of taking $V$'s old structural equation in $\mathbb{E}$, and replacing the variable $U$ wherever it appears (if at all) with $U$'s value in the context $\vec{u}$.
  4. $\vec{u}_U$ is an assignment of values to the variables in $\mathbb{U}_U$ which matches $\vec{u}$ for all exogenous variables previously in $\mathbb{U}$; for those newly exogenous variables, $V$, the assignment in $\vec{u}_U$ is the one determined by taking $V$'s old structural equation in $\mathbb{E}$ and replacing the variable $U$ with $U$'s value in the old context $\vec{u}$.

However, while we can safely remove the exogenous variable $S$ from $\mathbb{M}$ in the context $S=1$, we cannot remove $S$ in the context $S=0$. If we try to do so, we will end up with the structural equation $L := P \wedge 0$. But this equation tells us that $L$'s value does not depend upon $P$'s value. No matter what value $P$ takes on, $L$ will take on the value $0$. So the resulting model would say, falsely, that $P$ does not causally determine $L$.

The right way to think about this, I believe, is that some $U$-reductions will lead to models which violate necessary conditions on the correctness of causal models. In particular, in order for a structural equation $\phi_V$ to be correct, every value of the left-hand-side variable $V$ must be in the image of $\phi_V$. That is to say: only surjective functions may appear in correct structural equations. And, in order for a causal model to be correct, all of the structural equations it contains must be correct. So, in the context $S=0$, removing the exogenous variable $S$ renders the structural equation $L := P \wedge 0$ non-surjective. Such $U$-reductions are not valid.

Similarly, in order for a causal model to be correct, the vector of endogenous variables $\mathbb{V}$ must be non-empty. Some $U$-reductions will violate this necessary condition on correctness. For instance, consider the $S$-reduced model discussed above. If we try to $U$-reduce this model by excising the exogenous variable $P$, the resulting model, $\mathbb{M}_{S, P}$, will have no endogenous variables. $U$-reductions like these are not valid, either.

In general, we may say that a $U$-reduction is valid iff (1) the resulting endogenous variable set is non-empty, and (2) the resulting structural equations are all surjective.
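Here is a rough sketch of exogenous $U$-reduction and the validity test, under simplifying assumptions of my own: all variables are binary, and each structural equation is stored as a pair of a parent tuple and a function over a dictionary of values. The names `exogenous_reduction`, `is_surjective`, and `is_valid` are hypothetical helpers, not established terminology.

```python
from itertools import product


def fix_input(phi, var, value):
    """Return phi with the argument `var` held fixed at `value`."""
    return lambda v: phi({**v, var: value})


def exogenous_reduction(context, equations, U):
    """Excise exogenous U: plug its actual value into every equation, and make
    exogenous any variable that was determined by U alone."""
    u = context[U]
    new_context = {X: x for X, x in context.items() if X != U}
    new_equations = {}
    for V, (parents, phi) in equations.items():
        if U not in parents:
            new_equations[V] = (parents, phi)
            continue
        rest = tuple(p for p in parents if p != U)
        if rest:
            new_equations[V] = (rest, fix_input(phi, U, u))
        else:
            new_context[V] = int(phi({U: u}))  # V was determined by U alone
    return new_context, new_equations


def is_surjective(parents, phi):
    """Over binary parents, does phi take on both of the values 0 and 1?"""
    image = {int(phi(dict(zip(parents, vals))))
             for vals in product((0, 1), repeat=len(parents))}
    return image == {0, 1}


def is_valid(new_equations):
    """Valid iff some endogenous variable remains and every equation is surjective."""
    return bool(new_equations) and all(
        is_surjective(parents, phi) for parents, phi in new_equations.values())


light = {
    "L": (("P", "S"), lambda v: v["P"] and v["S"]),
    "P": (("S",), lambda v: v["S"]),
}

ctx_up, eqs_up = exogenous_reduction({"S": 1}, light, "S")
print(ctx_up, list(eqs_up), is_valid(eqs_up))        # {'P': 1} ['L'] True

ctx_down, eqs_down = exogenous_reduction({"S": 0}, light, "S")
print(is_valid(eqs_down))                            # False: L := P & 0 is constant

ctx_none, eqs_none = exogenous_reduction(ctx_up, eqs_up, "P")
print(is_valid(eqs_none))                            # False: no endogenous variables left
```

The three calls mirror the three cases in the text: removing $S$ in the context $S=1$, removing $S$ in the context $S=0$, and removing $P$ from the already $S$-reduced model.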

If a $U$-reduction is valid, then the $U$-reduced model is correct if the original model was. Valid $U$-reduction preserves correctness.


Valid Exogenous Reduction Preserves Correctness. If $\mathbb{M}$ is a correct causal model, and $\mathbb{M}_U$ is a valid exogenous $U$-reduction of $\mathbb{M}$ (i.e., if $\mathbb{M}_U$ is both valid and a $U$-reduction of $\mathbb{M}$), then $\mathbb{M}_U$ is a correct causal model, too.


I have previously laid down conditions for the correctness of causal models. Valid Exogenous Reduction Preserves Correctness is not intended as a conjecture about those correctness conditions. I know that my account, as it stands now, violates this principle (the curious may consider the $H$-reduction of the causal model in figure 8 of that paper). Valid Exogenous Reduction Preserves Correctness is intended to supplement that account. The principle allows you to move from the correctness of one causal model to the correctness of a certain sub-model, even if the sub-model was not previously deemed correct on its own.

When Removing Endogenous Variables Preserves Correctness

Go back to our original causal model of the light switch, the power, and the light, \begin{aligned} L &:= P \wedge S \\\
P &:= S
\end{aligned} Just as it appeared that we could excise the exogenous variable $S$ from this model, so too does it appear that we may excise the variable $P$ from this model. Since we know that the power turns on whenever the light switch is on; and since we know that, if both the power and the switch are on, the light will be illuminated, it appears that we may conclude straightaway that, if the switch is on, then the light will be illuminated. Moreover, the switch’s being on appears to causally determine the light’s being illuminated. So it seems that, if the original causal model was correct, then so too should be the model $$ \mathbb{M}_P = \langle (S), (L), (L := S), (1) \rangle $$ This is the model containing the sole exogenous variable $S$, the sole endogenous variable $L$, the sole structural equation $L := S$, and the exogenous assignment $1$ to $S$. Call this model the endogenous $P$-reduction of $\mathbb{M}$. We got $\mathbb{M}_P$ from $\mathbb{M}$ by simply replacing the variable $P$ in $L$'s structural equation with the right-hand-side of $P$'s structural equation, giving $L := S \wedge S$. And this function is equivalent to $L := S$.

What’s more, it appears as though we can carry out this endogenous reduction of $\mathbb{M}$ whatever the value of $S$ happens to be. Even if $S = 0$, it will still be the case that $L$'s value will be causally determined to match $S$'s value.

In general, if $V$ is an endogenous variable in the causal model $\mathbb{M}$, we can define the $V$-reduction of $\mathbb{M}$ to be what you get when you remove $V$ from $\mathbb{V}$, and replace $V$, every time it appears on the right-hand-side of a structural equation, with the right-hand-side of $V$'s own structural equation.


Endogenous $V$-Reduction. Given a causal model $\mathbb{M} = \langle \mathbb{U, V, E}, \vec{u} \rangle$, and some $V \in \mathbb{V}$, the $V$-reduction of $\mathbb{M}$, $\mathbb{M}_V$, is $\langle \mathbb{U}, \mathbb{V}_V, \mathbb{E}_V, \vec{u} \rangle$, where

  1. $\mathbb{V}_V$ is the original vector of endogenous variables $\mathbb{V}$, minus the variable $V$.
  2. $\mathbb{E}_V$ is just like the original vector of structural equations, except that it is lacking $V$'s structural equation $V := \phi_V( \mathbf{PA}(V) )$, and every occurrence of $V$ on the right-hand-side of the remaining equations is replaced with $\phi_V( \mathbf{PA}(V) )$.
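The definition of endogenous $V$-reduction amounts to function composition, which the following sketch tries to make concrete. As before, it assumes binary variables and stores each equation as a pair of a parent tuple and a function over a dictionary of values; `substitute` and `endogenous_reduction` are illustrative names of my own.

```python
def substitute(phi, var, phi_var):
    """Return phi with `var` computed from phi_var instead of read from the input."""
    return lambda v: phi({**v, var: phi_var(v)})


def endogenous_reduction(equations, V):
    """Excise endogenous V: drop V's equation and replace V, wherever it occurs
    as a parent, with the right-hand side of V's own equation."""
    parents_V, phi_V = equations[V]
    reduced = {}
    for W, (parents, phi) in equations.items():
        if W == V:
            continue
        if V in parents:
            kept = [p for p in parents if p != V]
            # V's parents become W's parents; dict.fromkeys removes duplicates
            merged = tuple(dict.fromkeys(kept + list(parents_V)))
            reduced[W] = (merged, substitute(phi, V, phi_V))
        else:
            reduced[W] = (parents, phi)
    return reduced


light = {
    "L": (("P", "S"), lambda v: v["P"] and v["S"]),
    "P": (("S",), lambda v: v["S"]),
}

reduced = endogenous_reduction(light, "P")
parents_L, phi_L = reduced["L"]
print(parents_L)                                   # ('S',)
print([int(phi_L({"S": s})) for s in (0, 1)])      # [0, 1], i.e. L := S
```

Note that the context plays no role in this operation, which matches the observation that the reduction of the light model goes through whatever the value of $S$ happens to be.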

While we can safely remove the endogenous variable $P$ in our model of the light and the switch, we may not always do this. Some endogenous $V$-reductions are valid; others are not. For instance, consider the Lewisian neuron diagram shown below.

The neuron diagram displays a case of what’s known in the literature as early preemption. Neuron $A$'s firing would have caused $E$ to fire, but it was preempted by neuron $C$'s firing. As things actually shook out, it was $C$, and not $A$, that caused $E$ to fire. I’ll suppose that this neuron diagram may be represented with a causal model containing a binary variable for every neuron, where those variables take the value $1$ if the neuron fires at its designated time, and take the value $0$ if the neuron does not fire at its designated time. Then, we will end up with the following system of structural equations.

$$\begin{aligned} E &:= B \vee D \\\
B &:= A \wedge \neg C \\\
D &:= C \end{aligned}$$

The endogenous $D$-reduction of this causal model is

$$\begin{aligned} E &:= B \vee C \\\
B &:= A \wedge \neg C \\\
\end{aligned}$$

And the endogenous $B$-reduction of this $D$-reduced model is

$$ E := (A \wedge \neg C) \vee C $$

Or, equivalently,

$$ E := A \vee C $$

But this model treats $A$ and $C$ symmetrically. And both $A$ and $C$ take on the value $1$. This means that any theory of singular causation which looks only at the patterns of counterfactual dependence in a causal model (including, perhaps, information about which variable values are default and which are deviant) will, when applied to this model, say that $A=1$ caused $E=1$ iff $C=1$ caused $E=1$. But this would be a disastrous result, for $A=1$ did not cause $E=1$, while $C=1$ did cause $E=1$.

Lesson: if we want to use correct causal models to uncover relations of singular causation, then we had better not think that endogenous reduction always preserves correctness.
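Two claims about the preemption model, once both $D$ and $B$ have been reduced away, can be checked by brute force: that its single equation agrees with the original equations on every assignment to $A$ and $C$, and that it treats $A$ and $C$ symmetrically. Here is a minimal sketch in Python; the function names are my own labels for the equations above, nothing more.

```python
from itertools import product


def e_original(a: int, c: int) -> int:
    """E as computed by the original three equations."""
    b = int(a and not c)  # B := A and not C
    d = c                 # D := C
    return int(b or d)    # E := B or D


def e_reduced(a: int, c: int) -> int:
    """E as computed by the single reduced equation E := A or C."""
    return int(a or c)


assignments = list(product((0, 1), repeat=2))

# The reduced equation agrees with the original on every assignment.
same_function = all(e_original(a, c) == e_reduced(a, c) for a, c in assignments)

# Swapping A and C never changes the reduced equation's output.
symmetric = all(e_reduced(a, c) == e_reduced(c, a) for a, c in assignments)

# In the original model the two routes to E differ when A = C = 1:
b_actual = int(1 and not 1)  # B = 0, so the route from A through B is inactive
d_actual = 1                 # D = 1, so the route from C through D is active
```

Both `same_function` and `symmetric` come out `True`. The asymmetry between $A$ and $C$ is visible only in the intermediate values of $B$ and $D$, which the reduced model no longer contains.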

A similar lesson follows when we look at cases of preemptive prevention like the one shown below.

Here, $B$'s firing prevents $E$ from firing. However, had $B$ not fired, $A$ would have prevented $E$ from firing. So, $B$'s firing preempted $A$'s prevention. We can represent this neuron diagram with the following system of equations (where the variables are given the natural interpretation, and take on the value $1$ if the associated neuron fires, and take on the value $0$ if the associated neuron does not fire).

$$\begin{aligned} E &:= C \wedge \neg (B \vee D ) \\\
D &:= A \wedge \neg B \end{aligned}$$

The endogenous $D$-reduction of this causal model gives the sole structural equation $$ E := C \wedge \neg (B \vee (A \wedge \neg B)) $$ Or, equivalently, $$ E := C \wedge \neg A \wedge \neg B $$ However, this reduced model treats $A$ and $B$ symmetrically, and both $A$ and $B$ take on the value $1$; any theory which looks only at patterns of counterfactual dependence in correct causal models will therefore say that $A=1$ prevented $E=1$ iff $B=1$ prevented $E=1$. But $B=1$ prevented $E=1$ while $A=1$ did not. So, again, if we want to use correct causal models to uncover relations of singular causation, then we had better not think that endogenous reduction always preserves correctness.
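The equivalence used in the preemptive prevention case, and the symmetry of the reduced equation in $A$ and $B$, can be verified the same way. Again a rough sketch in Python, with function names that are merely my own labels for the equations in the text:

```python
from itertools import product


def e_with_d(a: int, b: int, c: int) -> int:
    """E as computed by the original two equations."""
    d = int(a and not b)            # D := A and not B
    return int(c and not (b or d))  # E := C and not (B or D)


def e_reduced(a: int, b: int, c: int) -> int:
    """E as computed by the simplified equation E := C and not A and not B."""
    return int(c and not a and not b)


rows = list(product((0, 1), repeat=3))

# The simplified equation agrees with the original on all eight assignments.
equivalent = all(e_with_d(*row) == e_reduced(*row) for row in rows)

# Swapping A and B never changes the simplified equation's output.
symmetric_in_a_b = all(e_reduced(a, b, c) == e_reduced(b, a, c) for a, b, c in rows)

# With A = B = 1, the original equations give D = 0.
d_actual = int(1 and not 1)
```

Both checks come out `True`, and `d_actual` is `0`. So the information that distinguishes $B$'s role from $A$'s is carried by $D$, and it disappears once $D$ is substituted away.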

I’d like to suggest that precisely the same thing goes wrong in both of the foregoing cases of endogenous variable reduction. In the first case, the case of preemption, the endogenous $B$-reduction took us to a model in which $E$'s value is determined directly by both $A$ and $C$. In the associated causal graph of the $B$-reduced model, there is one arrow leading from $A$ to $E$, and another arrow leading from $C$ to $E$. The model presents these causal pathways as autonomous, with both $A$ and $C$ determining $E$'s value in a way that is independent of the other’s influence. However, $A$'s determination of $E$'s value is not autonomous of $C$'s. In fact, both $A$ and $C$ determine $E$'s value by way of a common variable, $B$.

Similarly, in the case of preemptive prevention, endogenous $D$-reduction brought us to a model in which $E$'s value is determined directly and autonomously by both $A$ and $B$. But the way that $A$ determines $E$'s value is not autonomous of the way that $B$ determines $E$'s value. In fact, both $A$ and $B$ determine $E$'s value by way of a common variable, $D$.

In the original causal models, variables like $B$ (in Preemption) and $D$ (in Preemptive Prevention) are called colliders. What makes a variable in a causal model a collider is that there are at least two distinct arrows leading into that variable. Equivalently, a variable is a collider iff it has more than one causal parent. (Note: “collider” is usually defined to be a path-relative notion; as I’m using the notion here, a variable is a collider iff it is a collider along some path or other.)

Reflection on cases like the foregoing leads to the following constraint on valid endogenous $V$-reduction: if the endogenous variable $V$ is a collider, then $V$-reduction is not valid. Colliders may not be removed in the manner specified in Endogenous $V$-Reduction. Now, I believe that this is the only constraint on valid endogenous reduction. So long as $V$ is not a collider, $V$ may be excised from the causal graph in the manner specified in Endogenous $V$-Reduction.

So we may say that, in general, a $V$-reduction is valid iff $V$ is a non-collider.
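The collider test is easy to mechanize. Here is a minimal sketch in Python, in which each model is given only by its parent sets, read off the structural equations above. The names `colliders` and `reduction_is_valid` are hypothetical helpers, and the validity test simply encodes the constraint proposed here.

```python
from typing import Dict, List, Tuple

# Assumed representation: each endogenous variable maps to the tuple of
# variables appearing on the right-hand side of its structural equation.
ParentSets = Dict[str, Tuple[str, ...]]


def colliders(parents: ParentSets) -> List[str]:
    """Variables with more than one causal parent."""
    return sorted(v for v, ps in parents.items() if len(set(ps)) > 1)


def reduction_is_valid(parents: ParentSets, v: str) -> bool:
    """The proposed constraint: a V-reduction is valid iff V is a non-collider."""
    return v not in colliders(parents)


# Preemption: E := B or D, B := A and not C, D := C
preemption = {"E": ("B", "D"), "B": ("A", "C"), "D": ("C",)}

# Preemptive prevention: E := C and not (B or D), D := A and not B
preemptive_prevention = {"E": ("C", "B", "D"), "D": ("A", "B")}

print(colliders(preemption))                           # ['B', 'E']
print(reduction_is_valid(preemption, "D"))             # True
print(reduction_is_valid(preemption, "B"))             # False
print(reduction_is_valid(preemptive_prevention, "D"))  # False
```

On this test the $D$-reduction in the preemption case counts as valid, since $D$ has the single parent $C$, while the $B$-reduction in that case and the $D$-reduction in the preemptive prevention case do not.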

If a $V$-reduction is valid, then the $V$-reduced model is correct if the original model was. Valid $V$-reduction preserves correctness.


Valid Endogenous Reduction Preserves Correctness. If $\mathbb{M}$ is a correct causal model, and $\mathbb{M}_V$ is a valid $V$-reduction of $\mathbb{M}$ (i.e., if $V$ is not a collider in $\mathbb{M}$), then $\mathbb{M}_V$ is a correct causal model, too.


In the case of exogenous $U$-reduction, the corresponding principle (Valid Exogenous Reduction Preserves Correctness) carried with it a genuine extension of the conditions for correctness of causal models which I endorsed previously. In the case of endogenous $V$-reduction, reflection on cases like preemption and preemptive prevention calls for a corresponding constriction in those conditions. There are causal models, like the $D$-reduced model of figure 3, which my previous account deems correct but which are not correct. Valid Endogenous Reduction Preserves Correctness does not yet rule out models like those. To do so, we should additionally endorse:


Invalid Endogenous Reduction Destroys Correctness. If $\mathbb{M} = \langle \mathbb{U, V, E}, \vec{u} \rangle$ is a correct causal model, and $V \in \mathbb{V}$ is a collider in $\mathbb{M}$, then the $V$-reduction of $\mathbb{M}$, $\mathbb{M}_V$, is not correct.