# An equivalent formulation of the GCH

The continuum hypothesis CH is the assertion that the size of the power set of a countably infinite set $\aleph_0$ is the next larger cardinal $\aleph_1$, or in other words, that $2^{\aleph_0}=\aleph_1$. The generalized continuum hypothesis GCH makes this same assertion about all infinite cardinals, namely, that the power set of any infinite cardinal $\kappa$ is the successor cardinal $\kappa^+$, or in other words, $2^\kappa=\kappa^+$.

Yesterday I received an email from Geoffrey Caveney, who proposed to me the following axiom, which I have given a name.   First, for any set $F$ of cardinals, define the $F$-restricted power set operation $P_F(Y)=\{X\subseteq Y\mid |X|\in F\}$ to consist of the subsets of $Y$ having a cardinality allowed by $F$.  The only cardinals of $F$ that matter are those that are at most the cardinality of $Y$.

The Alternative GCH is the assertion that for every cardinal number $\kappa$, there is a set $F$ of cardinals such that the $F$-restricted power set $P_F(\kappa)$ has size $\kappa^+$.

Caveney was excited about his axiom for three reasons. First, a big part of his motivation for considering the axiom was the observation that the equation $2^\kappa=\kappa^+$ is simply not correct for finite cardinals $\kappa$ (other than $0$ and $1$) — and this is why the GCH makes the assertion only for infinite cardinals $\kappa$ — whereas the alternative GCH axiom makes a uniform statement for all cardinals, including the finite cardinals, and it gets the right answer for the finite cardinals. Specifically, for any natural number $n$, we can let $F=\{0,1\}$, and then note that $n$ has exactly $n+1$ many subsets of size in $F$. Second, Caveney had also observed that the GCH implies his axiom, since as we just mentioned, it is true for the finite cardinals and for infinite $\kappa$ we can take $F=\{\kappa\}$, using the fact that every infinite cardinal $\kappa$ has $2^\kappa$ many subsets of size $\kappa$ (we are working in ZFC). Third, Caveney had noticed that his axiom implies the continuum hypothesis, since in the case that $\kappa=\aleph_0$, there would be a family $F$ for which $P_F(\aleph_0)$ has size $\aleph_1$. But since there are only countably many finite subsets of $\aleph_0$, it follows that $F$ must include $\aleph_0$ itself, and so this would mean that $\aleph_0$ has only $\aleph_1$ many infinite subsets, and this implies CH.
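For concretely minded readers, the restricted power set is easy to experiment with on finite sets. Here is a quick illustrative sketch in Python (the function name `restricted_power_set` is my own), which checks the count of $n+1$ for $F=\{0,1\}$ just mentioned.

```python
from itertools import combinations

def restricted_power_set(Y, F):
    """P_F(Y): the subsets of Y whose cardinality lies in F."""
    Y = list(Y)
    return [set(c) for k in sorted(F) if k <= len(Y)
            for c in combinations(Y, k)]

# With F = {0, 1}, a set with n elements has exactly n + 1 subsets
# whose size lies in F: the empty set together with the n singletons.
for n in range(8):
    assert len(restricted_power_set(range(n), {0, 1})) == n + 1
```

Note that cardinals in $F$ exceeding $|Y|$ simply contribute nothing, in accordance with the remark that only the cardinals of $F$ at most $|Y|$ matter.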

To my way of thinking, the natural question to consider was whether Caveney’s axiom was actually weaker than GCH or not. At first I noticed that the axiom implies $2^{\aleph_1}=\aleph_2$ and similarly $2^{\aleph_n}=\aleph_{n+1}$, getting us up to $\aleph_\omega$. Then, after a bit I noticed that we can push the argument through all the way.

Theorem. The alternative GCH is equivalent to the GCH.

Proof. We’ve already argued for the converse implication, so it remains only to show that the alternative GCH implies the GCH. Assume that the alternative GCH holds.

We prove the GCH by transfinite induction. For the anchor case, we've shown already above that the GCH holds at $\aleph_0$, that is, that CH holds.

For the successor case, assume that the GCH holds at some $\delta$, so that $2^\delta=\delta^+$, and consider the case $\kappa=\delta^+$. By the alternative GCH, there is a family $F$ of cardinals such that $|P_F(\kappa)|=\kappa^+$. If every cardinal in $F$ were less than $\kappa$, then $P_F(\kappa)$ would have size at most $\kappa^{<\kappa}=(\delta^+)^\delta=(2^\delta)^\delta=2^\delta=\delta^+=\kappa$, which is too small. So $\kappa$ itself must be in $F$, and from this it follows that $\kappa$ has at most $\kappa^+$ many subsets of size $\kappa$, which implies $2^\kappa=\kappa^+$. So the GCH holds at $\kappa$, and we've handled the successor case.

For the limit case, suppose that $\kappa$ is a limit cardinal and the GCH holds below $\kappa$, so that $\kappa$ is a strong limit cardinal. By the alternative GCH, there is a family $F$ of cardinals for which $|P_F(\kappa)|=\kappa^+$. It cannot be that all cardinals in $F$ are less than the cofinality of $\kappa$, since in that case all the subsets of $\kappa$ in $P_F(\kappa)$ would be bounded in $\kappa$, and so $P_F(\kappa)$ would have size at most $\kappa$, since $\kappa$ is a strong limit. So there must be a cardinal $\mu$ in $F$ with $\newcommand\cof{\text{cof}}\cof(\kappa)\leq\mu\leq\kappa$. But in this case, it follows that $\kappa^\mu=\kappa^+$, and this implies $\kappa^{\cof(\kappa)}=\kappa^+$, since by König's theorem it is always at least $\kappa^+$, and it cannot be bigger if $\kappa^\mu=\kappa^+$. Finally, since $\kappa$ is a strong limit cardinal, it follows easily that $2^\kappa=\kappa^{\cof(\kappa)}$, since every subset of $\kappa$ is determined by its initial segments, and hence by a $\cof(\kappa)$-sequence of bounded subsets of $\kappa$, of which there are only $\kappa$ many. So we have established that $2^\kappa=\kappa^+$ in the limit case, completing the induction.
So we get all instances of the GCH.
QED

# The finite axiom of symmetry

At the conclusion of my talk today for the CUNY Math Graduate Student Colloquium, *Freiling’s axiom of symmetry, or throwing darts at the real line*, I had assigned an exercise for the audience, and so I’d like to discuss the solution here.

The axiom of symmetry asserts that if you assign to each real number $x$ a countable set $A_x\subset\mathbb{R}$, then there should be two reals $x,y$ for which $x\notin A_y$ and $y\notin A_x$.

Informally, if you have attached to each element $x$ of a large set $\mathbb{R}$ a certain comparatively small subset $A_x$, then there should be two independent points $x,y$, meaning that neither is in the set attached to the other.

The challenge exercise I had posed is to prove a finite version of this:

The finite axiom of symmetry. For each finite number $k$ there is a sufficiently large finite number $n$ such that for any set $X$ of size $n$ and any assignment $x\mapsto A_x$ of elements $x\in X$ to subsets $A_x\subset X$ of size $k$, there are elements $x,y\in X$ such that $x\notin A_y$ and $y\notin A_x$.

Proof. Suppose we are given a finite number $k$. Let $n$ be any number larger than $k^2+k$. Consider any set $X$ of size $n$ and any assignment $x\mapsto A_x$ of elements $x\in X$ to subsets $A_x\subset X$ of size at most $k$. Let $x_0,x_1,x_2,\dots,x_k$ be any $k+1$ many elements of $X$. The union $\bigcup_{i\leq k} A_{x_i}$ has size at most $(k+1)k=k^2+k$, and so by the choice of $n$ there is some element $y\in X$ not in any $A_{x_i}$. Since $A_y$ has size at most $k$, there must be some $x_i$ not in $A_y$. So $x_i\notin A_y$ and $y\notin A_{x_i}$, and we have fulfilled the desired conclusion. QED
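The smallest nontrivial instance can be checked exhaustively. Here is a sketch in Python (names mine), verifying the case $k=1$, $n=3$ of the theorem by trying every possible assignment, using the proof's formulation with sets of size at most $k$, and allowing $x=y$, as the statement does.

```python
from itertools import product, combinations

def has_symmetric_pair(X, A):
    """Check whether there are x, y in X (not necessarily distinct)
    with x not in A[y] and y not in A[x]."""
    return any(x not in A[y] and y not in A[x] for x in X for y in X)

# Verify the instance k = 1, n = 3 (so n > k^2 + k = 2): every
# assignment x -> A_x with |A_x| <= k admits such a pair.
k, n = 1, 3
X = list(range(n))
small_sets = [set(c) for j in range(k + 1) for c in combinations(X, j)]
assert all(has_symmetric_pair(X, dict(zip(X, choice)))
           for choice in product(small_sets, repeat=n))
```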

Question. What is the optimal size of $n$ as a function of $k$?

It seems unlikely to me that my argument gives the optimal bound, since we found at least one element of the pair inside any subset of $X$ of size $k+1$, which is a stronger property than was requested. So it seems likely to me that the optimal bound is smaller.

# Every function can be computable!

I’d like to share a simple proof I’ve discovered recently of a surprising fact:  there is a universal algorithm, capable of computing any given function!

Wait, what? What on earth do I mean? Can’t we prove that some functions are not computable?  Yes, of course.

What I mean is that there is a universal algorithm, a Turing machine program capable of computing any desired function, if only one should run the program in the right universe. There is a Turing machine program $p$ with the property that for any function $f:\newcommand\N{\mathbb{N}}\N\to\N$ on the natural numbers, including non-computable functions, there is a model of arithmetic or set theory inside of which the function computed by $p$ agrees exactly with $f$ on all standard finite input. You have to run the program in a different universe in order that it will compute your desired function $f$.

Theorem. There is a Turing machine program $p$, carrying out the specific algorithm described in the proof, such that for any function $f:\N\to\N$, there is a model of arithmetic $M\models\PA$, or indeed a model of set theory $M\models\ZFC$ or more (if consistent), such that the function computed by program $p$ inside $M$ agrees exactly with $f$ on all standard finite input.

The proof is elementary, relying essentially only on the ideas of the classical proof of the Gödel-Rosser theorem. To review briefly, for any computably axiomatized theory $T$ extending $\PA$, there is a corresponding sentence $\rho$, called the Rosser sentence, which asserts, “for any proof of $\rho$ in $T$, there is a smaller proof of $\neg\rho$.” Here, by smaller, I mean that the Gödel code of the proof is smaller. One constructs the sentence $\rho$ by a simple application of the Gödel fixed-point lemma, just as one constructs the usual Gödel sentence that asserts its own non-provability. The basic classical facts concerning the Rosser sentence include the following:

• If $T$ is consistent, then so are both $T+\rho$ and $T+\neg\rho$.
• $\PA+\Con(T)$ proves $\rho$.
• The theories $T$, $T+\rho$ and $T+\neg\rho$ are equiconsistent.
• If $T$ is consistent, then $T+\rho$ does not prove $\Con(T)$.

The first statement is the essential assertion of the Gödel-Rosser theorem, and it is easy to prove: if $T$ is consistent and $T\proves\rho$, then the proof would be finite in the meta-theory, and so since $T$ would have to prove that there is a smaller proof of $\neg\rho$, that proof would also be finite in the meta-theory and hence an actual proof, contradicting the consistency of $T$. Similarly, if $T\proves\neg\rho$, then the proof would be finite in the meta-theory, and so $T$ would be able to verify that $\rho$ is true, and so $T\proves\rho$, again contradicting consistency. By internalizing the previous arguments to $\PA$, we see that $\PA+\Con(T)$ will prove that neither $\rho$ nor $\neg\rho$ is provable in $T$, making $\rho$ vacuously true in this case and also establishing $\Con(T+\rho)$ and $\Con(T+\neg\rho)$, for the second and third statements. In particular, $T+\Con(T)\proves\Con(T+\rho)$, which implies that $T+\rho$ does not prove $\Con(T)$ by the incompleteness theorem applied to the theory $T+\rho$, for the fourth statement.

Let’s now proceed to the proof of the theorem. To begin, we construct what I call the Rosser tree over a c.e. theory $T$. Namely, we recursively define theories $R_s$ for each finite binary string $s\in 2^{{<}\omega}$, placing the initial theory $R_{\emptyset}=T$ at the root, and then recursively adding either the Rosser sentence $\rho_s$ for the theory $R_s$ or its negation $\neg\rho_s$ at each stage to form the theories at the next level of the tree.
$$R_{s\concat 1}=R_s+\rho_s$$
$$R_{s\concat 0}=R_s+\neg\rho_s$$
Each theory $R_s$ is therefore a finite extension of $T$ by successively adding the appropriate Rosser sentences or their negations in the pattern described by $s$. If the initial theory $T$ is consistent, then it follows by induction using the Gödel-Rosser theorem that all the theories $R_s$ in the Rosser tree are consistent. Extending our notation to the branches through the tree, if $f\in{}^\omega 2$ is an infinite binary sequence, we let $R_f=\bigcup_n R_{f\upharpoonright n}$ be the union of the theories arising along that branch of the Rosser tree. In this way, we have constructed a perfect set of continuum many distinct consistent theories.

I shall now describe a universal algorithm for the case of computing binary functions. Consider the Rosser tree over the theory $T=\PA+\neg\Con(\PA)$. This is a consistent theory that happens to prove its own inconsistency. Considering the Gödel codes in order, the algorithm begins by searching for a proof of the Rosser sentence $\rho_{\emptyset}$ or its negation in the initial theory $R_{\emptyset}$. If such a proof is ever found, then the algorithm outputs $0$ or $1$ on input $0$, depending respectively on whether it was the Rosser sentence or its negation that was found first, and moves to the next theory in the Rosser tree by adding the opposite statement to the current theory. Then, it starts searching for a proof of the Rosser sentence of that theory or its negation. At each stage of the algorithm, there is a current theory $R_s$, depending on which prior proofs have been found, and the algorithm searches for a proof of $\rho_s$ or $\neg\rho_s$. If one is found, it outputs $0$ or $1$ accordingly (on input $n=|s|$), and moves to the next theory in the Rosser tree by adding the opposite statement to the current theory.
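Since no actual proof search over these theories can be carried out here, the following Python sketch (entirely illustrative, with names of my own invention) replaces the proof search with a stub oracle and exhibits only the bookkeeping of the algorithm: whichever proof is found first determines the output bit, and the opposite sentence is added, moving one level deeper into the Rosser tree.

```python
def universal_run(first_found, steps):
    """Bookkeeping of the universal algorithm.  The stub first_found(s)
    stands in for the proof search over the theory R_s: it returns 'rho'
    if a proof of the Rosser sentence rho_s would be found first, and
    'neg' if a proof of its negation would be found first."""
    s, output = '', []
    for _ in range(steps):
        if first_found(s) == 'rho':
            output.append(0)  # proof of rho_s found first: output 0 and
            s += '0'          # add ~rho_s, moving to the theory R_{s0}
        else:
            output.append(1)  # proof of ~rho_s found first: output 1 and
            s += '1'          # add rho_s, moving to the theory R_{s1}
    return output

# Inside a model of R_f, the proofs appear in just the right order:
# f(n) = 1 means rho_s holds there, so a smaller proof of ~rho_s is
# found first.  Simulating that oracle recovers f exactly.
f = [1, 0, 1, 1, 0]
assert universal_run(lambda s: 'neg' if f[len(s)] == 1 else 'rho', 5) == f
```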

If $f:\N\to 2=\{0,1\}$ is any binary function on the natural numbers, then let $R_f$ be the theory arising from the corresponding path through the Rosser tree, and let $M\models R_f$ be a model of this theory. I claim that the universal algorithm I just described will compute exactly $f(n)$ on input $n$ inside this model. The thing to notice is that because $\neg\Con(\PA)$ was part of the initial theory, the model $M$ will think that all the theories in the Rosser tree are inconsistent. So the model will have plenty of proofs of every statement and its negation for any theory in the Rosser tree, and so in particular, the function computed by $p$ in $M$ will be a total function. The question is which proofs will come first at each stage, affecting the values of the function. Let $s=f\upharpoonright n$ and notice that $R_s$ is true in $M$. Suppose inductively that the function computed by $p$ has worked correctly below $n$ in $M$, and consider stage $n$ of the procedure. By induction, the current theory will be exactly $R_s$, and the algorithm will be searching for a proof of $\rho_s$ or its negation in $R_s$. Notice that $f(n)=1$ just in case $\rho_s$ is true in $M$, and because of what $\rho_s$ asserts and the fact that $M$ thinks it is provable in $R_s$, it must be that there is a smaller proof of $\neg\rho_s$. So in this case, the algorithm will find the proof of $\neg\rho_s$ first, and therefore, according to the precise instructions of the algorithm, it will output $1$ on input $n$ and add $\rho_s$ (the opposite statement) to the current theory, moving to the theory $R_{s\concat 1}$ in the Rosser tree. Similarly, if $f(n)=0$, then $\neg\rho_s$ will be true in $M$, and the algorithm will therefore first find a proof of $\rho_s$, give output $0$ and add $\neg\rho_s$ to the current theory, moving to $R_{s\concat 0}$. In this way, the algorithm finds the proofs in exactly the right order so as to have $R_{f\upharpoonright n}$ as the current theory at stage $n$, and thereby it computes exactly the function $f$, as desired.

Basically, the theory $R_f$ asserts exactly that the proofs will be found in the right order in such a way that program $p$ will exactly compute $f$ on all standard finite input. So every binary function $f$ is computed by the algorithm in any model of the theory $R_f$.

Let me now explain how to extend the result to handle all functions $g:\N\to\N$, rather than only the binary functions as above. The idea is simply to modify the binary universal algorithm in a simple way. Any function $g:\N\to \N$ can be coded with a binary function $f:\N\to 2$ in a canonical way, for example, by having successive blocks of $1$s in $f$, separated by $0$s, with the $n^{\rm th}$ block of size $g(n)$. Let $q$ be the algorithm that runs the binary universal algorithm described above, thereby computing a binary sequence, and then extract from that binary sequence a corresponding function from $\N$ to $\N$ (this may fail, if for example, the binary sequence is finite or if it has only finitely many $0$s). Nevertheless, for any function $g:\N\to \N$ there is a binary function $f:\N\to 2$ coding it in the way we have described, and in any model $M\models R_f$, the binary universal algorithm will compute $f$, causing this adapted algorithm to compute exactly $g$ on all standard finite input, as desired.
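The block coding is easy to make concrete. Here is a sketch (function names mine) of the canonical coding just described, where the $n^{\rm th}$ block of $1$s has size $g(n)$ and the blocks are separated by $0$s, together with its decoding.

```python
def encode(values):
    """Code a finite initial segment of g as a binary string:
    blocks of 1s separated by 0s, the nth block of size g(n)."""
    return '0'.join('1' * v for v in values) + '0'

def decode(bits):
    """Recover the values from the binary string, assuming it is a
    well-formed code in which every block is terminated by a 0."""
    return [len(block) for block in bits.split('0')[:-1]]

g = [3, 0, 2, 1]
assert encode(g) == '1110011010'
assert decode(encode(g)) == g
```

As noted above, decoding an arbitrary infinite binary stream in this way can fail, for instance if the stream has only finitely many $0$s, but every $g:\N\to\N$ is faithfully coded by some binary stream of this form.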

Finally, let me describe how to extend the result to work with models of set theory, rather than models of arithmetic. Suppose that $\ZFC^+$ is a consistent c.e. extension of ZFC; perhaps it is ZFC itself, or ZFC plus some large cardinal axioms. Let $T=\ZFC^++\neg\Con(\ZFC^+)$ be a slightly stronger theory, which is also consistent, by the incompleteness theorem. Since $T$ interprets arithmetic, the theory of Rosser sentences applies, and so we may build the corresponding Rosser tree over $T$, and also we may undertake the binary universal algorithm using $T$ as the initial theory. If $f:\N\to 2$ is any binary function, then let $R_f$ be the theory arising on the corresponding branch through the Rosser tree, and suppose $M\models R_f$. This is a model of $\ZFC^+$, which also thinks that $\ZFC^+$ is inconsistent. So again, the universal algorithm will find plenty of proofs in this model, and as before, it will find the proofs in just the right order that the binary universal algorithm will compute exactly the function $f$. From this binary universal algorithm, one may again design an algorithm universal for all functions $g:\N\to\N$, as desired.

One can also get another kind of universality. Namely, there is a program $r$, such that for any finite $s\subset\N$, there is a model $M$ of $\PA$ (or $\ZFC$, etc.) such that inside the model $M$, the program $r$ will enumerate the set $s$ and nothing more. One can obtain such a program $r$ from the program $p$ of the theorem: just let $r$ run the universal binary program $p$ until a double $0$ is produced, and then interpret the finite binary string up to that point as the set $s$ to be enumerated.
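The extraction step can be illustrated similarly; since the exact interpretation of the prefix is left unspecified above, the sketch below (names and coding choice mine) hypothetically reads the set off as the positions of the $1$s appearing before the first double $0$.

```python
def extract_set(bits):
    """Run through the bit stream until two consecutive 0s appear, then
    interpret the prefix as a finite set -- hypothetically, as the
    positions of its 1s (the coding is left unspecified in the text)."""
    prefix, prev = [], None
    for b in bits:
        if prev == '0' and b == '0':
            break
        prefix.append(b)
        prev = b
    return {i for i, b in enumerate(prefix) if b == '1'}

assert extract_set('1011001') == {0, 2, 3}
```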

Let me now also discuss another form of universality.

Corollary. There is a program $p$, such that for any model $M\models\PA+\Con(\PA)$ and any function $f:M\to M$ that is definable in $M$, there is an end-extension of $M$ to a taller model $N\models\PA$ such that in $N$, the function computed by program $p$ agrees exactly with $f$ on input in $M$.

Proof. We simply apply the main theorem inside $M$. The point is that if $M$ thinks $\Con(\PA)$, then it can build what it thinks is the tree of Rosser extensions, and it will think that each step maintains consistency. So the theory $R_f$ that it constructs will be consistent in $M$ and therefore have a model (the Henkin model) definable in $M$, which will therefore be an end-extension of $M$.
QED

This last application has a clear affinity with a theorem of Woodin’s, recently extended by Rasmus Blanck and Ali Enayat. See Victoria Gitman’s posts about her seminar talk on those theorems: Computable processes can produce arbitrary outputs in nonstandard models, continuation.

Alternative proof.  Here is an alternative elegant proof of the theorem based on the comments below of Vadim Kosoy. Let $T$ be any consistent computably axiomatizable theory interpreting PA, such as PA itself or ZFC or what have you. For any Turing machine program $e$, let $q(e)$ be a program carrying out the following procedure: on input $n$, search systematically for a finite function $h:X\to\mathbb{N}$, with $X$ finite and $n\in X$, and for a proof of the statement “program $e$ does not agree with $h$ on all inputs in $X$,” using the function $h$ simply as a list of values for this assertion. For the first such function and proof that is found, if any, give as output the value $h(n)$.

Since the function $e\mapsto q(e)$ is computable, there is by Kleene’s recursion theorem a program $p$ for which $p$ and $q(p)$ compute the same function, and furthermore, $T$ proves this.  So the program $p$ is searching for proofs that $p$ itself does not behave in a certain way, and then it is behaving in that way when such a proof is found.

I claim that the theory $T$ does not actually prove any of those statements, “program $p$ does not agree with $h$ on inputs in $X$,” for any particular finite function $h:X\to\mathbb{N}$. If it did prove such a statement, then for the smallest such function and proof, the output of $p$ would indeed agree with $h$ on all inputs in $X$, by design. Thus, there would also be a proof that the program did agree with this particular $h$, and so $T$ would prove a contradiction, contrary to our assumption that it was consistent. So $T$ actually proves none of those statements. In particular, the program $p$ computes the empty function in the standard model of arithmetic. But also, for any particular finite function $h:X\to\mathbb{N}$, we may consistently add the assertion “program $p$ agrees with $h$ on inputs in $X$” to $T$, since $T$ did not refute this assertion.

For any function $f:\mathbb{N}\to\mathbb{N}$, let $T_f$ be the theory $T$ together with all assertions of the form “program $p$ halts on input $n$ with value $k$”, for the particular value $k=f(n)$.  I claim that this theory is consistent, for if it is not, then by compactness there would be finitely many of the assertions that enable the inconsistency, and so there would be a finite function $h:X\to\mathbb{N}$, with $h=f\upharpoonright X$, such that $T$ proved that the program $p$ does not agree with $h$ on inputs in $X$. But in the previous paragraph, we proved that this doesn’t happen. And so the theory $T_f$ is consistent.

Finally, note that in any model of $T_f$, the program $p$ computes the function $f$ on standard input, because these assertions are exactly made in the theory. QED

# Famous quotations in their original first-order language

Historians everywhere are shocked by the recent discovery that many of our greatest thinkers and poets had first expressed their thoughts and ideas in the language of first-order predicate logic, and sometimes modal logic, rather than in natural language. Some early indications of this were revealed in the pioneering historical research of Henle, Garfield and Tymoczko, in their work Sweet Reason:

We now know that the phenomenon is widespread!  As shown below, virtually all of our cultural leaders have first expressed themselves in the language of first-order predicate logic, before having been compromised by translations into the vernacular.

$\neg\lozenge\neg\exists s\ G(i,s)$

$(\exists x\ x=i)\vee\neg(\exists x\ x=i)$

$\left(\strut\neg\exists t\ \exists d\ \strut D(d)\wedge F(d)\wedge S_t(i,d)\right)\wedge\left(\strut\neg\exists t\ w\in_t \text{Ro}\right)\wedge\left(\strut \text{Ru}(i,y)\to \lozenge\text{C}(y,i,qb)\wedge \text{Ru}(i)\wedge\text{Ru}(i)\wedge\text{Ru}(i)\wedge\text{Ru}(i)\right)$

$\neg B_i \exists g\ G(g)$

$\forall b\ \left(\strut G(b)\wedge B(b)\to \exists x\ (D(b,x)\wedge F(x))\right)$

$(\exists!w\ W_1(w)\wedge W_2(w)), \ \ \exists w\ W_1(w)\wedge W_2(w)\wedge S(y,w)$?

$\exists s\ Y(s)\wedge S(s)\wedge \forall x\ L(x,s)$

$\exists p\ \left[\forall c\ (c\neq p\to G(c))\right]\wedge\neg G(p)$

$\exists l\ \left[L(l)\wedge \boxdot_l\left({}^\ulcorner\,\forall g\ \text{Gl}(g)\to \text{Gd}(g){}^\urcorner\right)\wedge\exists s\ \left(SH(s)\wedge B(l,s)\right)\right]$

$(\forall p\in P\ \exists c\in\text{Ch}\ c\in p)\wedge(\forall g\in G\ \exists c\in\text{Cr}\ c\in g)$

$\forall x (F(w,x)\to x=F)$

$B\wedge \forall x\ \left[S(x)\wedge T(x)\to \exists!w\ W(w)\wedge\text{Gy}(x,w)\wedge\text{Gi}(x,w)\right]$

$\exists!x\ D(x)\wedge D(\ {}^\ulcorner G(i){}^\urcorner\ )$

$\forall f\ \forall g\ \left(\strut H(f)\wedge H(g)\to f\sim g\right)\wedge\forall f\ \forall g\ \left(\strut\neg H(f)\wedge \neg H(g)\to \neg\ f\sim g\right)$

$\exists w\ \left(\strut O(w)\wedge W(w)\wedge\exists s\ (S(s)\wedge L(w,s))\right)$

$C(i)\to \exists x\ x=i$

$\neg\neg\left(\strut H(y)\wedge D(y)\right)$

$\neg (d\in K)\wedge\neg (t\in K)$

$W(i,y)\wedge N(i,y)\wedge\neg\neg\lozenge L(i,y)\wedge \left(\strut \neg\ \frac23<0\to\neg S(y)\right)$

$\lozenge \text{CL}(i)\wedge\lozenge C(i)\wedge \lozenge (\exists x\ x=i)\wedge B(i)$

$\forall x\ K_x({}^\ulcorner \forall m\ \left[M(m)\wedge S(m)\wedge F(m)\to\boxdot\ \exists w\ M(m,w)\right]{}^\urcorner)$

$\forall e\forall h\ \left(\strut G(e)\wedge E(e)\wedge H(h)\to \neg L(i,e,h)\right)$

$\forall p\ \boxdot\text{St}(p)$

$\lozenge^w_i\ \forall g\in G\ \lozenge (g\in C)$

$\forall m\ (a\leq_C m)$

$\forall t\ (p\geq t)\wedge \forall t\ (p\leq t)$

$\forall x\ (F(x)\iff x=h)$

$(\forall x\ \forall y\ x=y)\wedge(\exists x\ \exists y ([\![x=x]\!]>[\![y=y]\!]))$

$\forall p\ \left(\strut\neg W(p)\to \neg S(p)\right)$

$\forall p \left(\strut E(p)\to \forall h\in H\ A(p,h)\right)$

Dear readers, in order to assist with this important historical work, please provide, in the comment section below, translations into ordinary English of any or all of the assertions listed above. We want to make sure that all our assertions and translations are accurate.

In addition, any readers who have any knowledge of additional instances of famous quotations that were actually first made in the language of first-order predicate logic (or similar) are encouraged to post comments below detailing their knowledge. I will endeavor to add such additional examples to the list.

Thanks to Philip Welch, to my brother Jonathan, and to Ali Sadegh Daghighi (in the comments) for providing some of the examples, and to Timothy Gowers for some improvements.

# Diamond on the ordinals

I was recently surprised to discover that if there is a definable well-ordering of the universe, then the diamond principle on the ordinals holds for definable classes, automatically. In fact, the diamond principle for definable classes is simply equivalent in ZFC to the existence of a definable well-ordering of the universe. It follows as a consequence that the diamond principle for definable classes, although seeming to be fundamentally scheme-theoretic, is actually expressible in the first-order language of set theory.

In set theory, the diamond principle asserts the existence of a sequence of objects $A_\alpha$, of growing size, such that any large object is very often anticipated by these approximations.  In the case of diamond on the ordinals, what we will have is a definable sequence of sets $A_\alpha\subseteq\alpha$, such that for any definable class of ordinals $A$ and any definable class club $C$, there are ordinals $\theta\in C$ with $A\cap\theta=A_\theta$.  This kind of principle typically allows one to undertake long constructions that diagonalize against all the large objects, by considering and reacting to their approximations $A_\alpha$. Since every large object $A$ is often correctly approximated in this way, many such constructions can succeed.


Theorem. In $\ZFC$, if there is a definable well-ordering of the universe, definable from some parameter $p$, then $\Diamond_{\Ord}$ holds for definable classes. That is, there is a $p$-definable sequence $\langle A_\alpha\mid\alpha<\Ord\rangle$, such that for any definable class $A\of\Ord$ and any definable closed unbounded class of ordinals $C\of\Ord$ (allowing parameters), there is some $\theta\in C$ with $A\cap\theta=A_\theta$.

Proof. The theorem is proved as a theorem scheme; namely, I shall provide a specific definition of the sequence $\vec A=\langle A_\alpha\mid\alpha<\Ord\rangle$, using the same parameter $p$ as the definition of the global well-order and with a definition of closely related syntactic complexity, and then prove, as a scheme, a separate statement for each definable class $A\of\Ord$ and class club $C\of\Ord$, asserting that there is some $\alpha\in C$ with $A\cap\alpha=A_\alpha$. The definitions of the classes $A$ and $C$ may involve parameters and have arbitrary complexity.

Let $\lhd$ be the definable well-ordering of the universe, definable by a specific formula using some parameter $p$. I define the $\Diamond_{\Ord}$-sequence $\vec A=\langle A_\alpha\mid\alpha<\Ord\rangle$ by transfinite recursion. Suppose that $\vec A\restrict\theta$ has been defined. I shall let $A_\theta=\emptyset$ unless $\theta$ is a $\beth$-fixed point above the rank of $p$ and there is a set $A\of\theta$ and a closed unbounded set $C\of\theta$, with both $A$ and $C$ definable in the structure $\langle V_\theta,\in\rangle$ (allowing parameters), such that $A\cap\alpha\neq A_\alpha$ for every $\alpha\in C$. In this case, I choose the least such pair $(A,C)$, minimizing first on the maximum of the logical complexities of the definitions of $A$ and of $C$, and then minimizing on the total length of the defining formulas of $A$ and $C$, and then minimizing on the Gödel codes of those formulas, and finally on the parameters used in the definitions, using the well-order $\lhd\restrict V_\theta$. For this minimal pair, let $A_\theta=A$. This completes the definition of the sequence $\vec A=\langle A_\alpha\mid\alpha\in\Ord\rangle$.
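In summary, the recursion can be displayed schematically as follows (this merely restates the definition just given):
$$A_\theta=\begin{cases}
A, & \text{if $\theta$ is a $\beth$-fixed point above the rank of $p$, and $(A,C)$ is the minimal pair with $A\subseteq\theta$, club $C\subseteq\theta$, both definable over $\langle V_\theta,\in\rangle$, and $A\cap\alpha\neq A_\alpha$ for all $\alpha\in C$;}\\
\emptyset, & \text{otherwise.}
\end{cases}$$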

Let me remark on a subtle point, since the meta-mathematical issues loom large here. The definition of $\vec A$ is internal to the model, and at stage $\theta$ we ask about subsets of $\theta$ definable in $\langle V_\theta,\in\rangle$, using the truth predicate for this structure. If we were to run this definition inside an $\omega$-nonstandard model, it could happen that the minimal formula we get is nonstandard, and in this case, the set $A$ would not actually be definable by a standard formula. Also, even when $A$ is definable by a standard formula, it might be paired (with some constants) with a club set $C$ that is defined only by a nonstandard formula (and this is why we minimize on the maximum of the complexities of the definitions of $A$ and $C$ together). So one must take care in the main argument to keep straight the distinction between the meta-theoretic natural numbers and the internal natural numbers of the object theory $\ZFC$.

Let me now prove that the sequence $\vec A$ is indeed a $\Diamond_{\Ord}$-sequence for definable classes. The argument follows in spirit the classical proof of $\Diamond$ in $L$, subject to the meta-mathematical issues I mentioned. If the sequence $\vec A$ is not a diamond sequence, then there is some definable class $A\of\Ord$, defined in $\langle V,\in\rangle$ by a specific formula $\varphi$ and parameter $z$, and definable club $C\of\Ord$, defined by some $\psi$ and parameter $y$, with $A\cap\alpha\neq A_\alpha$ for every $\alpha\in C$. We may assume without loss that these formulas are chosen so as to be minimal in the sense of the construction, so that the maximum of the complexities of $\varphi$ and $\psi$ is as small as possible, and the lengths of the formulas, the Gödel codes, and finally the parameters $z,y$ are $\lhd$-minimal, respectively, successively. Let $m$ be a sufficiently large natural number, larger than the complexity of the definitions of $\lhd$, $A$, $C$, and large enough so that the minimality condition we just discussed is expressible by a $\Sigma_m$ formula. Let $\theta$ be any $\Sigma_m$-correct ordinal above the ranks of the parameters used in the definitions. It follows that the restrictions $\lhd\restrict V_\theta$ and also $A\cap\theta$ and $C\cap\theta$ and $\vec A\restrict\theta$ are definable in $\langle V_\theta,\in\rangle$ by the same definitions and parameters as their counterparts in $V$, that $C\cap\theta$ is club in $\theta$, and that $A\cap\theta$ and $C\cap\theta$ form a minimal pair using those definitions with $A\cap\alpha\neq A_\alpha$ for any $\alpha\in C\cap\theta$. Thus, by the definition of $\vec A$, it follows that $A_\theta=A\cap\theta$. Since $C\cap\theta$ is unbounded in $\theta$ and $C$ is closed, it follows that $\theta\in C$, and so $A_\theta=A\cap\theta$ contradicts our assumption about $A$ and $C$.
So there are no such counterexample classes, and thus $\vec A$ is a $\Diamond_{\Ord}$-sequence with respect to definable classes, as claimed.
QED

Theorem. The following are equivalent over $\ZFC$.

1. There is a definable well-ordering of the universe, using some set parameter $p$.
2. $V=\HOD_{\{p\}}$, for some set $p$.
3. $\Diamond_{\Ord}$ holds for definable classes. That is, there is a set parameter $p$ and a definable sequence $\vec A=\langle A_\alpha\mid\alpha<\Ord\rangle$, such that for any definable class $A\of\Ord$ and definable class club $C\of\Ord$, there is some $\alpha\in C$ with $A\cap\alpha=A_\alpha$.

Proof. Let me first give the argument, and then afterward discuss the formalization, which involves some subtle issues.

($1\to 2$) $\newcommand\rank{\text{rank}}$Suppose that $\lhd$ is a $p$-definable well-ordering of $V$, which means that every nonempty set has a $\lhd$-minimal element. Let us refine this order by defining $x\lhd' y$, just in case $\rank(x)<\rank(y)$ or $\rank(x)=\rank(y)$ and $x\lhd y$. The new order is also a well-order, which now respects rank. In particular, the order $\lhd'$ is set-like, and so every object $x$ is the $\alpha^{th}$ element with respect to the $\lhd'$-order, for some ordinal $\alpha$. Thus, every object is definable from $p$ and an ordinal, and so $V=\HOD_{\{p\}}$, as desired.

($2\to 1$) If $V=\HOD_{\{p\}}$, then we have the canonical well-order of $\HOD$ using parameter $p$, similar to how one shows that the axiom of choice holds in $\HOD$. Namely, define $x\lhd y$ if and only if $\rank(x)<\rank(y)$, or the ranks are the same, but $x$ is definable from $p$ and ordinal parameters in some $V_\theta$ with a smaller $\theta$ than $y$ is, or the ranks are the same and the $\theta$ is the same, but $x$ is definable in that $V_\theta$ by a formula with a smaller Gödel code, or with the same formula but smaller ordinal parameters. It is easy to see that this is a $p$-definable well-ordering of the universe.

($1\to 3$) This is the content of the theorem above.

($3\to 1$) If $\vec A$ is a $p$-definable $\Diamond_{\Ord}$-sequence for definable classes, then it is easy to see that if $A$ is a set of ordinals, then $A$ must arise as $A_\alpha$ for unboundedly many $\alpha$. In $\ZFC$, using the axiom of choice, it is a standard fact that every set is coded by a set of ordinals. So let us define that $x\lhd y$, just in case $x$ is coded by a set of ordinals that appears earlier on $\vec A$ than any set of ordinals coding $y$. This is clearly a well-ordering, since the map sending $x$ to the ordinal $\alpha$ for which $A_\alpha$ codes $x$ is an $\Ord$-ranking of $\lhd$. So there is a $p$-definable well-ordering of the universe.
QED

An observant reader will notice some meta-mathematical issues concerning the previous theorem. The issue is that statements 1 and 2 are known to be expressible by statements in the first-order language of set theory, as single statements, but for statement 3 we have previously expressed it only as a scheme of first-order statements. So how can they be equivalent? The answer is that the full scheme-theoretic content of statement 3 follows already from instances in which the complexity of the definitions of $A$ and $C$ are bounded. Basically, once one gets the global well-order, then one can construct a $\Diamond_{\Ord}$-sequence that works for all definable classes. In this sense, we may regard the diamond principle $\Diamond_{\Ord}$ for definable classes as not really a scheme of statements, but rather equivalent to a single first-order assertion.

Lastly, let me consider the content of the theorems in Gödel-Bernays set theory or Kelley-Morse set theory. Of course, we know that there can be models of these theories that do not have $\Diamond_{\Ord}$ in the full second-order sense. For example, it is relatively consistent with ZFC that an inaccessible cardinal $\kappa$ does not have $\Diamond_\kappa$, and in this case, the structure $\langle V_\kappa,\in,V_{\kappa+1}\rangle$ will satisfy GBC and even KM, but it won’t have $\Diamond_{\Ord}$ with respect to all classes, even though it has a definable well-ordering of the universe (since there is such a well-ordering in $V_{\kappa+1}$). But meanwhile, there will be a $\Diamond_{\Ord}$-sequence that works with respect to classes that are definable from that well-ordering and parameters, simply by following the construction given in the theorem.

This leads to several extremely interesting questions, about which I am currently thinking, concerning instances where we can have $\Diamond_\kappa$ for definable classes in $V_\kappa$, even when the full $\Diamond_\kappa$ fails. Stay tuned!

# Set-theoretic mereology: the theory of the subset relation is decidable


To give a few examples, if $\newcommand\HF{\text{HF}}\HF$ is the set of hereditarily finite sets, then $\langle\HF,\of\rangle$, using the usual subset relation, is an unbounded atomic locally Boolean lattice. More generally, if $V$ is any model of set theory (even a very weak theory is sufficient), then $\langle V,\of\rangle$ is an unbounded atomic locally Boolean lattice.

I should like to prove here that the theory of unbounded atomic locally Boolean lattice orders is decidable, and furthermore admits elimination of quantifiers down to the language including the Boolean operations and the relations expressing the height or size of an object, $|x|\geq n$ and $|x|=n$.

Theorem. Every formula in the language of lattices is equivalent, over the theory of unbounded atomic locally Boolean lattices, to a quantifier-free formula in the language of the order $a\of b$, equality $a=b$, meet $a\intersect b$, join $a\union b$, relative complement $a-b$, constant $0$, the unary relation $|x|\geq n$, and the unary relation $|x|=n$, for each natural number $n$.

Proof. We prove the result by induction on formulas. The collection of formulas equivalent to a quantifier-free formula in that language clearly includes all atomic formulas and is closed under Boolean combinations. So it suffices to eliminate the quantifier in a formula of the form $\exists x\, \varphi(x,\ldots)$, where $\varphi(x,\ldots)$ is quantifier-free in that language. Let us make a number of observations that will enable various simplifying assumptions about the form of $\varphi$.

Because equality of terms is expressible by the identity $a=b\iff a\of b\of a$, we do not actually need $=$ in the language (and here I refer to the use of equality in atomic formulas of the form $s=t$ where $s$ and $t$ are terms, and not to the incidental appearance of the symbol $=$ in the unary predicate $|x|=n$, which is an unrelated use of this symbol, a mere stylistic flourish). Similarly, in light of the equivalence $a\of b\iff |a-b|=0$, we do not need to make explicit reference to the order $a\of b$. So we may assume that all atomic assertions in $\varphi$ have the form $|t|\geq n$ or $|t|=n$ for some term $t$ in the language of meet, join, relative complement and $0$. We may omit the need for explicit negation in the formula by systematically applying the equivalences:
$$\neg(|t|\geq n)\iff \bigvee_{k<n}|t|=k\quad\text{ and}$$
$$\neg(|t|=n)\iff (|t|\geq n+1)\vee\bigvee_{k<n}|t|=k.$$
So we have reduced to the case where $\varphi$ is a positive Boolean combination of expressions of the form $|t|\geq n$ and $|t|=n$.
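The two negation-rewriting equivalences are easy to machine-check on finite sizes. Here is a small illustrative sketch (Python; the tuple representation and the helper names `holds`, `negate`, `check` are my own, not anything from the text):

```python
# Size assertions about a term t, represented as tuples:
#   ("geq", n) means |t| >= n,   ("eq", n) means |t| = n.

def holds(assertion, size):
    """Evaluate a size assertion when the term has the given finite size."""
    kind, n = assertion
    return size >= n if kind == "geq" else size == n

def negate(assertion):
    """Rewrite the negation of a size assertion as a positive disjunction,
    following the two displayed equivalences."""
    kind, n = assertion
    if kind == "geq":                 # not(|t| >= n)  iff  |t| = k for some k < n
        return [("eq", k) for k in range(n)]
    else:                             # not(|t| = n)  iff  |t| >= n+1, or |t| = k for some k < n
        return [("geq", n + 1)] + [("eq", k) for k in range(n)]

def check(assertion, max_size=20):
    """Brute-force confirm the rewriting on all finite sizes up to max_size."""
    return all((not holds(assertion, m)) == any(holds(a, m) for a in negate(assertion))
               for m in range(max_size + 1))
```

Of course this only tests finitely many sizes, but since each assertion depends only on whether the size meets fixed finite thresholds, the finite check is representative.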

Let us consider the form of the terms $t$ that may arise in the formula. List all the variables $x=x_0,x_1,\ldots,x_N$ that arise in any of the terms appearing in $\varphi$, and consider the Venn diagram corresponding to these variables. The cells of this Venn diagram can each be described by a term of the form $\bigwedge_{i\leq N} \pm x_i$, which I shall refer to as a cell term, where $\pm x_i$ means that either $x_i$ appears or else we have subtracted $x_i$ from the other variables. Since we have only relative complements in a locally Boolean lattice, however, and not absolute complements, we need only consider the cells where at least one variable appears positively, since the exterior region in the Venn diagram is not actually represented by any term. In this way, every term in the language of locally Boolean lattices is a finite union of such cell terms, plus $\emptyset$ (which I suppose can be viewed as an empty union). Note that distinct cell terms are definitely representing disjoint objects in the lattice.

Next, by considering the possible sizes of $s-t$, $s\intersect t$ and $t-s$, observe that
$$|s\union t|\geq n\iff \bigvee_{i+j+k=n}(|s-t|\geq i)\wedge(|s\intersect t|\geq j)\wedge(|t-s|\geq k).$$
Through repeated application of this, we may reduce any assertion about $|t|$ for a term to a Boolean combination of assertions about cell terms. (Note that size assertions about $\emptyset$ are trivially settled by the theory and can be eliminated.)
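The underlying counting fact, that $|s\cup t|\geq n$ holds just in case the sizes of the three disjoint pieces $s-t$, $s\cap t$, $t-s$ admit a decomposition $i+j+k=n$, can be confirmed exhaustively on small finite sets. A brute-force sketch (Python, purely illustrative; the helper names are mine):

```python
from itertools import product

def union_geq_via_pieces(s, t, n):
    """|s ∪ t| >= n, decomposed as a disjunction over i+j+k = n of
    (|s - t| >= i) and (|s ∩ t| >= j) and (|t - s| >= k)."""
    return any(len(s - t) >= i and len(s & t) >= j and len(t - s) >= k
               for i in range(n + 1)
               for j in range(n + 1 - i)
               for k in [n - i - j])

def subsets(xs):
    """All subsets of a finite list, as frozensets."""
    xs = list(xs)
    return [frozenset(x for x, keep in zip(xs, bits) if keep)
            for bits in product([0, 1], repeat=len(xs))]

# Exhaustively compare the decomposition against the direct meaning.
universe = range(5)
ok = all(union_geq_via_pieces(s, t, n) == (len(s | t) >= n)
         for s in subsets(universe) for t in subsets(universe) for n in range(8))
```

The check over a 5-element universe is of course no proof, but the argument in the text is exactly the counting observation that the code enumerates.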

Let us now focus on the quantified variable $x$ separately from the other variables, for it may appear either positively or negatively in such a cell term. More precisely, each cell term in the variables $x=x_0,x_1,\ldots,x_N$ is equivalent to $x\intersect c$ or $c-x$, for some cell term $c$ in the variables $x_1,\ldots,x_N$, that is, not including $x$, or to the term $x-(x_1\union\cdots\union x_N)$, which is the cell term for which $x$ is the only positive variable.

We have reduced the problem to the case where we want to eliminate the quantifier from $\exists x\, \varphi$, where $\varphi$ is a positive Boolean combination of size assertions about cell terms. We may express $\varphi$ in disjunctive normal form and then distribute the quantifier over the disjuncts to reduce to the case where $\varphi$ is a conjunction of size assertions about cell terms. Each cell term has the form $x\intersect c$ or $c-x$ or $x-(x_1\union\cdots\union x_N)$, where $c$ is a cell term in the list of variables without $x$. Group together the conjuncts of $\varphi$ that use the same cell term $c$ in this way. The point now is that assertions about whether there is an object $x$ in the lattice such that certain cell terms obey various size requirements amount to the conjunction of various size requirements about cells in the variables not including $x$. For example, the assertion $$\exists x\,(|x\intersect c|\geq 3)\wedge(|x\intersect c|\geq 7)\wedge(|c-x|=2)$$ is equivalent (over the theory of unbounded atomic locally Boolean lattices) to the assertion $|c|\geq 9$, since we may simply let $x$ be all but $2$ atoms of $c$, and this will have size at least $7$, which is also at least $3$. If contradictory assertions are made, such as $\exists x\, (|x\intersect c|\geq 5\wedge |x\intersect c|=3)$, then the whole formula is equivalent to $\perp$, which can be expressed without quantifiers as $0\neq 0$.

Next, the key observation of the proof is that assertions about the existence of such $x$ for different cell terms in the variables not including $x$ will succeed or fail independently, since those cell terms are representing disjoint elements of the lattice, and so one may take the final witnessing $x$ to be the union of the witnesses for each piece. So to eliminate the quantifier, we simply group together the atomic assertions being made about the cell terms in the variables without $x$, and then express the existence assertion as a size requirement on those cell terms. For example, the assertion $$\exists x\, (|c\intersect x|\geq 5)\wedge(|c-x|=6)\wedge (|d\intersect x|\geq 7),$$ where $c$ and $d$ are distinct cell terms, is equivalent to $$(|c|\geq 11)\wedge(|d|\geq 7),$$ since $c$ and $d$ are disjoint and so we may let $x$ be the appropriate part of $c$ and a suitable piece of $d$. The only remaining complication concerns instances of the term $x-(x_1\union\cdots\union x_N)$. But for these, the thing to notice is that any single positive size assertion about this term is realizable in our theory, since we have assumed that the lattice is unbounded, and so there will always be as many atoms as desired disjoint from any finite list of objects. But we must again pay attention to whether the requirements expressed by distinct clauses are contradictory.
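The single-cell elimination step is easy to sanity-check by brute force: for a cell $c$ of finite size $m$, one can simply search over all possible values of $|x\cap c|$ (so that $|c-x|=m-|x\cap c|$) and compare against the claimed quantifier-free equivalent. A small sketch (Python, illustrative only; `exists_x` is a hypothetical helper, not anything from the text):

```python
def exists_x(m, inside, outside):
    """Given |c| = m, decide whether some x exists meeting the constraints.
    Constraints are tuples ("geq", n) or ("eq", n); 'inside' constrains
    |x ∩ c| and 'outside' constrains |c - x|.  We brute-force over
    j = |x ∩ c|, which forces |c - x| = m - j."""
    def sat(constraints, v):
        return all(v >= n if kind == "geq" else v == n for kind, n in constraints)
    return any(sat(inside, j) and sat(outside, m - j) for j in range(m + 1))

# The worked example from the text:
#   ∃x (|x∩c| >= 3) ∧ (|x∩c| >= 7) ∧ (|c-x| = 2)   should hold iff |c| >= 9.
results = [exists_x(m, [("geq", 3), ("geq", 7)], [("eq", 2)]) == (m >= 9)
           for m in range(20)]
```

The brute-force search plays the role of the witnessing $x$ in the proof: choosing $j$ atoms of $c$ to put inside $x$ is the only degree of freedom that the size assertions can detect.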

Altogether, I have provided a procedure for eliminating quantifiers from any assertion in the language of locally Boolean lattices, down to the language augmented by unary predicates expressing the size of an object. This procedure works in any unbounded atomic locally Boolean lattice, and so the theorem is proved. QED

Corollary. The theory of unbounded atomic locally Boolean lattices is complete.

Proof. Every sentence in this theory is equivalent by the procedure to a quantifier-free sentence in the stated language. But since such sentences have no variables, they must simply be a Boolean combination of trivial size assertions about $0$, such as $(|0|\geq 2)\vee \neg(|0|=5)$, whose truth value is settled by the theory. QED

Corollary. The structure of hereditarily finite sets $\langle\HF,\of\rangle$ is an elementary substructure of the entire set-theoretic universe $\langle V,\of\rangle$, with the inclusion relation.

Proof. These structures are both unbounded atomic locally Boolean lattices, and so they each support the quantifier-elimination procedure. But they agree on the truth of any quantifier-free assertion about the sizes of hereditarily finite sets, and so they must agree on all assertions about objects in $\HF$. QED

Corollary. The structure $\langle V,\of\rangle$ has a decidable theory. The structure $\langle\HF,\of\rangle$ has a decidable elementary diagram, and hence a computably decidable presentation.

Proof. The theory is the theory of unbounded atomic locally Boolean lattices. Since the structure $\langle\HF,\of\rangle$ has a computable presentation via the Ackermann coding of hereditarily finite sets, for which the subset relation and the size relations are computable, it follows that we may also compute the truth of any formula by first reducing it to a quantifier-free assertion of those types. So this is a computably decidable presentation. QED
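For concreteness, the Ackermann coding used in this proof is simple to implement. The sketch below (Python; hereditarily finite sets represented as nested frozensets) codes a set $S$ by $\sum_{a\in S}2^{\mathrm{code}(a)}$, under which membership and inclusion become bitwise operations on the codes:

```python
def ack(s):
    """Ackermann code of a hereditarily finite set:
    code(S) = sum of 2^code(a) for a in S."""
    return sum(2 ** ack(a) for a in s)

def member(a, b):
    """a ∈ b, computed on Ackermann codes: test bit a of b."""
    return (b >> a) & 1 == 1

def subset(a, b):
    """a ⊆ b, computed on Ackermann codes: every bit of a is a bit of b."""
    return a & b == a
```

For example, $\emptyset$ gets code $0$, $\{\emptyset\}$ gets code $1$, and $\{\emptyset,\{\emptyset\}\}$ gets code $3$; the computability of $\subseteq$ and of the size relations $|x|\geq n$, $|x|=n$ (the bit count of a code) is what the proof requires.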

# Being HOD-of-a-set is invariant throughout the generic multiverse

$\newcommand\HOD{\text{HOD}}$The axiom $V=\HOD$, introduced by Gödel, asserts that every set is ordinal definable. This axiom has a subtler foundational aspect than might at first be expected. The reason is that the general concept of “object $x$ is definable using parameter $p$” is not in general first-order expressible in set theory; it is of course a second-order property, which makes sense only relative to a truth predicate, and by Tarski’s theorem, we can have no first-order definable truth predicate. Thus, the phrase “definable using ordinal parameters” is not directly meaningful in the first-order language of set theory without further qualification or explanation. Fortunately, however, it is a remarkable fact that when we allow definitions to use arbitrary ordinal parameters, as we do with $\HOD$, then we can in fact make such qualifications in such a way that the axiom becomes first-order expressible in set theory. Specifically, we say officially that $V=\HOD$ holds, if for every set $x$, there is an ordinal $\theta$ with $x\in V_\theta$, for which $x$ is definable by some formula $\psi(x)$ in the structure $\langle V_\theta,{\in}\rangle$ using ordinal parameters. Since $V_\theta$ is a set, we may freely make reference to first-order truth in $V_\theta$ without requiring any truth predicate in $V$. Certainly any such $x$ as this is also ordinal-definable in $V$, since we may use $\theta$ and the Gödel-code of $\psi$ also as parameters, and note that $x$ is the unique object such that it is in $V_\theta$ and satisfies $\psi$ in $V_\theta$. (Note that inside an $\omega$-nonstandard model of set theory, we may really need to use $\psi$ as a parameter, since it may be nonstandard, and $x$ may not be definable in $V_\theta$ using a meta-theoretically standard natural number; but fortunately, the Gödel code of a formula is an integer, which is still an ordinal, and this is the key to resolving the issue.)
Conversely, if $x$ is definable in $V$ using formula $\varphi(x,\vec\alpha)$ with ordinal parameters $\vec\alpha$, then it follows by the reflection theorem that $x$ is defined by $\varphi(x,\vec\alpha)$ inside some $V_\theta$. So this formulation of $V=\HOD$ is expressible and exactly captures the desired second-order property that every set is ordinal-definable.

Consider next the axiom $V=\HOD(b)$, asserting that every set is definable from ordinal parameters and parameter $b$. Officially, as before, $V=\HOD(b)$ asserts that for every $x$, there is an ordinal $\theta$, formula $\psi$ and ordinals $\vec \alpha<\theta$, such that $x$ is the unique object in $V_\theta$ for which $\langle V_\theta,{\in}\rangle\models\psi(x,\vec\alpha,b)$, and the reflection argument shows again that this way of defining the axiom exactly captures the intended idea.

The axiom I actually want to focus on is $\exists b\,\left( V=\HOD(b)\right)$, asserting that the universe is $\HOD$ of a set. (I assume ZFC in the background theory.) It turns out that this axiom is constant throughout the generic multiverse.

Theorem. The assertion $\exists b\, (V=\HOD(b))$ is forcing invariant.

• If it holds in $V$, then it continues to hold in every set forcing extension of $V$.
• If it holds in $V$, then it holds in every ground of $V$.

Thus, the truth of this axiom is invariant throughout the generic multiverse.

Proof. Suppose that $\text{ZFC}$ holds and $V=\HOD(b)$, and that $V[G]$ is a forcing extension of $V$ by a generic filter $G\subset\mathbb{P}\in V$. By the ground-model definability theorem, it follows that $V$ is definable in $V[G]$ from the parameter $P(\mathbb{P})^V$. Thus, using this parameter, as well as $b$ and additional ordinal parameters, we can define in $V[G]$ any particular object in $V$. Since this includes all the $\mathbb{P}$-names used to form $V[G]$, it follows that $V[G]=\HOD(b,P(\mathbb{P})^V,G)$, and so $V[G]$ is $\HOD$ of a set, as desired.

Conversely, suppose that $W$ is a ground of $V$, so that $V=W[G]$ for some $W$-generic filter $G\subset\mathbb{P}\in W$, and $V=\HOD(b)$ for some set $b$. Let $\dot b$ be a name for which $\dot b_G=b$. Every object $x\in W$ is definable in $W[G]$ from $b$ and ordinal parameters $\vec\alpha$, so there is some formula $\psi$ for which $x$ is unique such that $\psi(x,b,\vec\alpha)$. Thus, there is some condition $p\in\mathbb{P}$ such that $x$ is unique such that $p\Vdash\psi(\check x,\dot b,\check{\vec\alpha})$. If $\langle p_\beta\mid\beta<|\mathbb{P}|\rangle$ is a fixed enumeration of $\mathbb{P}$ in $W$, then $p=p_\beta$ for some ordinal $\beta$, and we may therefore define $x$ in $W$ using ordinal parameters, along with $\dot b$ and the fixed enumeration of $\mathbb{P}$. So $W$ thinks the universe is $\HOD$ of a set, as desired.

Since the generic multiverse is obtained by iteratively moving to forcing extensions and to grounds, and each such move preserves the axiom, it follows that $\exists b\, (V=\HOD(b))$ is constant throughout the generic multiverse. QED

Theorem. If $V=\HOD(b)$, then there is a forcing extension $V[G]$ in which $V=\HOD$ holds.

Proof. We are working in ZFC. Suppose that $V=\HOD(b)$. We may assume $b$ is a set of ordinals, since such sets can code any given set. Consider the following forcing iteration: first add a Cohen real $c$, and then perform forcing $G$ that codes $c$, $P(\omega)^V$ and $b$ into the GCH pattern at uncountable cardinals, and then perform self-encoding forcing $H$ above that coding, coding also $G$ (see my paper on Set-theoretic geology for further details on self-encoding forcing). In the final model $V[c][G][H]$, therefore, the objects $c$, $b$, $P(\omega)^V$, $G$ and $H$ are all definable without parameters. Since $V\subset V[c][G][H]$ has a closure point at $\omega$, it satisfies the $\omega_1$-approximation and cover properties, and therefore the class $V$ is definable in $V[c][G][H]$ using $P(\omega)^V$ as a parameter. Since this parameter is itself definable without parameters, it follows that $V$ is parameter-free definable in $V[c][G][H]$. Since $b$ is also definable there, it follows that every element of $\HOD(b)^V=V$ is ordinal-definable in $V[c][G][H]$. And since $c$, $G$ and $H$ are also definable without parameters, we have $V[c][G][H]\models V=\HOD$, as desired. QED

Corollary. The following are equivalent.

1. The universe is $\HOD$ of a set: $\exists b\, (V=\HOD(b))$.
2. Somewhere in the generic multiverse, the universe is $\HOD$ of a set.
3. Somewhere in the generic multiverse, the axiom $V=\HOD$ holds.
4. The axiom $V=\HOD$ is forceable.

Proof. This is an immediate consequence of the previous theorems. $1\to 4\to 3\to 2\to 1$. QED

Corollary. The axiom $V=\HOD$, if true, even if true anywhere in the generic multiverse, is a switch.

Proof. A switch is a statement such that both it and its negation are necessarily possible by forcing; that is, in every set forcing extension, one can force the statement to be true and also force it to be false. We can always force $V=\HOD$ to fail, simply by adding a Cohen real. If $V=\HOD$ is true, then by the first theorem, every forcing extension has $V=\HOD(b)$ for some $b$, in which case $V=\HOD$ remains forceable, by the second theorem. QED

# Different models of set theory with the same subset relation

Recently Makoto Kikuchi (Kobe University) asked me the following interesting question, which arises very naturally if one should adopt a mereological perspective in the foundations of mathematics, placing a focus on the parthood relation rather than the element-of relation. In set theory, this perspective would lead one to view the subset or inclusion relation $\subseteq$ as the primary fundamental relation, rather than the membership $\in$ relation.

Question. Can there be two different models of set theory, with the same inclusion relation?

We spent an evening discussing it, over delicious (Rokko-michi-style) okonomiyaki and bi-ru, just like old times, except that we are in Tokyo at the CTFM 2015, and I’d like to explain the answer, which is yes, this always happens in every model of set theory.

Theorem. In any universe of set theory $\langle V,\in\rangle$, there is a definable relation $\in^*$, different from $\in$, such that $\langle V,\in^*\rangle$ is a model of set theory, in fact isomorphic to the original universe $\langle V,\in\rangle$, for which the corresponding inclusion relation $$u\subseteq^* v\iff \forall a\, (a\in^* u\to a\in^* v)$$ is identical to the usual inclusion relation $u\subseteq v$.

Proof. Let $\theta:V\to V$ be any definable non-identity permutation of the universe, and let $\tau:u\mapsto \theta[u]=\{\ \theta(a)\mid a\in u\ \}$ be the function determined by pointwise image under $\theta$. Since $\theta$ is bijective, it follows that $\tau$ is also a bijection of $V$ to $V$, since every set is the $\theta$-image of a unique set. Furthermore, $\tau$ is an automorphism of $\langle V,\subseteq\rangle$, since $$u\subseteq v\iff\theta[u]\subseteq\theta[v]\iff\tau(u) \subseteq\tau(v).$$ I had used this idea a few years ago in my answer to the MathOverflow question, Is the inclusion version of Kunen inconsistency theorem true?, which shows that there are nontrivial $\subseteq$ automorphisms of the universe. Note that since $\tau(\{a\})=\{\theta(a)\}$, it follows that any instance of nontriviality $\theta(a)\neq a$ in $\theta$ leads immediately to an instance of nontriviality in $\tau$.

Using the map $\tau$, define $a\in^* b\iff\tau(a)\in\tau(b)$. By definition, therefore, $\tau$ is an isomorphism of $\langle V,\in^*\rangle\cong\langle V,\in\rangle$. Let us show that $\in^*\neq \in$. Since $\theta$ is nontrivial, there is an $\in$-minimal set $a$ with $\theta(a)\neq a$. By minimality, $\theta[a]=a$ and so $\tau(a)=a$. But as mentioned, $\tau(\{a\})=\{\theta(a)\}\neq\{a\}$. So we have $a\in\{a\}$, but $\tau(a)=a\notin\{\theta(a)\}=\tau(\{a\})$ and hence $a\notin^*\{a\}$. So the two relations are different.

Meanwhile, consider the corresponding subset relation. Specifically, $u\subseteq^* v$ is defined to mean $\forall a\,(a\in^* u\to a\in^* v)$, which holds if and only if $\forall a\, (\tau(a)\in\tau(u)\to \tau(a)\in\tau(v))$; but since $\tau$ is surjective, this holds if and only if $\tau(u)\subseteq \tau(v)$, which as we observed at the beginning of the proof, holds if and only if $u\subseteq v$. So the corresponding subset relations $\subseteq^*$ and $\subseteq$ are identical, as desired.

Another way to express what is going on is that $\tau$ is an isomorphism of the structure $\langle V,{\in^*},{\subseteq}\rangle$ with $\langle V,{\in},{\subseteq}\rangle$, and so $\subseteq$ is in fact that same as the corresponding inclusion relation $\subseteq^*$ that one would define from $\in^*$. QED
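A tiny concrete instance of the construction may help. The sketch below (Python; hereditarily finite sets as frozensets, with $\theta$ chosen as the transposition swapping $\emptyset$ and $\{\emptyset\}$, which is only a finite fragment of a full permutation of $V$) exhibits $\in^*\neq\in$ while the induced inclusion agrees with $\subseteq$ on the sets involved:

```python
E = frozenset()          # the empty set ∅
SE = frozenset([E])      # the singleton {∅}

def theta(a):
    """A nontrivial permutation: swap ∅ and {∅}, fix everything else."""
    if a == E:
        return SE
    if a == SE:
        return E
    return a

def tau(u):
    """Pointwise image: tau(u) = theta[u] = { theta(a) : a in u }."""
    return frozenset(theta(a) for a in u)

def mem_star(a, b):
    """The twisted membership relation: a ∈* b iff tau(a) ∈ tau(b)."""
    return tau(a) in tau(b)
```

Here $\emptyset$ is the $\in$-minimal set moved by $\theta$, so $\tau(\emptyset)=\emptyset$, while $\tau(\{\emptyset\})=\{\theta(\emptyset)\}=\{\{\emptyset\}\}$; thus $\emptyset\in\{\emptyset\}$ but $\emptyset\notin^*\{\emptyset\}$, exactly as in the proof, while $u\subseteq v\iff\tau(u)\subseteq\tau(v)$ holds throughout.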

Corollary. One cannot define $\in$ from $\subseteq$ in a model of set theory.

Proof. The map $\tau$ is a $\subseteq$-automorphism, and so it preserves every relation definable from $\subseteq$, but it does not preserve $\in$. QED

Nevertheless, I claim that the isomorphism type of $\langle V,\in\rangle$ is implicit in the inclusion relation $\subseteq$, in the sense that any other class relation $\in^*$ having that same inclusion relation is isomorphic to the $\in$ relation.

Theorem. Assume ZFC in the universe $\langle V,\in\rangle$. Suppose that $\in^*$ is a class relation for which $\langle V,\in^*\rangle$ is a model of set theory (a weak set theory suffices), such that the corresponding inclusion relation $$u\subseteq^* v\iff\forall a\,(a\in^* u\to a\in^* v)$$is the same as the usual inclusion relation $u\subseteq v$. Then the two membership relations are isomorphic $$\langle V,\in\rangle\cong\langle V,\in^*\rangle.$$

Proof. Since the singleton set $\{a\}$ has exactly two subsets with respect to the usual $\subseteq$ relation — the empty set and itself — this must also be true with respect to the inclusion relation $\subseteq^*$ defined via $\in^*$, since we have assumed $\subseteq^*=\subseteq$. Thus, the object $\{a\}$ is also a singleton with respect to $\in^*$, and so there is a unique object $\eta(a)$ such that $x\in^* a\iff x=\eta(a)$. By extensionality and since every object has its singleton, it follows that $\eta:V\to V$ is both one-to-one and onto. Let $\theta=\eta^{-1}$ be the inverse permutation.

Observe that $a\in u\iff \{a\}\subseteq u\iff \{a\}\subseteq^* u\iff\eta(a)\in^* u$. Thus, $$b\in^* u\iff \theta(b)\in u.$$

Using $\in$-recursion, define $b^*=\{\ \theta(a^*)\mid a\in b\ \}$. The map $b\mapsto b^*$ is one-to-one by $\in$-recursion, since if there is no violation of this for the elements of $b$, then we may recover $b$ from $b^*$ by applying $\theta^{-1}$ to the elements of $b^*$ and then using the induction assumption to find the unique $a$ from $a^*$ for each $\theta(a^*)\in b^*$, thereby recovering $b$. So $b\mapsto b^*$ is injective.

I claim that this map is also surjective. If $y_0\neq b^*$ for any $b$, then there must be an element of $y_0$ that is not of the form $\theta(b^*)$ for any $b$. Since $\theta$ is surjective, this means there is $\theta(y_1)\in y_0$ with $y_1\neq b^*$ for any $b$. Continuing, there is $y_{n+1}$ with $\theta(y_{n+1})\in y_n$ and $y_{n+1}\neq b^*$ for any $b$. Let $z=\{\ \theta(y_n)\mid n\in\omega\ \}$. Since $x\in^* u\iff \theta(x)\in u$, it follows that the $\in^*$-elements of $z$ are precisely the $y_n$’s. But $\theta(y_{n+1})\in y_n$, and so $y_{n+1}\in^* y_n$. So $z$ has no $\in^*$-minimal element, violating the axiom of foundation for $\in^*$, a contradiction. So the map $b\mapsto b^*$ is a bijection of $V$ with $V$.

Finally, we observe that because $$a\in b\iff\theta(a^*)\in b^*\iff a^*\in^* b^*,$$ it follows that the map $b\mapsto b^*$ is an isomorphism of $\langle V,\in\rangle$ with $\langle V,\in^*\rangle$, as desired. QED

The conclusion is that although $\in$ is not definable from $\subseteq$, nevertheless, the isomorphism type of $\in$ is implicit in $\subseteq$, in the sense that any other class relation $\in^*$ giving rise to the same inclusion relation $\subseteq^*=\subseteq$ is isomorphic to $\in$.

Meanwhile, I do not yet know what the situation is when one drops the assumption that $\in^*$ is a class with respect to the $\langle V,\in\rangle$ universe.

Question. Can there be two models of set theory $\langle M,\in\rangle$ and $\langle M,\in^*\rangle$, not necessarily classes with respect to each other, which have the same inclusion relation $\subseteq=\subseteq^*$, but which are not isomorphic?

(This question is now answered! See my joint paper with Kikuchi at Set-theoretic mereology.)

# Upward countable closure in the generic multiverse of forcing to add a Cohen real

I’d like to discuss my theorem that the collection of models $M[c]$ obtained by adding an $M$-generic Cohen real $c$ over a fixed countable transitive model of set theory $M$ is upwardly countably closed, in the sense that every increasing countable chain has an upper bound.

I proved this theorem back in 2011, while at the Young Set Theory Workshop in Bonn and continuing at the London summer school on set theory, in a series of conversations with Giorgio Venturi. The argument has recently come up again in various discussions, and so let me give an account of it.

We consider the collection of all forcing extensions of a fixed countable transitive model $M$ of ZFC by the forcing to add a Cohen real, models of the form $M[c]$, and consider the question of whether every countable increasing chain of these models has an upper bound. The answer is yes!  (Actually, Giorgio wants to undertake forcing constructions by forcing over this collection of models to add a generic upward directed system of models; it follows from this theorem that this forcing is countably closed.) This theorem fits into the theme of my earlier post, Upward closure in the toy multiverse of all countable models of set theory, where similar theorems are proved, but not this one exactly.

Theorem. For any countable transitive model $M\models\text{ZFC}$, the collection of all forcing extensions $M[c]$ by adding an $M$-generic Cohen real is upward-countably closed. That is, for any countable tower of such forcing extensions
$$M[c_0]\subset M[c_1]\subset\cdots\subset M[c_n]\subset\cdots,$$
we may find an $M$-generic Cohen real $d$ such that $M[c_n]\subset M[d]$ for every natural number $n$.

Proof. $\newcommand\Add{\text{Add}}$Suppose that we have such a tower of forcing extensions $M[c_0]\subset M[c_1]\subset\cdots$, and so on. Note that if $M[b]\subset M[c]$ for $M$-generic Cohen reals $b$ and $c$, then $M[c]$ is a forcing extension of $M[b]$ by a quotient of the Cohen-real forcing. But since the Cohen forcing itself has a countable dense set, it follows that all such quotients also have a countable dense set, and so $M[c]$ is actually $M[b][b_1]$ for some $M[b]$-generic Cohen real $b_1$. Thus, we may view the tower as having the form:
$$M[b_0]\subset M[b_0\times b_1]\subset\cdots\subset M[b_0\times b_1\times\cdots\times b_n]\subset\cdots,$$
where now it follows that any finite collection of the reals $b_i$ are mutually $M$-generic.

Of course, we cannot expect in general that the real $\langle b_n\mid n<\omega\rangle$ is $M$-generic for $\Add(\omega,\omega)$, since this real may be very badly behaved. For example, the sequence of first-bits of the $b_n$’s may code a very naughty real $z$, which cannot be added by forcing over $M$ at all. So in general, we cannot allow that this sequence is added to the limit model $M[d]$. (See further discussion in my post Upward closure in the toy multiverse of all countable models of set theory.)

We shall instead undertake a construction by making finitely many changes to each real $b_n$, resulting in a real $d_n$, in such a way that the resulting combined real $d=\oplus_n d_n$ is $M$-generic for the forcing to add $\omega$-many Cohen reals, which is of course isomorphic to adding just one. To do this, let’s get a little more clear with our notation. We regard each $b_n$ as an element of Cantor space $2^\omega$, that is, an infinite binary sequence, and the corresponding filter associated with this real is the collection of finite initial segments of $b_n$, which will be an $M$-generic filter through the partial order of finite binary sequences $2^{<\omega}$, which is one of the standard isomorphic copies of Cohen forcing. We will think of $d$ as a binary function on the plane $d:\omega\times\omega\to 2$, where the $n^{th}$ slice $d_n$ is the corresponding function $\omega\to 2$ obtained by fixing the first coordinate to be $n$.

Now, we enumerate as $D_0$, $D_1$, and so on the open dense subsets in $M$ of the forcing to add a Cohen real $\omega\times\omega\to 2$; there are only countably many such dense sets, because $M$ is countable. We construct $d$ in stages. Before stage $n$, we will have completely specified $d_k$ for $k<n$, and we also may be committed to a finite condition $p_{n-1}$ in the forcing to add $\omega$ many Cohen reals. We consider the dense set $D_n$. We may factor $\Add(\omega,\omega)$ as $\Add(\omega,n)\times\Add(\omega,[n,\omega))$. Since $d_0\times\cdots\times d_{n-1}$ is actually $M$-generic (as these reals are finite modifications of the corresponding $b_k$’s, which are mutually $M$-generic), it follows that there is some finite extension of our condition $p_{n-1}$ to a condition $p_n\in D_n$ that is compatible with $d_0\times\cdots\times d_{n-1}$. Let $d_n$ be the same as $b_n$, except finitely modified so as to be compatible with $p_n$. In this way, our final real $\oplus_n d_n$ will contain all the conditions $p_n$, and therefore be $M$-generic for $\Add(\omega,\omega)$, yet every $b_n$ will differ only finitely from $d_n$ and hence be an element of $M[d]$. So we have $M[b_0]\cdots[b_n]\subset M[d]$, and we have found our upper bound. QED
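The bookkeeping of the modification step can be illustrated with a finite sketch, assuming nothing beyond the setup of the proof; the function names here are my own invention, not part of the argument:

```python
# A finite sketch (names are my own) of the modification step: the n-th
# slice of d agrees with the real b_n except at the finitely many
# coordinates promised by the conditions p_n.  Reals are modeled as
# functions from naturals to bits, and the combined finite conditions
# as a dict on the plane omega x omega.

def build_d(b, conditions):
    """d : omega x omega -> 2 whose n-th slice is b[n], finitely modified."""
    def d(n, m):
        return conditions.get((n, m), b[n](m))
    return d

# Three sample "reals" and finitely many promised bits:
b = {0: lambda m: m % 2, 1: lambda m: 0, 2: lambda m: 1}
p = {(1, 3): 1, (2, 0): 0}
d = build_d(b, p)

# Each slice differs from the corresponding b_n only where p made a promise:
for n in b:
    diffs = [m for m in range(100) if d(n, m) != b[n](m)]
    assert all((n, m) in p for m in diffs)
```

Of course, in the actual proof the conditions $p_n$ are chosen generically, meeting each dense set in turn; the sketch only records how the finitely many promised bits override each $b_n$.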

Notice that the real $d$ we construct is not only $M$-generic, but also $M[c_n]$-generic for every $n$.

My related post, Upward closure in the toy multiverse of all countable models of set theory, which is based on material in my paper Set-theoretic geology, discusses some similar results.

# Every ordinal has only finitely many order-types for its final segments

I was recently asked an interesting elementary question about the number of possible order types of the final segments of an ordinal, and in particular, whether there could be an ordinal realizing infinitely many different such order types as final segments.  Since I found it interesting, let me write here how I replied.

The person asking me had noted that every nonempty final segment of the first infinite ordinal $\omega$ is isomorphic to $\omega$ again, since if you start counting from $5$ or from a million, you have just as far to go in the natural numbers. Thus, if one includes the empty final segment, there are precisely two order-types that arise as final segments of $\omega$, namely, $0$ and $\omega$ itself. A finite ordinal $n$, in contrast, has precisely $n+1$ many final segments, corresponding to each of the possible cuts between any of the elements or before all of them or after all of them, and these final segments, considered as orders themselves, all have different sizes and hence are not isomorphic.

He wanted to know whether an ordinal could have infinitely many different order-types for its tails.

Question. Is there an ordinal having infinitely many different isomorphism types for its final segments?

The answer is no, and I’d like to explain why. I’ll discuss two different arguments, the first being an easy direct argument aimed only at this answer, and the second being a more careful analysis aimed at understanding exactly how many and which order-types arise as the order type of a final segment of an ordinal $\alpha$.

Theorem. Every ordinal has only finitely many order types of its final segments.

Proof: Suppose that $\alpha$ is an ordinal, and consider the order types of the final segments $[\eta,\alpha)$, for $\eta\leq\alpha$. Note that as $\eta$ increases, the final segment $[\eta,\alpha)$ becomes smaller as a suborder, and so its order type does not go up. And since these are well-orders, it can go down only finitely many times. So only finitely many order types arise, and the theorem is proved. QED

But let’s figure out exactly how many and which order types arise.

Theorem. The number of order types of final segments of an ordinal $\alpha$ is precisely $n+1$, where $n$ is the number of terms in the Cantor normal form of $\alpha$, and one can describe those order types in terms of the normal form of $\alpha$.

Cantor proved that every ordinal $\alpha$ can be uniquely expressed as a finite sum $$\alpha=\omega^{\beta_n}+\cdots+\omega^{\beta_0},$$ where $\beta_n\geq\cdots\geq\beta_0$, and this is called the Cantor normal form of the ordinal. There are alternative forms, where one allows terms like $\omega^\beta\cdot n$ for finite $n$, but in my favored formulation, one simply expands this into $n$ terms with $\omega^\beta+\cdots+\omega^\beta$. In particular, the ordinal $\omega=\omega^1$ has exactly one term in its Cantor normal form, and a finite number $n=\omega^0+\cdots+\omega^0$ has exactly $n$ terms in its Cantor normal form. So the statement of the theorem agrees with the calculations that we had made at the very beginning.

Proof: First, let’s observe that every nonempty final segment of an ordinal of the form $\omega^\beta$ is isomorphic to $\omega^\beta$ again. This amounts to the fact that ordinals of the form $\omega^\beta$ are additively indecomposable, or in other words, closed under ordinal addition, since the final segments of an ordinal $\alpha$ are precisely the ordinals $\zeta$ such that $\alpha=\xi+\zeta$ for some $\xi$. If $\alpha$ is additively indecomposable, then it cannot be that $\zeta<\alpha$, and so all final segments would be isomorphic to $\alpha$. So let’s prove that $\omega^\beta$ is additively indecomposable. This is clear if $\beta=0$, since the only ordinal less than $\omega^0=1$ is $0$ and $0+0<1$. If $\beta$ is a limit ordinal, then the ordinals $\omega^\eta$ for $\eta<\beta$ are unbounded in $\omega^\beta$, and adding them stays below because $\omega^\eta+\omega^\eta=\omega^\eta\cdot 2\leq\omega^\eta\cdot\omega=\omega^{\eta+1}<\omega^\beta$. If $\beta=\delta+1$ is a successor ordinal, then $\omega^\beta=\omega^{\delta+1}=\omega^\delta\cdot\omega=\sup_{n<\omega}\omega^\delta\cdot n$, but again adding them stays below because $\omega^\delta\cdot n+\omega^\delta\cdot m=\omega^\delta\cdot(n+m) < \omega^\delta\cdot\omega=\omega^\beta$.
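The additive absorption used here can be checked computationally for small ordinals. This is a minimal sketch with an encoding of my own: an ordinal below $\omega^\omega$ is represented by the non-increasing tuple of exponents in its Cantor normal form, on which Python's tuple comparison happens to agree with the ordinal order:

```python
import itertools

# Ordinals below omega^omega as non-increasing tuples of CNF exponents,
# e.g. omega^2 + omega + 1 is (2, 1, 0) and 0 is the empty tuple ().
# This encoding is illustrative, not from the post; lexicographic tuple
# comparison agrees with ordinal comparison on such tuples.

def cnf_add(a, b):
    """Ordinal addition in Cantor normal form: the terms of a strictly
    below the leading term of b are absorbed into b."""
    if not b:
        return a
    return tuple(e for e in a if e >= b[0]) + b

# omega^2 is the single term (2,); ordinals below it have exponents 0 or 1.
below_omega_sq = [tuple(sorted(t, reverse=True))
                  for k in range(4)
                  for t in itertools.product([0, 1], repeat=k)]

# Closure under addition witnesses that omega^2 is additively indecomposable:
for x in below_omega_sq:
    for y in below_omega_sq:
        assert cnf_add(x, y) < (2,)

# In contrast, omega + 1 = (1, 0) is reached by adding two smaller ordinals:
assert cnf_add((1,), (0,)) == (1, 0)
```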

To prove the theorem, consider any ordinal $\alpha$ with Cantor normal form $\alpha=\omega^{\beta_n}+\cdots+\omega^{\beta_0}$, where $\beta_n\geq\cdots\geq\beta_0$. So as an order type, $\alpha$ consists of finitely many pieces, the first of type $\omega^{\beta_n}$, the next of type $\omega^{\beta_{n-1}}$ and so on up to $\omega^{\beta_0}$. Any final segment of $\alpha$ therefore consists of a final segment of one of these pieces, together with all the pieces after it (and omitting any pieces prior to it). But since these pieces all have the form $\omega^{\beta_i}$, they are additively indecomposable and therefore are isomorphic to all their nonempty final segments. So any final segment of $\alpha$ is order-isomorphic to an ordinal whose Cantor normal form simply omits some (or none) of the terms from the front of the Cantor normal form of $\alpha$. Since a final segment may begin with any one of the terms of the normal form, or omit them all, the number of order types of the final segments of $\alpha$ is exactly one more than the number of terms, as the theorem asserts.

The argument shows, furthermore, that the possible order types of the final segments of $\alpha$, where $\alpha=\omega^{\beta_n}+\cdots+\omega^{\beta_0}$, are precisely the ordinals of the form $\omega^{\beta_k}+\cdots+\omega^{\beta_0}$ for $k\leq n$, obtained by omitting terms only from the front, together with the empty final segment, of order type $0$. QED
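If we represent an ordinal by the tuple of exponents in its Cantor normal form (an illustrative encoding of my own), the theorem says the final-segment order types are exactly the suffixes of the term list. A minimal sketch:

```python
# CNF as a non-increasing tuple of exponents: omega^2 + omega + omega + 1
# is (2, 1, 1, 0), omega is (1,), and the ordinal 0 is ().  The order
# types of final segments are then the suffixes of the term list.

def final_segment_types(cnf):
    """Order types of final segments: drop some (possibly no, possibly
    all) terms from the front of the Cantor normal form."""
    return {cnf[k:] for k in range(len(cnf) + 1)}

# omega has exactly two final-segment types, 0 and omega itself:
assert final_segment_types((1,)) == {(), (1,)}

# An ordinal with four terms has exactly five types, since the suffixes
# all have different lengths:
assert len(final_segment_types((2, 1, 1, 0))) == 5

# A finite ordinal n, written as n terms omega^0, has n + 1 final segments:
assert final_segment_types((0, 0, 0)) == {(), (0,), (0, 0), (0, 0, 0)}
```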

# The axiom of determinacy for small sets

I should like to argue that the axiom of determinacy is true for all games having a small payoff set. In particular, the size of the smallest non-determined set, in the sense of the axiom of determinacy, is the continuum; every set of size less than the continuum is determined, even when the continuum is enormous.

We consider two-player games of perfect information. Two players, taking turns, play moves from a fixed space $X$ of possible moves, and thereby together build a particular play or instance of the game $\vec a=\langle a_0,a_1,\ldots\rangle\in X^\omega$. The winner of this instance of the game is determined according to whether the play $\vec a$ is a member of some fixed payoff set $U\subset X^\omega$ specifying the winning condition for this game. Namely, the first player wins in the case $\vec a\in U$.

A strategy in such a game is a function $\sigma:X^{<\omega}\to X$ that instructs a particular player how to move next, given the position of partial play, and such a strategy is a winning strategy for that player if every play that accords with it is a win for that player. (The first player applies the strategy $\sigma$ only to the even-length positions, and the second player only to the odd-length positions.) The game is determined if one of the players has a winning strategy.

It is not difficult to see that if $U$ is countable, then the game is determined. To see this, note first that if the space of moves $X$ has at most one element, then the game is trivial and hence determined; and so we may assume that $X$ has at least two elements. If the payoff set $U$ is countable, then we may enumerate it as $U=\{s_0,s_1,\ldots\}$. Let the opposing player now adopt the strategy of ensuring on the $n^{th}$ move that the resulting play is different from $s_n$. In this way, the opposing player will ensure that the play is not in $U$, and therefore win. So every game with a countable payoff set is determined.
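The diagonal strategy can be simulated for finitely many rounds. In this sketch all names are hypothetical: moves come from $X=\{0,1\}$, the second player controls the odd coordinates of the play, and on her $n$-th move she disagrees with the enumerated play $s_n$:

```python
# A finite simulation of the diagonal strategy (all names hypothetical):
# the second player's n-th move occupies coordinate 2n+1 of the play,
# and there she disagrees with the enumerated play s_n.

def diagonal_move(n, s_n):
    """The second player's n-th move: differ from s_n at the coordinate
    she controls."""
    return 1 - s_n(2 * n + 1)

def play_game(first_player, enumeration, rounds):
    """Build a play prefix: the first player moves arbitrarily, while the
    second player diagonalizes against the enumerated payoff set."""
    play = []
    for n in range(rounds):
        play.append(first_player(tuple(play)))         # move 2n, player I
        play.append(diagonal_move(n, enumeration[n]))  # move 2n+1, player II
    return play

# A countable payoff set, enumerated by simple rules:
s = [lambda i: 0, lambda i: 1, lambda i: i % 2]
play = play_game(lambda pos: 1, s, rounds=3)

# The play differs from each s_n at coordinate 2n+1, hence avoids the payoff set:
for n, s_n in enumerate(s):
    assert play[2 * n + 1] != s_n(2 * n + 1)
```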

Meanwhile, using the axiom of choice, we may construct a non-determined set even for the case $X=\{0,1\}$, as follows. Since a strategy is a function from finite binary sequences to $\{0,1\}$, there are only continuum many strategies. By the axiom of choice, we may well-order the strategies in order type continuum. Let us define a payoff set $U$ by a transfinite recursive procedure: at each stage, we will have made fewer than continuum many promises about membership and non-membership in $U$; we consider the next strategy on the list; since there are continuum many plays that accord with that strategy for each particular player, we may make two additional promises about $U$ by placing one of these plays into $U$ and one out of $U$ in such a way that this strategy is defeated as a winning strategy for either player. The result of the recursion is a non-determined set of size continuum.

So what is the size of the smallest non-determined set? For a lower bound, we argued above that every countable payoff set is determined, and so the smallest non-determined set must be uncountable, of size at least $\aleph_1$. For an upper bound, we constructed a non-determined set of size continuum. Thus, if the continuum hypothesis holds, then the smallest non-determined set has size exactly continuum, which is $\aleph_1$ in this case. But what if the continuum hypothesis fails? I claim, nevertheless, that the smallest non-determined set still has size continuum.

Theorem. Every game whose winning condition is a set of size less than the continuum is determined.

Proof. Suppose that $U\subset X^\omega$ is the payoff set of the game under consideration, so that $U$ has size less than the continuum. If $X$ has at most one element, then the game is trivial and hence determined. So we may assume that $X$ has at least two elements. Let us partition the elements of $X^\omega$ according to the moves made by the second player, so that two plays are equivalent exactly when the second player makes the same moves in each. Since $X$ has at least two elements, there are at least continuum many classes in this partition. Since $U$ has size less than the continuum, therefore, it must be disjoint from at least one (and in fact from most) of the classes of this partition (since otherwise we would have an injection from the continuum into $U$). So there is a fixed sequence of moves for the second player such that in any instance of the game in which the second player makes those moves, the result is not in $U$ and hence is a win for the second player. This is a winning strategy for the second player, and so the game is determined. QED
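A finite analogue of this counting argument can be computed directly (names illustrative): for plays of length $2k$ over $X=\{0,1\}$ there are $2^k$ possible move sequences for the second player, so a payoff set with fewer elements must miss one of them, and playing it blindly wins:

```python
import itertools

# A finite analogue of the counting argument: find a fixed sequence of
# second-player moves meeting no play in a small payoff set.

def second_player_moves(play):
    """Extract the second player's moves: the odd-indexed coordinates."""
    return tuple(play[1::2])

def blind_winning_moves(payoff, k):
    """Second-player moves for plays of length 2k that meet no play in
    the payoff set; one exists whenever len(payoff) < 2**k."""
    used = {second_player_moves(u) for u in payoff}
    for candidate in itertools.product([0, 1], repeat=k):
        if candidate not in used:
            return candidate
    return None

# A payoff set of size 3 < 2**2, for plays of length 4:
U = [(0, 0, 0, 0), (1, 0, 1, 1), (0, 1, 1, 0)]
moves = blind_winning_moves(U, k=2)
assert moves is not None

# No matter what the first player does, the resulting play is outside U:
for first in itertools.product([0, 1], repeat=2):
    play = (first[0], moves[0], first[1], moves[1])
    assert play not in U
```

Note that the winning moves are fixed in advance, not depending on the first player's play, just as in the proof.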

This proof generalizes the conclusion of the diagonalization argument against a countable payoff set, by showing that for any winning condition set of size less than continuum, there is a fixed play for the opponent (not depending on the play of the first player) that defeats it.

The proof of the theorem uses the axiom of choice in the step where we deduce that $U$ must be disjoint from a piece of the partition, since there are continuum many such pieces and $U$ had size less than the continuum. Without the axiom of choice, this conclusion does not follow. Nevertheless, what the proof does show without AC is that every set that does not surject onto $\mathbb{R}$ is determined, since if $U$ contained an element from every piece of the partition it would surject onto $\mathbb{R}$. Without AC, the assumption that $U$ does not surject onto $\mathbb{R}$ is stronger than the assumption merely that it has size less than the continuum, although these properties are equivalent in ZFC.  Meanwhile, these issues are relevant in light of the model suggested by Asaf Karagila in the comments below, which shows that it is consistent with ZF without the axiom of choice that there are small non-determined sets. Namely, the result of Monro shows that it is consistent with ZF that $\mathbb{R}=A\sqcup B$, where both $A$ and $B$ have cardinality less than the continuum. In particular, in this model the continuum injects into neither $A$ nor $B$, and consequently neither player can have a strategy to force the play into their side of this partition. Thus, both $A$ and $B$ are non-determined, even though they have size less than the continuum.

# Determinacy for proper-class clopen games is equivalent to transfinite recursion along proper-class well-founded relations

I’d like to continue a bit further my exploration of some principles of determinacy for proper-class games; it turns out that these principles have a surprising set-theoretic strength.  A few weeks ago, I explained that the determinacy of proper-class open games and even clopen games implies Con(ZFC) and much more.  Today, I’d like to prove that clopen determinacy is exactly equivalent over GBC to the principle of transfinite recursion along proper-class well-founded relations.  Thus, GBC plus either of these principles is a strictly intermediate set theory between GBC and KM.

The principle of clopen determinacy for class games is the assertion that in any two-player infinite game of perfect information whose winning condition is a clopen class, there is a winning strategy for one of the players. Players alternately play moves in a playing space $X$, thereby creating a particular play $\vec a\in X^\omega$, and the winner is determined according to whether $\vec a$ is in a certain fixed payoff class $U\subset X^\omega$ or not. One has an open game when this winning condition class $U$ is open in the product topology (using the discrete topology on $X$). A game is open for a player if and only if every winning play for that player has an initial segment, all of whose extensions are also winning for that player. So the game is won for an open player at a finite stage of play. A clopen game, in contrast, has a payoff set that is open for both players. Clopen games can be equivalently cast in terms of the game tree, consisting of positions in the game where the winner is not yet determined, and where play terminates when the winner is known. Namely, a game is clopen exactly when this game tree is well-founded, so that in every play, the outcome is known already at a finite stage.

A strategy is a class function $\sigma:X^{<\omega}\to X$ that instructs the player what to play next, given a position of partial play, and the strategy is winning for a player if all plays that accord with it satisfy the winning condition for that player.

The principle of transfinite recursion along well-founded class relations is the assertion that we may undertake recursive definitions along any class well-founded partial order relation. That is, suppose that $\lhd$ is a class well-founded partial order relation on a class $A$, and suppose that $\varphi(F,a,y)$ is a formula, using only first-order quantifiers but having a class variable $F$, which is functional in the sense that for any class $F$ and any set $a\in A$ there is a unique $y$ such that $\varphi(F,a,y)$. The idea is that $\varphi(F,a,y)$ expresses the recursive rule to be iterated, and a solution of the recursion is a class function $F$ such that $\varphi(F\upharpoonright a,a,F(a))$ holds for every $a\in A$, where $F\upharpoonright a$ means the restriction of $F$ to the class $\{ b\in A\mid b\lhd a\}$. Thus, the value $F(a)$ is determined by the class of previous values $F(b)$ for $b\lhd a$. The principle of transfinite recursion along class well-founded relations is the assertion scheme that for every such well-founded partial order class $\langle A,\lhd\rangle$ and any recursive rule $\varphi$ as above, there is a solution.
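In the set-sized case the principle reduces to ordinary well-founded recursion, which can be sketched concretely (the names are mine): the relation is given by its predecessor map, and the rule computes $F(a)$ from the restriction $F\upharpoonright a$:

```python
# A set-sized sketch of the recursion principle: solve
# F(a) = rule(F restricted to the predecessors of a, a)
# by recursing along a well-founded relation given as a predecessor map.

def solve_recursion(predecessors, rule):
    """Return the solution F of the recursion as a dict."""
    solution = {}

    def F(a):
        if a not in solution:
            restriction = {b: F(b) for b in predecessors[a]}  # F | a
            solution[a] = rule(restriction, a)
        return solution[a]

    for a in predecessors:
        F(a)
    return solution

# Example rule: ordinal rank, F(a) = sup { F(b) + 1 : b a predecessor of a }.
pred = {'a': [], 'b': ['a'], 'c': ['a', 'b'], 'd': ['b', 'c']}
rank = solve_recursion(pred,
                       lambda below, a: max((v + 1 for v in below.values()),
                                            default=0))
assert rank == {'a': 0, 'b': 1, 'c': 2, 'd': 3}
```

The content of the principle, of course, lies entirely in the non-set-like proper-class case, where no such direct recursion is available.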

In the case that the relation $\lhd$ is set-like, which means that the predecessors $\{b\mid b\lhd a\}$ of any point $a$ form a set (rather than a proper class), then GBC easily proves that there is a unique solution class, which furthermore is definable from $\lhd$. Namely, one can show that every $a\in A$ has a partial solution that obeys the recursive rule at least up to $a$, and furthermore all such partial solutions agree below $a$, because there can be no $\lhd$-minimal violation of this. It follows that the class function $F$ unifying these partial solutions is a total solution to the recursion. Similarly, GBC can prove that there are solutions to other transfinite recursion instances where the well-founded relation is not necessarily set-like, such as a recursion of length $\text{Ord}+\text{Ord}$ or even much longer.

Meanwhile, if GBC is consistent, then it cannot in general prove that transfinite recursions along non-set-like well-founded relations always succeed, since this principle would imply that there is a truth-predicate for first-order truth, as the Tarskian conditions are precisely such a recursion on a well-founded relation based on the complexity of formulas. (That relation is not set-like, since when considering the truth of $\exists x\,\psi(x,\vec a)$, we want to consider the truth of $\psi(b,\vec a)$ for any parameter $b$, and there are a proper class of such $b$.) Thus, GBC plus transfinite recursion (or plus clopen determinacy) is strictly stronger than GBC, although it is provable in Kelley-Morse set theory KM, in essentially the same way that GBC proves the set-like special case.

Theorem. Assume GBC. Then the following are equivalent.

1. Clopen determinacy for class games. That is, for any two-player game of perfect information whose payoff class is both open and closed, there is a winning strategy for one of the players.
2. Transfinite recursion for proper class well-founded relations (not necessarily set-like).

Proof. ($2\to 1$) Assume the principle of transfinite recursion for proper class well-founded relations, and suppose we are faced with a clopen game. Consider the game tree $T$, consisting of positions arising during play, up to the moment that a winner is known. This tree is well-founded because the game is clopen. Let us label the terminal nodes of the tree with I or II according to who has won the game in that position, and more generally, let us label all the nodes of the tree with I or II according to the following transfinite recursion: if a node has I to play, then it will have label I if there is a move to a node labeled I, and otherwise II; and similarly when it is II to play. By the principle of transfinite recursion, there is a labeling of the entire tree that accords with this recursive rule. It is now easy to see that if the initial node is labeled with I, then player I has a winning strategy, which is simply to stay on the nodes labeled I. Note that player II cannot play in one move from a node labeled I to one labeled II. Similarly, if the initial node is labeled II, then player II has a winning strategy; and so the game is determined, as desired.
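For a set-sized well-founded game tree, this labeling recursion is ordinary backward induction, which can be sketched directly; the encoding and names here are my own:

```python
# Backward induction on a (finite) well-founded game tree, labeling each
# node with the winner, computed from the labels of the children.

def label_tree(children, winner_at_leaf, to_play, node):
    """With player I to move, a node is labeled I iff some move leads to
    a node labeled I; dually for II.  Terminal nodes carry their winner."""
    kids = children(node)
    if not kids:
        return winner_at_leaf(node)
    labels = [label_tree(children, winner_at_leaf, to_play, k) for k in kids]
    mover = to_play(node)
    if mover in labels:
        return mover               # the mover can reach a node won for them
    return 'II' if mover == 'I' else 'I'

# A tiny clopen game: two moves from {0, 1}, player I moving first,
# and I wins iff the two moves are equal.
children = lambda pos: [pos + (m,) for m in (0, 1)] if len(pos) < 2 else []
leaf = lambda pos: 'I' if pos[0] == pos[1] else 'II'
to_play = lambda pos: 'I' if len(pos) % 2 == 0 else 'II'

# Player II moves second and can always mismatch, so the root is labeled II,
# and staying on II-labeled nodes is her winning strategy:
assert label_tree(children, leaf, to_play, ()) == 'II'
```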

($1\to 2$) Conversely, let us assume the principle of clopen determinacy for class games. Suppose we are faced with a recursion along a class relation $\lhd$ on a class $A$, using a recursion rule $\varphi(F,a,y)$. We shall define a certain clopen game, and prove that any winning strategy for this game will produce a solution for the recursion.

It will be convenient to assume that $\varphi(F,a,y)$ is strongly functional, meaning that not only does it define a function in $V$, as we have mentioned, but also that $\varphi(F,a,y)$ defines a function $(F,a)\mapsto y$ when used over any model $\langle V_\theta,\in,F\rangle$ for any class $F\subset V_\theta$. The strongly functional property can be achieved simply by replacing the formula with the assertion that $\varphi(F,a,y)$ holds and $y$ is the unique value for which it holds, or else there is no such unique value and $y=\emptyset$.

At first, let us consider a slightly easier game, which will be open rather than clopen; a bit later, we shall revise this game to a clopen game. The game is the recursion game, which will be very much like the truth-telling game of my previous post, Open determinacy for proper class games implies Con(ZFC) and much more. Namely, we have two players, the challenger and the truth-teller. The challenger will issue challenges about truth in a structure $\langle V,\in,\lhd,F\rangle$, where $\lhd$ is the well-founded class relation and $F$ is a class function, not yet specified. Specifically, the challenger is allowed to ask about the truth of any formula $\varphi(\vec a)$ in this structure, and to inquire as to the value of $F(a)$ for any particular $a$. The truth-teller, as before, will answer the challenges by declaring either that $\varphi(\vec a)$ is true or that it is false, and in the case $\varphi(\vec a)=\exists x\,\psi(x,\vec a)$ and the formula was declared true, by also giving a witness $b$ and declaring $\psi(b,\vec a)$ true; and the truth-teller must commit to a specific value of $F(a)$ for any particular $a$. The truth-teller loses immediately, if she should ever violate Tarski’s recursive definition of truth; and she also loses unless she declares the instances of the recursive rule $\varphi(F\upharpoonright a,a,F(a))$ to be true. Since these violations occur at a finite stage of play if they do at all, the game is open for the challenger.

Lemma. The challenger has no winning strategy in the recursion game.

Proof. Suppose that $\sigma$ is a strategy for the challenger. So $\sigma$ is a class function that instructs the challenger how to play next, given a position of partial play. By the reflection theorem, there is an ordinal $\theta$ such that $V_\theta$ is closed under $\sigma$, and using the satisfaction class that comes from clopen determinacy, we may actually also arrange that $\langle V_\theta,\in,\lhd\cap V_\theta,\sigma\cap V_\theta\rangle\prec\langle V,\in,\lhd,\sigma\rangle$. Consider the relation $\lhd\cap V_\theta$, which is a well-founded relation on $A\cap V_\theta$. The important point is that this relation is now a set, and in GBC we may certainly undertake transfinite recursions along well-founded set relations. Thus, there is a function $f:A\cap V_\theta\to V_\theta$ such that $\langle V_\theta,\in,f\rangle$ satisfies $\varphi(f\upharpoonright a,a,f(a))$ for all $a\in A\cap V_\theta$, where $f\upharpoonright a$ means restricting $f$ to the predecessors of $a$ in $V_\theta$, and this may not be all the predecessors of $a$ with respect to $\lhd$, which may not be set-like. Note that this is the place where we use our assumption that $\varphi$ was strongly functional, since we want to ensure that it can still be used to define a valid recursion over $\lhd\cap V_\theta$. (We are not claiming that $\langle V_\theta,\in,\lhd\cap V_\theta,f\rangle$ models $\text{ZFC}(\lhd,f)$.)

Consider now the play of the recursion game in $V$, where the challenger uses the strategy $\sigma$ and the truth-teller plays in accordance with $\langle V_\theta,\in,\lhd\cap V_\theta,f\rangle$. Since $V_\theta$ was closed under $\sigma$, the challenger will never issue challenges outside of $V_\theta$. And since the function $f$ fulfills the recursion $\varphi(f\upharpoonright a,a,f(a))$ in this structure, the truth-teller will not be trapped in any violation of the Tarski conditions or the recursion condition. Thus, the truth-teller will win this instance of the game, and so $\sigma$ was not a winning strategy for the challenger, as desired. QED

Lemma. The truth-teller has a winning strategy in the recursion game if and only if there is a solution of the recursion.

Proof. If there is a solution $F$ of the recursion, then by clopen determinacy, we also get a satisfaction class for the structure $\langle V,\in,\lhd,F\rangle$, and the truth-teller can answer all queries of the challenger by referring to what is actually true in this structure. This will be winning for the truth-teller, since the actual truth obeys the Tarskian conditions and the recursive rule.

Conversely, suppose that $\tau$ is a winning strategy for the truth-teller in the recursion game. We claim that the truth assertions made by $\tau$ do not depend on the order in which challenges are made by the challenger; they all cohere with one another. This is easy to see for formulas not involving $F$ by induction on formulas, for if the truth of a formula $\psi(\vec a)$ is independent of play, then also the truth of $\neg\psi(\vec a)$ is as well, and similarly if $\exists x\psi(x,\vec a)$ is declared true with witness $\psi(b,\vec a)$, then by induction $\psi(b,\vec a)$ is independent of the play, in which case $\exists x\psi(x,\vec a)$ must always be declared true by $\tau$ independently of the order of play by the challenger (although the particular witness $b$ provided by $\tau$ may depend on the play). Now, let us also argue that the values of $F(a)$ declared by $\tau$ are also independent of the order of play. If not, there is some $\lhd$-least $a$ where this fails. (Note that such an $a$ exists, since $\tau$ is a class, and we can define from $\tau$ the class of $a$ for which the value of $F(a)$ declared by $\tau$ depends on the order of play; without $\tau$, one might have expected to need $\Pi^1_1$-comprehension to find a minimal $a$ where the recursion fails.) As in the truth-telling game, the truth assertions made by $\tau$ about $\langle V,\in,\lhd,F\upharpoonright a\rangle$, where $F\upharpoonright a$ is the class function of values that are determined by $\tau$ on $b\lhd a$, must not depend on the order of play. Since the recursion rule $\varphi(F\upharpoonright a,a,y)$ is functional, there is only one value $y=F(a)$ for which this formula can be truthfully held, and so if some play causes $\tau$ to play a different value for $F(a)$, the challenger can in finitely many additional moves (bounded by the syntactic complexity of $\varphi$) trap the truth-teller in a violation of the Tarskian conditions or the recursion condition. 
Thus, the values of $F(a)$ declared by $\tau$ must in fact all cohere independently of the order of play, and so $\tau$ is describing a class function $F:A\to V$ such that $\varphi(F\upharpoonright a,a,F(a))$ is true for every $a\in A$. So the recursion has a solution, as desired. QED

So far, we have established that the principle of open determinacy implies the principle of transfinite recursion along well-founded class relations. In order to improve this implication to use only clopen determinacy rather than open determinacy, we modify the game to become a clopen game rather than an open game.

Consider the clopen form of the recursion game, where we insist also that the challenger announce on the first move a natural number $n$, such that the challenger loses if the truth-teller survives for at least $n$ moves. This is now a clopen game, since the winner will be known by that time, either because the truth-teller will violate the Tarski conditions or the recursion condition, or else the challenger’s limit on play will expire.

Since the modified version of the game is even harder for the challenger, there can still be no winning strategy for the challenger. So by the principle of clopen determinacy, there is a winning strategy $\tau$ for the truth-teller. This strategy is allowed to make decisions based on the number $n$ announced by the challenger on the first move, and it will no longer necessarily be the case that the theory declared true by $\tau$ will be independent of the order of play. Nevertheless, it will be the case, we claim, that the theory declared true by $\tau$ for all plays with sufficiently large $n$ (and with sufficiently many remaining moves) will be independent of the order of play. One can see this by observing that if an assertion $\psi(\vec a)$ is independent in this sense, then also $\neg\psi(\vec a)$ will be independent in this sense, for otherwise there would be plays with large $n$ giving different answers for $\neg\psi(\vec a)$ and we could then challenge with $\psi(\vec a)$, which would have to give different answers or else $\tau$ would not win. Similarly, since $\tau$ is winning, one can see that allowing the challenger to specify a bound on the total length of play does not prevent the arguments above showing that $\tau$ describes a coherent solution function $F:A\to V$ satisfying the recursion $\varphi(F\upharpoonright a,a,F(a))$, provided that one looks only at plays in which there are sufficiently many moves remaining. There cannot be a $\lhd$-least $a$ where the value of $F(a)$ is not determined in this sense, and so on as before.

Thus, we have proved that the principle of clopen determinacy for class games is equivalent to the principle of transfinite recursion along well-founded class relations. QED

The material in this post will become part of a joint project with Victoria Gitman and Thomas Johnstone. We are currently investigating several further related issues.

# Open determinacy for proper class games implies Con(ZFC) and much more

$\newcommand\Tr{\text{Tr}}$One of the intriguing lessons we have learned in the past half-century of set-theoretic developments is that there is a surprisingly robust connection between infinitary game theory and fundamental set-theoretic principles, including large cardinals. Assertions about the existence of strategies in infinite games often turn out to have an unexpected set-theoretic power. In this post, I should like to give another specific example of this, which Thomas Johnstone and I hit upon yesterday in an enjoyable day of mathematics.

Specifically, I’d like to prove that if we generalize the open-game concept from sets to classes, then assuming consistency, ZFC cannot prove that every definable open class game is determined, and indeed, over Gödel-Bernays set theory GBC the principle of open determinacy (and even just clopen determinacy) implies Con(ZFC) and much more.

To review a little, we are talking about games of perfect information, where two players alternately play elements from an allowed space $X$ of possible moves, and together they build an infinite sequence $\vec x=\langle x_0,x_1,x_2,\ldots\rangle$ in $X^\omega$, which is the resulting play of this particular instance of the game. We have a fixed collection of plays $A\subset X^\omega$ that is used to determine the winner, namely, the first player wins this particular instance of the game if the resulting play $\vec x$ is in $A$, and otherwise the second player wins. A strategy for a player is a function $\sigma:X^{<\omega}\to X$, which tells a player how to move next, given a finite position in the game. Such a strategy is winning for that player, if he or she always wins by following the instructions, no matter how the opponent plays. The game is determined, if one of the players has a winning strategy.

It is a remarkable but elementary fact that if the winning condition $A$ is an open set, then the game is determined. One can prove this by using the theory of ordinal game values, and my article on transfinite game values in infinite chess contains an accessible introduction to the theory of game values. Basically, one defines that a position has game value zero (for player I, say), if the game has already been won at that stage, in the sense that every extension of that position is in the winning payoff set $A$. A position with player I to play has value $\alpha+1$, if player I can move to a position with value $\alpha$, and $\alpha$ is minimal. The value of a position with player II to play is the supremum of the values of all the positions that he or she might reach in one move, provided that those positions have a value. The point now is that if a position has a value, then player I can play so as strictly to decrease the value, and player II cannot play so as to increase it. So if a position has a value, then player I has a winning strategy, which is the value-reducing strategy. Conversely, if a position does not have a value, then player II can maintain that fact, and player I cannot play so as to give it a value; thus, in this case player II has a winning strategy, the value-maintaining strategy. Thus, we have proved the Gale-Stewart theorem: every open game is determined.
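On a finite game tree every value is a natural number, and the recursion defining the values can be carried out directly. This is an illustrative finite fragment of the ordinal theory, with names of my own:

```python
# Game values over a finite tree, where all values are natural numbers.
# Value 0: player I has already won; with I to play, the value is one more
# than the least child value; with II to play, it is the sup of the child
# values, provided every child has a value.

def game_value(children, already_won, to_play, pos):
    """The value of pos for player I, or None if the position is valueless."""
    if already_won(pos):
        return 0                     # the game is already won for player I
    kids = children(pos)
    if not kids:
        return None                  # play ended without player I winning
    vals = [game_value(children, already_won, to_play, k) for k in kids]
    if to_play(pos) == 'I':
        defined = [v for v in vals if v is not None]
        # player I moves to a position of minimal value, adding one
        return min(defined) + 1 if defined else None
    # with II to play, the value is the sup, when all moves have values
    return max(vals) if None not in vals else None

# Two moves from {0, 1}, player I first; I has won once a 1 has been played.
children = lambda pos: [pos + (m,) for m in (0, 1)] if len(pos) < 2 else []
won = lambda pos: 1 in pos
to_play = lambda pos: 'I' if len(pos) % 2 == 0 else 'II'

assert game_value(children, won, to_play, ()) == 1       # I wins by playing 1
assert game_value(children, won, to_play, (0,)) is None  # II keeps it valueless
```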

That proof relied on the space of moves $X$ being a set, since we took a supremum over the values of the possible moves, and if $X$ were a proper class, we couldn’t be sure to stay within the class of ordinals and the recursive procedure might break down. What I’d like to do is to consider more seriously the case where $X$ is a proper class. Similarly, we allow the payoff collection $A$ to be a proper class, and the strategies $\sigma:X^{<\omega}\to X$ are also proper classes. Can we still prove the Gale-Stewart theorem for proper classes? The answer is no, unless we add set-theoretic strength. Indeed, even clopen determinacy has set-theoretic strength.

Theorem. (GBC) Clopen determinacy for proper classes implies Con(ZFC) and much more. Specifically, there is a clopen game, such that the existence of a winning strategy is equivalent to the existence of a satisfaction class for first-order truth.

Proof. Let me first describe a certain open game, the truth-telling game, with those features, and I shall later modify it to a clopen game. The truth-telling game will have two players, which I call the challenger and the truth-teller. At any point in the game, the challenger plays by making an inquiry about a particular set-theoretic formula $\varphi(\vec a)$ with parameters. The truth-teller must reply to the inquiry by stating either true or false, and in the case that the formula $\varphi$ is an existential assertion $\exists x\,\psi(x,\vec a)$ declared to be true, then the truth-teller must additionally identify a particular witness $b$ and assert that $\psi(b,\vec a)$ is true. So a play of the game consists of a sequence of such inquiries and replies.

The truth-teller wins a play of the game, provided that she never violates the recursive Tarskian truth conditions. Thus, faced with an atomic formula, she must state true or false in accordance with the actual truth or falsity of that atomic formula, and similarly,
she must say true to $\varphi\wedge\psi$ just in case she said true to both $\varphi$ and $\psi$ separately (if those formulas had been issued by the challenger), and she must state opposite truth values for $\varphi$ and $\neg\varphi$, if both are issued as challenges.
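To illustrate concretely what the Tarskian conditions demand, here is a toy satisfaction predicate over a small finite structure (a Python sketch; the formula encoding and all names are ad hoc conventions of mine, not anything from the post): atomic truth comes from the structure itself, negations and conjunctions are computed recursively, and a true existential must be witnessed by some element.

```python
M = range(4)            # domain of a small structure

def rel(a, b):          # one binary relation on M, here a < b
    return a < b

def sat(phi):
    """Recursive Tarskian truth over the structure (M, rel)."""
    op = phi[0]
    if op == 'rel':     # atomic: truth is decided by the structure itself
        return rel(phi[1], phi[2])
    if op == 'not':     # opposite truth values for phi and not-phi
        return not sat(phi[1])
    if op == 'and':     # true just in case both conjuncts are true
        return sat(phi[1]) and sat(phi[2])
    if op == 'exists':  # a true existential must have a witness b
        return any(sat(phi[1](b)) for b in M)

# exists x (0 < x and not (x < 3)) — true, with witness x = 3
print(sat(('exists', lambda x: ('and', ('rel', 0, x), ('not', ('rel', x, 3))))))
```

The truth-teller's obligations in the game are exactly that her answers cohere with such a recursion, even though in $V$ no first-order definable `sat` exists.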

This is an open game, since the challenger will win, if at all, at a finite stage of play, when the violation of the Tarskian truth conditions is first exhibited.

Lemma 1. The truth-teller has a winning strategy in the truth-telling game if and only if there is a satisfaction class for first-order truth.

Proof. Clearly, if there is a satisfaction class for first-order truth, then the truth-teller has a winning strategy, which is simply to answer all questions about truth by consulting the
satisfaction class. Since that class obeys the Tarskian conditions, she will win the game, no matter which challenges are issued.

Conversely, suppose that the truth-teller has a winning strategy $\tau$ in the game. I claim that we may use $\tau$ to build a satisfaction class for first-order truth. Specifically, let $T$ be the collection of formulas $\varphi(\vec a)$ that are asserted to be true by $\tau$ in any play according to $\tau$. I claim that $T$ is a satisfaction class. We may begin by noting that since $\tau$ must correctly state the truth of all atomic formulas, the particular answers that $\tau$ gives on the atomic formulas do not depend on the order of the challenges issued by the challenger. Now, we argue by induction on formulas that the truth values issued by $\tau$ do not depend on the order of the challenges. For example, if all plays in which $\varphi(\vec a)$ is issued as a challenge come out true, then all plays in which $\neg\varphi(\vec a)$ is challenged will result in false, or else we would have a play in which $\tau$ violates the Tarskian truth conditions. Similarly, if $\varphi$ and $\psi$ always come out the same way, then so does $\varphi\wedge\psi$. We don’t claim that $\tau$ must always issue the same witness $b$ for an existential $\exists x\,\psi(x,\vec a)$, but if it ever says true to this statement, then it will provide some witness $b$, and for that statement $\psi(b,\vec a)$, the truth value stated by $\tau$ is independent of the order of play by the challenger, by induction. Thus, by induction on formulas, the answers provided by the truth-teller strategy $\tau$ give us a satisfaction predicate for first-order truth. QED

Lemma 2. The challenger has no winning strategy in the truth-telling game.

Proof. Suppose that $F$ is a strategy for the challenger. So $F$ is a proper class function that directs the challenger to issue certain challenges, given the finite sequence of previous challenges and truth-telling answers. By the reflection theorem, there is a closed unbounded proper class of cardinals $\theta$, such that $F”V_\theta\subset V_\theta$. That is, $V_\theta$ is closed under $F$, in the sense that if all previous challenges and responses come from $V_\theta$, then the next challenge will also come from $V_\theta$. Since $\langle V_\theta,\in\rangle$ is a set, we have a satisfaction predicate on it. Consider the play where the truth-teller replies to all inquiries by consulting truth in $V_\theta$, rather than truth in $V$. The point is that if the challenger follows $F$, then all the inquiries will involve only parameters $\vec a$ in $V_\theta$, provided that the truth-teller also always gives witnesses in $V_\theta$, which in this particular play will be the case. Since the satisfaction predicate on $V_\theta$ does satisfy the Tarskian truth conditions, it follows that the truth-teller will win this instance of the game, and so $F$ is not a winning strategy for the challenger. QED

Thus, if open determinacy holds for classes, then there is a satisfaction predicate for first-order truth.

This implies Con(ZFC) for reasons I explained on my post KM implies Con(ZFC) and much more, by appealing to the fact that we have the collection axiom relative to the class for the satisfaction predicate itself, and this is enough to verify that the nonstandard instances of collection also must be declared true in the satisfaction predicate.

But so far, we only have an open game, rather than a clopen game, since the truth-teller wins only by playing the game out for infinitely many steps. So let me describe how to modify the game to be clopen. Specifically, consider the version of the truth-telling game where the challenger must also announce on each move a specific ordinal $\alpha_n$, and these ordinals must descend during play: $\alpha_0>\alpha_1>\cdots>\alpha_n$. If the ordinals reach $0$, then the truth-teller is declared the winner. In this modified game, the winner is known in finitely many moves, because either the truth-teller violates the Tarskian conditions or the ordinals hit zero. So this is a clopen game. Since we made the game harder for the challenger, it follows that the challenger still has no winning strategy. One can modify the proof of lemma 1 to say that if $\tau$ is a winning strategy for the truth-teller, then the truth assertions made by $\tau$ in response to all plays with sufficiently large ordinals announced by the challenger agree with one another, independently of the order of the formulas issued by the challenger. Thus, there is a winning truth-telling strategy just in case there is a satisfaction class for first-order truth.

So clopen determinacy for class games implies the existence of a satisfaction class for first-order truth, and this implies Con(ZFC) and much more. QED

One may easily modify the game by allowing a fixed class parameter $B$, so that clopen determinacy implies that there is a satisfaction class relative to truth in $\langle V,\in,B\rangle$.

Furthermore, we may also get iterated truth predicates. Specifically, consider the iterated truth-telling game, in which, in addition to the usual language of set theory, we have a hierarchy of predicates $\text{Tr}_\alpha$ for every ordinal $\alpha$. We now allow the challenger to ask about formulas in this expanded language, and the truth-teller is required to obey not only the usual Tarskian recursive truth conditions, but also the requirement that $\text{Tr}_\alpha(\varphi(\vec a))$ is declared true just in case $\varphi(\vec a)$ uses only truth predicates $\text{Tr}_\beta$ for $\beta<\alpha$ and $\varphi(\vec a)$ is itself declared true (if this challenge was issued).

The main arguments above generalize easily to show that the challenger cannot have a winning strategy in this iterated truth-telling game, and the truth-teller has a winning strategy just in case there is a satisfaction predicate for truth-about-truth iterated through the ordinals.  Thus, the principle of open determinacy for proper class games implies Con(Con(ZFC)) and $\text{Con}^\alpha(\text{ZFC})$ and so on.

Let me finish by mentioning that Kelley-Morse set theory is able to prove open determinacy for proper class games in much the same manner as we proved the Gale-Stewart theorem above, using well-ordered class meta-ordinals, rather than merely set ordinals, as well as in other ways. If there is interest, I can make a further post about that, so just ask in the comments!

# Transfinite Nim

Shall we have a game of transfinite Nim? One of us sets up finitely many piles of wooden blocks, each pile having some ordinal height, possibly transfinite, and the other of us decides who shall make the first move. Taking turns, we each successively remove a top part of any one pile of our choosing, making it strictly shorter. Whoever takes the very last block wins. (It is fine to remove an entire pile on a turn or to remove blocks from a different pile on a later turn.)

In my challenge problem last week, for example, I set up six piles with heights:
$$1\qquad \omega+3\qquad \omega^\omega+5 \qquad \omega^{\omega+3}+\omega^\omega\cdot3+\omega\cdot 5+7\qquad \epsilon_0\qquad \omega_1$$Would you want to go first or second? What is the best move? In general, we can start with any finite number of piles of arbitrary ordinal heights — what is the general winning strategy?

Before proceeding with the transfinite case, however, let’s review the winning strategy in ordinary finite Nim, which I explained in my post last week concerning my visit to the 7th/8th grade Math Team at my son’s school. To say it quickly again, a finite Nim position is balanced, if when you consider the binary representations of the pile heights, there are an even number of ones in each binary place position. Another way to say this, and this is how I explained it to the school kids, is that if you think of each pile height as a sum of distinct powers of two, then any power of two that arises in any pile does so an even number of times overall for all the piles. The mathematical facts to establish are that (1) any move on a balanced position will unbalance it; and (2) any unbalanced position admits a balancing move. Since the winning move of taking the very last block is a balancing move, it follows that the winning strategy is to balance whatever position with which you are faced. At the start, if the position is unbalanced, then you should go first and balance it; if it is already balanced, then you should go second and adopt the balancing strategy. It may be interesting to note that this winning strategy is unique in the sense that any move that does not balance the position is a losing move, since the opposing player can adopt the balancing strategy from that point on. But of course there is often a choice of balancing moves.
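In code, the balance test is conveniently expressed with bitwise XOR: the pile heights have an even number of ones in each binary place exactly when their XOR (the so-called nim-sum) is zero. Here is a minimal Python sketch, with function names of my own choosing; the balancing move replaces some pile of height $h$ by $h\oplus s$, where $s$ is the XOR of all the heights and the chosen pile contains the largest unbalanced power of two.

```python
from functools import reduce
from operator import xor

def is_balanced(piles):
    """Balanced iff every binary place has an even number of 1s across
    the piles, i.e. iff the XOR (nim-sum) of the heights is zero."""
    return reduce(xor, piles, 0) == 0

def balancing_move(piles):
    """Return (pile index, new height) of a balancing move, or None if
    the position is already balanced (so that every move loses)."""
    s = reduce(xor, piles, 0)
    if s == 0:
        return None
    for i, h in enumerate(piles):
        if h ^ s < h:   # this pile contains the largest unbalanced power of two
            return i, h ^ s

print(is_balanced([1, 2, 3]))      # the 1+2+3 position is balanced
print(balancing_move([3, 4, 5]))   # a balancing move on an unbalanced position
```

For instance, $1+2+3$ comes out balanced, so one should go second on it, while on $3+4+5$ the balancing move shrinks the $3$-pile to height $1$.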

Does this balancing strategy idea continue to apply to transfinite Nim? Yes! All we need to do is to develop a little of the theory of transfinite binary representation. Let me assume that you are all familiar with the usual ordinal arithmetic, for which $\alpha+\beta$ is the ordinal whose order type is isomorphic to a copy of $\alpha$ followed by a copy of $\beta$, and $\alpha\cdot\beta$ is the ordinal whose order type is isomorphic to $\beta$ many copies of $\alpha$. Consider now ordinal exponentiation, which can be defined recursively as follows:
$$\alpha^0=1$$ $$\alpha^{\beta+1}=\alpha^\beta\cdot\alpha$$ $$\alpha^\lambda=\sup_{\beta<\lambda} \alpha^\beta\qquad\lambda\text{ limit}$$ It turns out that $\alpha^\beta$ is the order-type of the finite-support functions from $\beta$ to $\alpha$, under the suitable lexical order. Ordinal exponentiation should not be confused with cardinal exponentiation, since they are very different. For example, with ordinal exponentiation, one has $$2^\omega=\sup_{n<\omega}2^n=\omega,$$which of course is not the case with cardinal exponentiation. In this post, I use only ordinal exponentiation.

Theorem. Every ordinal $\beta$ has a unique representation as a decreasing finite sum of ordinal powers of two. $$\beta=2^{\beta_n}+\cdots+2^{\beta_0}, \qquad \beta_n>\cdots>\beta_0$$

The proof is easy! We simply prove it by transfinite induction on $\beta$. If the theorem holds below an ordinal $\beta$, first let $2^\alpha$ be the largest power of two that is at most $\beta$, so that $\beta=2^\alpha+\gamma$ for some ordinal $\gamma$. It follows that $\gamma<2^\alpha$, for otherwise $\beta\geq 2^\alpha+2^\alpha=2^{\alpha+1}$, contrary to the choice of $2^\alpha$. Thus, by induction, $\gamma$ has a representation with powers of two, and so we may simply add $2^\alpha$ at the front to represent $\beta$. To see that the representations are unique, first establish that any power of two is the supremum of the finite decreasing sums of any strictly smaller powers of two. From this, it follows that any representation of $\beta$ as above must have used $2^\alpha$ just as we did for the first term, because otherwise it couldn’t be large enough, and then the representation of the remaining part $\gamma$ is unique by induction, and so we get uniqueness for the representation of $\beta$. QED
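The greedy step of this argument is easy to try out on the finite ordinals, that is, on the natural numbers. A quick Python sketch (names my own): repeatedly peel off the largest power of two that fits, and the exponents come out strictly decreasing, just as the theorem asserts.

```python
def powers_of_two(beta):
    """Exponents in the decreasing powers-of-two representation of a
    natural number, found greedily as in the induction: peel off the
    largest power 2**a <= beta and recurse on the remainder gamma."""
    rep = []
    while beta > 0:
        a = beta.bit_length() - 1   # largest a with 2**a <= beta
        rep.append(a)
        beta -= 2 ** a              # the remainder gamma is below 2**a
    return rep

print(powers_of_two(88))   # → [6, 4, 3], since 88 = 2^6 + 2^4 + 2^3
```

The transfinite proof proceeds in exactly this way, except that the exponents are now ordinals and the induction is transfinite.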

Thus, the theorem shows that every ordinal has a unique binary representation in the ordinals, with finitely many nonzero bits. Suppose that we are given a position in transfinite Nim with piles of ordinal heights $\eta_0,\ldots,\eta_n$. We define that such a position is balanced, if every power of two appearing in the representation of any of the piles appears an even number of times overall for all the piles.

The mathematical facts to establish are (1) any move on a balanced position will unbalance it; and (2) every unbalanced position has a balancing move. These facts can be proved in the transfinite case in essentially the same manner as the finite case. Namely, if a position is balanced, then any move affects only one pile, changing the ordinal powers of two that appear in it, thereby destroying the balanced parity of whichever powers of two are affected. And if a position is unbalanced, then look at the largest unbalanced ordinal power of two appearing, and make a move on any pile having such a power of two in its representation, reducing it so as exactly to balance all the smaller powers of two appearing in the position.
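As a quick sanity check, one can compute the balance condition for ordinals below $\omega^\omega$ (a sketch; the coefficient coding and the names are my own). Since $2^{\omega\cdot i+j}=(2^\omega)^i\cdot 2^j=\omega^i\cdot 2^j$ with ordinal exponentiation, the power $2^{\omega\cdot i+j}$ occurs in a pile exactly when bit $j$ of its $\omega^i$-coefficient is set, so balance amounts to a separate even-parity check, that is, an XOR, at each degree.

```python
from itertools import zip_longest

def is_balanced(piles):
    """Each pile is an ordinal below omega^omega, coded little-endian
    by its coefficients [c0, c1, c2, ...], meaning c0 + w*c1 + w^2*c2 + ...
    Balanced iff the coefficients XOR to zero degree by degree."""
    for coeffs in zip_longest(*piles, fillvalue=0):
        s = 0
        for c in coeffs:
            s ^= c      # parity of each power 2^(w*i+j) across the piles
        if s != 0:
            return False
    return True

# the piles w+3, w*2+1 and w*3+2 balance one another
print(is_balanced([[3, 1], [1, 2], [2, 3]]))
```

So below $\omega^\omega$, transfinite Nim is in effect several independent finite Nim games, one for each degree.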

Finally, those two facts again imply that the balancing strategy is a winning strategy, since the winning move of taking the last block or blocks is a balancing move, down to the all-zero position, which is balanced.

In the case of my challenge problem above, we may represent the ordinals in binary. We know how to do that in the case of 1, 3, 5 and 7, and actually those numbers are balanced. Here are some other useful binary representations:

$\omega+3=2^\omega+2+1$

$\omega^\omega+5 = (2^\omega)^\omega+5=2^{\omega^2}+4+1$

$\omega^{\omega+3}=(2^\omega)^{\omega+3}=2^{\omega^2+\omega\cdot 3}$

$\omega^\omega\cdot3=(2^\omega)^\omega\cdot 3=2^{\omega^2}\cdot 2+2^{\omega^2}=2^{\omega^2+1}+2^{\omega^2}$

$\omega\cdot 5+7 =2^{\omega}\cdot 2^2+2^\omega+7=2^{\omega+2}+2^\omega+4+2+1$

$\epsilon_0 = 2^{\epsilon_0}$

$\omega_1=2^{\omega_1}$

I emphasize again that this is ordinal exponentiation. The Nim position of the challenge problem above is easily seen to be unbalanced in several ways. For example, the $\omega_1$ term among others appears only once. Thus, we definitely want to go first in this position. And since $\omega_1$ is the largest unbalanced power of two and it appears only once, we know that we must play on the $\omega_1$ pile. Once one represents all the ordinals in terms of their powers of two representation, one sees that the unique winning move is to reduce the $\omega_1$ pile to have ordinal height
$$\epsilon_0+\omega^{\omega+3}+\omega^\omega\cdot 2+\omega\cdot 4.$$This will exactly balance all the smaller powers of two in the other piles and therefore leaves a balanced position overall. In general, the winning strategy in transfinite Nim, just as for finite Nim, is always to leave a balanced position.

Special honors to Pedro Sánchez Terraf for being the only one to post the winning move in the comments on the other post!

# Win at Nim! The secret mathematical strategy for kids (with challenge problems in transfinite Nim for the rest of us)

Welcome to my latest instance of Math for Kids!

Today I had the pleasure to make an interactive mathematical presentation at my son’s school to the 7th / 8th grade Math Team, about 30 math-enthusiastic kids (twelve and thirteen years old) along with their math teachers and the chair of the school math department.

The topic was the game of Nim! This game has a secret mathematical strategy enabling anyone with that secret knowledge to win against those without it. It is a great game for kids, because with the strategy they can realistically expect to beat their parents, friends, siblings and parents’ friends almost every single time!

To play Nim, one player sets up a number of piles of blocks, and the opponent chooses whether to go first or second. The players take turns removing blocks — each player may remove any number of blocks (at least one) from any one pile, and it is fine to take a whole pile — whichever player takes the last block wins.

For the math team, we played a few demonstration games, in which I was able to beat all the brave challengers, and then the kids paired off to play each other and gain familiarity with the game. Then, it was time for the first strategy discussion.

What could the secret winning strategy be? I explained to the kids a trick that mathematicians often use when approaching a difficult problem, namely, to consider in detail some very simple special cases or boundary instances of the problem. It often happens that these special cases reveal a way of thinking about the problem that applies much more generally.

Perhaps one of the easiest special cases of Nim occurs when there is only one pile. If there is only one pile, then clearly one wants to go first, in order to make the winning move: take the entire pile!

Two balanced piles

A slightly less trivial and probably more informative case arises when there are exactly two piles. If the stacks have the same height, then the kids realized that the second player could make copying moves so as to preserve this balanced situation. The key insight now is that this copying strategy is a winning strategy, because if one can always copy, then in particular one will have a move whenever the opponent did, and so the opponent will never take the last block. With two piles, therefore, one wants always to make them balanced. If they are initially unbalanced, then choose to go first and follow the balancing strategy. If they are initially balanced, then choose to go second, and copy whatever moves your opponent makes to rebalance them.

A balanced position

With that insight, it is not difficult to see that it is winning to leave a position with any number of pairs of balanced piles. One can in effect play on each pair separately, because whenever the opponent makes a move on one of the piles, one can copy the move with the corresponding partner pile. In this way, we may count such a position overall as balanced. The more fundamental game-theoretic observation to make is that balanced piles in effect cancel each other out in any position, and one can ignore them when analyzing a position. When two balanced piles are present in a possibly more complicated position, one can pretend that they aren’t there, precisely because whenever your opponent plays on one of them, you can copy the move on the other, and so any winning strategy for the position in which those piles are absent can be converted into a winning strategy in which the balanced piles are present.

This idea now provides a complete winning strategy in the case that all piles have height one or two at most. One wants to leave a position with an even number of piles of each height. If only one height has an odd number of piles, then take a whole pile of that height. And if there are odd numbers of piles both of height one and two, then turn a height-two pile into a pile of height one, and this will make them both even. So any unbalanced position can be balanced, and any move on a balanced position will unbalance it.

1+2+3 counts as balanced

Let’s now consider that there may be piles of height three. For example, consider the basic position with piles of height one, two and three. The observation to make here is that any move on this position can be replied to with a move that leaves it balanced (check it yourself to be sure!). It follows that this position is winning to leave for the other player (and so one should go second on $1+2+3$). It would be nice if we could consider this position itself as already balanced in some sense. Indeed, we may incorporate this situation into the balancing idea if we think of the pile of height three as really consisting of two subpiles, one of height two and one of height one. In this way, the Nim position 1+2+3 counts as balanced, since the 3 counts as 2+1, which balances the other stacks.  The 1+2+3 position has two stacks of height two and two of height one, when one regards the stack of height three as having a substack of height two and a substack of height one.

This way of thinking produces a complete winning strategy for Nim positions involving piles of height at most three. (And this is a strategy that can be mastered even by very young children — a few years ago I had talked about Nim with much younger children, Math for six-year-olds: Win at Nim!, first-graders at my daughter’s school, and at that time we concentrated on positions with piles of height at most three. Older kids, however, can handle the full strategy.) Namely, the winning strategy in this case is to strive to balance the position, to make an even number overall of piles of height one and two, where we count piles of height three as one each of one and two. If you always give your opponent a balanced position, then you will win! Faced with an unbalanced position, it is a fact that you can always find a balancing move, and any move on a balanced position will unbalance it. If the game is just starting, and you are deciding whether to go first or second, you should determine whether the position is balanced yet or not. If it is unbalanced, then you should go first and make the balancing move; if it is already balanced, then you should go second and adopt the copying strategy, in which you re-balance the position with each move.

The general winning strategy, of course, goes beyond three. The key idea is to realize that what is really going on when we represent $3$ as $2+1$ is that we are using the binary representation of the number $3$. To explain, I wrote the following numbers on the chalkboard $$1,\ 2,\ 4,\ 8,\ 16,\ 32,\ 64,\ \cdots$$ and was very pleased when the kids immediately shouted out, “The powers of two!” I explained that any natural number can be expressed uniquely as a sum of distinct powers of two. Asked for a favorite number less than one hundred, one student suggested $88$, and together we calculated $$88=64+16+8,$$ which means that the binary representation of $88$ is $1011000$, which I read off as, “one $64$, no $32$s, one $16$, one $8$, no $4$s, no $2$s and no $1$s.” This is just the same as thinking of $9572$ as 9 thousands, 5 hundreds, 7 tens and 2 ones, using the powers of ten. It is interesting to learn that one may easily count very high on one’s fingers using binary, up to $31$ on one hand and $1023$ on two hands!

The general strategy is to view every Nim pile as consisting of subpiles whose height is a power of two, and to make sure that one leaves a position that is balanced in the sense that every power of two has an even number of such instances in the position. So we think of $3$ as really $2+1$ for the purposes of balancing; $4$ counts as itself because it is a power of two, but $5$ counts as $4+1$ and $6$ counts as $4+2$ and $7$ as $4+2+1$. Another way to describe the strategy is that we express all the pile heights in binary, and we want an even number of $1$s in each binary place position.
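For those who want to check positions quickly, the counting version of the balance test, just as described above, is easy to program (a Python sketch with my own naming): count how many piles contain each power of two as a subpile, and require every count to be even.

```python
def balanced(piles):
    """Count, for each power of two, how many piles contain it as a
    subpile; the position is balanced when every count is even."""
    counts = {}
    for h in piles:
        k = 0
        while h:
            if h & 1:                            # the subpile 2**k is present
                counts[k] = counts.get(k, 0) + 1
            h >>= 1
            k += 1
    return all(c % 2 == 0 for c in counts.values())

print(balanced([1, 2, 3]))      # True: the 3 counts as 2 + 1
print(balanced([1, 3, 5, 7]))   # True: 1, 2+1, 4+1, 4+2+1
print(balanced([2, 5]))         # False: the 4, 2 and 1 each appear once
```

So $1+2+3$ and the traditional $1, 3, 5, 7$ start are both balanced, and one should go second on them.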

The mathematical facts to verify are (1) any move on a balanced position in this powers-of-two sense will cause it to become unbalanced, and (2) any unbalanced position can be balanced in one move. It follows that leaving balanced positions is a winning strategy, because the winning move of taking the last block is a balancing move rather than an unbalancing move.

One can prove statement (1) by realizing that when you move a single stack, the binary representation changes, and so whichever binary digits changed will now become unbalanced.  For statement (2), consider the largest unbalanced power of two $2^k$ and move on any stack that contains a $2^k$ size substack. Since $2^k-1=111\cdots11$ in binary, one can attain any binary pattern for the smaller height stacks by removing between $1$ and $2^k$ many blocks. So one can balance the position.

As a practical matter, the proof of (2) also shows how one can find a (winning) balancing move, which can otherwise be difficult in some cases: look for the largest unbalanced power of two, and move on any pile containing such a subpile, making sure to leave a balanced position.

In most actual instances of Nim, the pile heights are rarely very tall, and so one is usually considering just $1$, $2$ and $4$ as the powers of two that arise.  A traditional starting configuration has piles of height 1, 3, 5, and 7, and this position is balanced, because one may view it as: $1, 2+1, 4+1, 4+2+1$, and there are an even number of 1s, 2s and 4s.

It is interesting to consider also the Misère form of Nim, where one wants NOT to take the last block. This version of the game also has a secret mathematical strategy, which I shall reveal later on.

Challenge 1.   What is the winning strategy in Misère Nim?

If you figure it out, please post a comment! I’ll post the solution later. One might naively expect that the winning strategy of Misère Nim is somehow totally opposite to the winning strategy of regular Nim, but in fact, the positions $1,2,3$ and $1,3,5,7$ are winning for the second player both in Nim and also in Misère Nim. Indeed, I claim that all nontrivial Nim positions that are winning for regular Nim (with a suitable meaning of “nontrivial”) are also winning for Misère Nim. Can you prove it?

Another interesting generalization, for the set-theorists, is to consider transfinite Nim, where the piles can have transfinite ordinal height. So we have finitely many piles of ordinal height, perhaps infinite, and a move consists of making any one pile strictly shorter. Since there are no infinite descending sequences of ordinals, the game will terminate in finitely many moves, and the winner is whoever removes the last block.

Challenge 2.  Who wins the transfinite Nim game with piles of heights: $$1\qquad \omega+3\qquad \omega^\omega+5\qquad \omega^{\omega+3}+\omega^\omega\cdot3+\omega\cdot 5+7\qquad \epsilon_0\qquad \omega_1$$ and what are the winning moves? What is the general winning strategy for transfinite Nim?

Post your solutions! You can also see my solution and further discussion.