Quantifer elimination

A theory admits quantifier-elimination when every assertion is logically equivalent over the theory to a quantifier-free assertion. This is quite a remarkable property when it occurs, because it reveals a severe limitation on the range of concepts that can be expressed in the theory—a quantifier-free assertion, after all, is able to express only combinations of the immediate atomic facts at hand. As a result, we are generally able to prove quantifier-elimination results for a theory only when we already have a profound understanding of it and its models, and the quantifier-elimination result itself usually leads quickly to classification of the definable objects, sets, and relations in the theory and its models. In this way, quantifier-elimination results often showcase our mastery over a particular theory and its models. So let us present a few quantifier-elimination results, exhibiting our expertise over some natural theories.

Endless dense linear orders

$\def\<#1>{\left\langle#1\right\rangle}\newcommand\Q{\mathbb{Q}}\newcommand\R{\mathbb{R}}\newcommand\N{\mathbb{N}}\newcommand\bottom{\mathord{\perp}}\newcommand{\Th}{\mathop{\rm Th}}\newcommand{\unaryminus}{-}\newcommand\Z{\mathbb{Z}}\newcommand\divides{\mid}$Consider first the theory of an endless dense linear order, such as the rational order $\<\Q,<>$. In light of Cantor’s theorems on the universality of the rational line and the categoricity theorem for countable endless dense linear orders, we already have a fairly deep understanding of this theory and this particular model.

Consider any two rational numbers $x,y$ in the structure $\<\Q,<>$. What can one say about them? Well, we can certainly make the atomic assertions that $x$ to a Boolean combination of these assertions.

Theorem. The theory of the rational order $\<\Q,<>$ admits elimination of quantifiers—every assertion $\varphi(x,\ldots)$ is logically equivalent in the rational order to a quantifier-free assertion.

Proof. To see this, observe simply by Cantor’s categoricity theorem for countable dense linear orders that any pair $x<y$ in $\Q$ is automorphic to any other such pair $x'<y’$, and similarly for pairs with $x=y$ or $y<x$. Consequently, $\varphi(x,y)$ either holds of all pairs with $x<y$ or of none of them, of all pairs with $x=y$ or none, and of all pairs with $y<x$ or none. The assertion $\varphi(x,y)$ is therefore equivalent to the disjunction of the three atomic relations for which it is realized, including $\top$ as the disjunction of all three atomic possibilities and $\bottom$ as the empty disjunction.

More generally, a similar observation applies to assertions $\varphi(x_1,\ldots,x_n)$ with more free variables. By Cantor’s theorem, every $n$-tuple of points in $\Q$ is automorphic with any other such $n$-tuple of points having the same atomic order relations. Therefore any assertion holding of one such $n$-tuple holds of all $n$-tuples with that same atomic type, and consequently every assertion $\varphi(x_1,\ldots,x_n)$ is logically equivalent in $\<\Q,<>$ to a disjunction of those combinations of atomic relations amongst the variables $x_1,\ldots,x_n$ for which it holds. In particular, every assertion is equivalent in $\<\Q,<>$ to a quantifier-free assertion. In short, the theory of this model $\Th(\<\Q,<>)$ admits elimination of quantifiers. $\Box$

What about other endless dense linear orders? The argument we have given so far is about the theory of this particular model $\<\Q,<>$. In fact, the theory of the rational order is exactly the theory of endless dense linear orders, because this theory is complete, which one can see as an immediate consequence of the categoricity result of Cantor’s theorem and the downward Löwenheim-Skolem theorem. In my book, I have not yet proved the Löwenheim-Skolem theorem at this stage, however, and so let me give a direct proof of quantifier-elimination in the theory of endless dense linear orders, from which we can also derive the completeness of this theory.

Theorem. In the theory of endless dense linear orders, every statement is logically equivalent to a quantifier-free statement.

Proof. To clarify, the quantifier-free statement will have the same free variables as the original assertion, provided we allow $\bottom$ and $\top$ as logical constants. We argue by induction on formulas. The claim is of course already true for the atomic formulas, and it is clearly preserved under Boolean connectives. So it suffices inductively to eliminate the quantifier from $\exists x\, \varphi(x,\ldots)$, where $\varphi$ is itself quantifier-free. We can place $\varphi$ in disjunctive normal form, a disjunction of conjunction clauses, where each conjunction clause is a conjunction of literals, that is, atomic or negated atomic assertions. Since the atomic assertions $x<y$, $x=y$ and $y<x$ are mutually exclusive and exhaustive, the negation of any one of them is equivalent to the disjunction of the other two. Thus we may eliminate any need for negation. By redistributing conjunction over disjunction as much as possible, we reduce to the case of $\exists x\,\varphi$, where $\varphi$ is in disjunctive normal form without any negation. The existential quantifier distributes over disjunction, and so we reduce to the case $\varphi$ is a conjunction of atomic assertions. We may eliminate any instance of $x=x$ or $y=y$, since these impose no requirement. We may assume that the variable $x$ occurs in each conjunct, since otherwise that conjunct commutes outside the quantifier. If $x=y$ occurs in $\varphi$ for some variable $y$ not identical to $x$, then the existential claim is equivalent to $\varphi(y,\ldots)$, that is, by replacing every instance of $x$ with $y$, and we have thus eliminated the quantifier. If $x<x$ occurs as one of the conjuncts, this is not satisfiable and so the assertion is equivalent to $\bottom$. Thus we have reduced to the case where $\varphi$ is a conjunction of assertions of the form $x<y_i$ and $z_j<x$. If only one type of these occurs, then the assertion $\exists x\,\varphi$ is outright provable in the theory by the endless assumption and thus equivalent to $\top$. Otherwise, both types $x<y_i$ and $z_j<x$ occur, and in this case the existence of an $x$ obeying this conjunction of assertions is equivalent over the theory of endless dense linear orders to the quantifier-free conjunction $\bigwedge_{i,j}z_j<y_i$, since there will be an $x$ between them in this case and only in this case. Thus, we have eliminated the quantifier $\exists x$, and so by induction every formula is equivalent over this theory to a quantifier-free formula. $\Box$

Corollary. The theory of endless dense linear orders is complete.

Proof. If $\sigma$ is any sentence in this theory, then by theorem above, it is logically equivalent to a Boolean combination of quantifier-free assertions with the same variables. Since $\sigma$ is a sentence and there are no quantifier-free atomic sentences except $\bottom$ and $\top$, it follows that $\sigma$ is equivalent over the theory to a Boolean combination of $\bottom$ or $\top$. All such sentences are equivalent either to $\bottom$ or $\top$, and thus either $\sigma$ is entailed by the theory or $\neg\sigma$ is, and so the theory is complete. $\Box$

Corollary. In any endless dense linear order, the definable sets (allowing parameters) are precisely the finite unions of intervals.

Proof. By intervals we mean a generalized concept allowing either open or closed endpoints, as well as rays, in any of the forms:
$$(a,b)\qquad [a,b]\qquad [a,b)\qquad (a,b]\qquad (a,\infty)\qquad [a,\infty)\qquad (\unaryminus\infty,b)\qquad (\unaryminus\infty,b]$$
Of course any such interval is definable, since $(a,b)$ is defined by $(a<x)\wedge(x<b)$, taking the endpoints $a$ and $b$ as parameters, and $(-\infty,b]$ is defined by $(x<b)\vee (x=b)$, and so on. Thus, finite unions of intervals are also definable by taking a disjunction.

Conversely, any putative definition $\varphi(x,y_1,\ldots,y_n)$ is equivalent to a Boolean combination of atomic assertions concerning $x$ and the parameters $y_i$. Thus, whenever it is true for some $x$ between, above, or below the parameters $y_i$, it will be true of all $x$ in that same interval, and so the set that is defined will be a finite union of intervals having the parameters $y_i$ as endpoints, with the intervals being open or closed depending on whether the parameters themselves satisfy the formula or not. $\Box$

Theory of successor

Let us next consider the theory of a successor function, as realized for example in the Dedekind model, $\<\N,S,0>$, where $S$ is the successor
function $Sn=n+1$. The theory has the following three axioms:
$$\forall x\, (Sx\neq 0)$$

$$\forall x,y\, (Sx=Sy\implies x=y)$$

$$\forall x\, \bigl(x\neq 0\implies \exists y\,(Sy=x)\bigr).$$
In the Dedekind model, every individual is definable, since $x=n$ just in case $x=SS\cdots S0$, where we have $n$ iterative applications of $S$. So this is a pointwise definable model, and hence also Leibnizian. Note the interplay between the $n$ of the object theory and $n$ of the metatheory in the claim that every individual is definable.

What definable subsets of the Dedekind model can we think of? Of course, we can define any particular finite set, since the numbers are definable as individuals. For example, we can define the set ${1,5,8}$ by saying, “either $x$ has the defining property of $1$ or it has the defining property of $5$ or it has the defining property of $8$.” Thus any finite set is definable, and by negating such a formula, we see also that any cofinite set—the complement of a finite set—is definable. Are there any other definable sets? For example, can we define the set of even numbers? How could we prove that we cannot? The Dedekind structure has no automorphisms, since all the individuals are definable, and so we cannot expect to use automorphism to show that the even numbers are not definable as a set. We need a deeper understanding of definability and truth in this structure.

Theorem. The theory of a successor function admits elimination of quantifiers—every assertion is equivalent in this theory to a quantifier-free assertion.

Proof. By induction on formulas. The claim is already true for atomic assertions, since they have no quantifiers, and quantifier-free assertions are clearly closed under the Boolean connectives. So it suffices by induction to eliminate the quantifier from assertions of the form $\exists x\, \varphi(x,\ldots)$, where $\varphi$ is quantifier free. We may place $\varphi$ in disjunctive normal form, and since the quantifier distributes over disjunction, we reduce to the case that $\varphi$ is a conjunction of atomic and negated atomic assertions. We may assume that $x$ appears in each atomic conjunct, since otherwise we may bring that conjunct outside the quantifier. We may furthermore assume that $x$ appears on only one side of each atomic clause, since otherwise the statement is either trivially true as with $SSx=SSx$ or $Sx\neq SSx$, or trivially false as with $Sx=SSx$. Consider for example:
$$\exists x\,\bigl[(SSSx=y)\wedge (SSy=SSSz)\wedge (SSSSSx=SSSw)\wedge{}$$
$$\hskip1in{}\wedge (Sx\neq SSSSw)\wedge (SSSSy\neq SSSSSz)\bigr]$$
We can remove duplicated $S$s occurring on both sides of an equation. If $x=S^ky$ appears, we can get rid of $x$ and replace all occurrences with $S^ky$. If $S^nx=y$ appears, can add $S$’s everywhere and then replace any occurrence of $S^nx$ with $y$. If only inequalities appear, then the statement is simply true.

For example, since the third clause in the formula above is equivalent to $SSx=w$, we may use that to omit any need to refer to $x$, and the formula overall is equivalent to
$$(Sw=y)\wedge (y=Sz)\wedge (w\neq SSSSSw)\wedge (y\neq Sz),$$ which has no quantifiers.
Since the method is completely general, we have proved that the theory of successor admits elimination of quantifiers. $\Box$

It follows that the definable sets in the Dedekind model $\<\N,S,0>$, using only the first-order language of this structure, are precisely the finite and cofinite sets.

Corollary. The definable sets in $\<\N,S,0>$ are precisely the finite and cofinite sets

Proof. This is because an atomic formula defines a finite set, and the collection of finite or cofinite sets is closed under negation and Boolean combinations. Since every formula is equivalent to a quantifier-free formula, it follows that every formula is a Boolean combination of atomic formulas, and hence defines a finite or cofinite set. $\Box$

In particular, the concepts of being even or being odd are not definable from the successor operation in $\<\N,S,0>$, since the set of even numbers is neither finite nor cofinite.

Corollary. The theory of a successor function is complete—it is the theory of the standard model $\<\N,S,0>$.

Proof. If $\sigma$ is a sentence in the language of successor, then by the quantifier-elimination theorem it is equivalent to a quantifier-free assertion in the language with the successor function $S$ and constant symbol $0$. But the only quantifier-free sentences in this language are Boolean combinations of equations of the form $S^n0=S^k0$. Since all such equations are settled by the theory, the sentence itself is settled by the theory, and so the theory is complete. $\Box$

We saw that the three axioms displayed on the previous page were true in the Dedekind model $\<\N,S,0>$. Are there any other models of these axioms? Yes, there are. For example, we can add another $\Z$-chain of successors on the side, as with $\N+\Z$ or $\N\sqcup\Z$, although we shall see that the order is not definable. What are the definable elements in the enlarged structure? Still $0$ and all its finite successors are definable as before. But no elements of the $\Z$-chains can be definable, because we may perform an automorphism of the structure that translates elements within the $\Z$-chain by a fixed amount.

Let me prove next that the theory implies the induction axiom schema.

Corollary. The theory of successor (the three axioms) implies the induction axiom schema in the language of successor, that is, the following assertion for any assertion $\varphi(x)$:
$$\left[\varphi(0)\wedge\bigl(\forall x\,\bigl(\varphi(x)\implies\varphi(Sx)\bigr)\right]\implies\forall x\,\varphi(x)$$

Proof. Consider the set defined by $\varphi(x)$. By the earlier corollary, it must be eventually periodic in the standard model $\<\N,S,0>$. But by the induction assumption stated in the theorem, it must hold of every number in the standard model. So the standard model thinks that $\forall x\,\varphi(x)$. But the theory of the standard model is the theory of successor, which is complete. So the theory of successor entails that $\varphi$ is universal, as desired. $\Box$

In other words, in the trivial theory of successor–the three axioms—we get the corresponding induction axiom for free.

Presburger arithmetic

Presburger arithmetic is the theory of addition on the natural numbers, that is, the theory of the structure $\<\N,+,0,1>$. The numbers $0$ and $1$ are actually definable here from addition alone, since $0$ is the unique additive identity, and $1$ is the only number $u$ that is not expressible as a sum $x+y$ with both $x\neq u$ and $y\neq u$. So we may view this model if desired as a definitional expansion of $\<\N,+>$, with addition only. The number $2$ is similarly definable as $1+1$, and indeed any number $n$ is definable as $1+\cdots+1$, with $n$ summands, and so this is a pointwise definable model and hence also Leibnizian.

What are the definable subsets? We can define the even numbers, of course, since $x$ is even if and only if $\exists y\,(y+y=x)$. We can similarly define congruence modulo $2$ by $x\equiv_2 y\iff \exists z\,\bigl[(z+z+x=y)\vee (z+z+y=x)\bigr]$. More generally, we can express the relation of congruence modulo $n$ for any fixed $n$ as follows:
$$x\equiv_n y\quad\text{ if and only if }\exists z\,\bigl[(\overbrace{z+\cdots+z}^n+x=y)\vee(\overbrace{z+\cdots+z}^n+y=x)\bigr].$$
What I claim is that this exhausts what is expressible.

Theorem. Presburger arithmetic in the definitional expansion with all congruence relations, that is, the theory of the structure
$$\<\N,+,0,1,\equiv_2,\equiv_3,\equiv_4,\ldots>$$
admits elimination of quantifiers. In particular, every assertion in the language of $\<\N,+,0,1>$ is equivalent to a quantifer-free assertion in the language with the congruence relations.

Proof. We consider Presburger arithmetic in the language with addition $+$, with all the congruence relations $\equiv_n$ for every $n\geq 2$, and the constants $0$ and $1$. We prove quantifier-elimination in this language by induction on formulas. As before the claim already holds for atomic assertions and is preserved by Boolean connectives. So it suffices to eliminate the quantifier from assertions of the form $\exists x\,\varphi(x,\ldots)$, where $\varphi$ is quantifier-free. By placing $\varphi$ into disjunctive normal form and distributing the quantifier over the disjunction, we may assume that $\varphi$ is a conjunction of atomic and negated atomic assertions. Note that negated congruences are equivalent to a disjunction of positive congruences, such as in the case:
$$x\not\equiv_4 y\quad\text{ if and only if }\quad (x+1\equiv_4y)\vee(x+1+1\equiv_4y)\vee (x+1+1+1\equiv_4 y).$$
We may therefore assume there are no negated congruences in $\varphi$. By canceling like terms on each side of an equation or congruence, we may assume that $x$ occurs on only one side. We may assume that $x$ occurs nontrivially in every conjunct of $\varphi$, since otherwise this conjunct commutes outside the quantifier. Since subtraction modulo $n$ is the same as adding $n-1$ times, we may also assume that all congruences occurring in $\varphi$ have the form $kx\equiv_n t$, where $kx$ denotes the syntactic expression $x+\cdots+x$ occurring in the formula, with $k$ summands, and $t$ is a term not involving the variable $x$. Thus, $\varphi$ is a conjunction of expressions each having the form $kx\equiv_n t$, $ax+r=s$, or $bx+u\neq v$, where $ax$ and $bx$ similarly denote the iterated sums $x+\cdots+x$ and $r,s,u,v$ are terms not involving $x$.

If indeed there is a conjunct of the equality form $ax+r=s$ occurring in $\varphi$, then we may omit the quantifier as follows. Namely, in order to fulfill the existence assertion, we know that $x$ will have to solve $ax+r=s$, and so in particular $r\equiv_a s$, which ensures the existence of such an $x$, but also in this case any inequality $bx+u\neq v$ can be equivalently expressed as $abx+au\neq av$, which since $ax+r=s$ is equivalent to $bs+au\neq av+br$, and this does does not involve $x$; similarly, any congruence $kx\equiv_n t$ is equivalent to $akx\equiv_{an}at$, which is equivalent to $s\equiv_{an} r+at$, which again does not involve $x$. Thus, when there is an equality involving $x$ present in $\varphi$, then we can use that fact to express the whole formula in an equivalent manner not involving $x$.

So we have reduced to the case $\exists x\,\varphi$, where $\varphi$ is a conjunction of inequalities $bs+u\neq v$ and congruences $kx\equiv_n t$. We can now ignore the inequalities, since if the congruence system has a solution, then it will have infinitely many solutions, and so there will be an $x$ solving any finitely given inequalities. So we may assume that $\varphi$ is simply a list of congruences of the form $kx\equiv_n t$, and the assertion is that this system of congruences has a solution. But there are only finitely many congruences mentioned, and so by working modulo the least common multiple of the bases that occur, there are only finitely many possible values for $x$ to be checked. And so we can simply replace $\varphi$ with a disjunction over these finitely many values $i$, replacing $x$ in each conjunction with $1+\cdots+1$, using $i$ copies of $1$, for each $i$ up to the least common multiples of the bases that arise in the congruences appearing in $\varphi$. If there is an $x$ solving the system, then one of these values of $i$ will work, and conversely.

So we have ultimately succeeded in expressing $\exists x\,\varphi$ in a quantifier-free manner, and so by induction every assertion in Presburger arithmetic is equivalent to a quantifier-free assertion in the language allowing addition, congruences, and the constants $0$ and $1$. $\Box$

Corollary. The definable sets in $\<\N,+,0,1>$ are exactly the eventually periodic sets.

Proof. Every periodic set is definable, since one can specify the set up to the period $p$, and then express the invariance modulo $p$. Any finite deviation from a definable set also is definable, since every individual number is definable. So every eventually period set is definable. Conversely, every definable set is defined by a quantifier-free assertion in the language of $\<\N,+,0,1,\equiv_2,\equiv_3,\equiv_4,\ldots>$. We may place the definition in disjunctive normal form, and again replace negated congruences with a disjunction of positive congruences. For large enough values of $x$, the equalities and inequalities appearing in the definition become irrelevant, and so the definition eventually agrees with a finite union of solutions of congruence systems. Every such system is periodic with a period at most the least common multiple of the bases of the congruences appearing in it. And so every definable set is eventually periodic, as desired. $\Box$

Corollary. Multiplication is not definable in $\<\N,+,0,1>$. Indeed, the squaring operation is not definable, and neither is the divisibility relation $p\divides q$.

Proof. If we could define multiplication, or even the squaring operation, then we would be able to define the set of perfect squares, but this is not eventually periodic. Similarly, if we could define the divides relation $p\divides q$, then we could define the set of prime numbers, which is not eventually periodic. $\Box$

Real-closed field

Let us lastly consider the ordered real field $\<\R,+,\cdot,0,1,<>$. I want to mention (without proof) that a deep theorem of Tarski shows that this structure admits elimination of quantifiers: every assertion is equivalent in this structure to a quantifier-free assertion. In fact all that is need is that this is a real-closed field, an ordered field in which every odd-degree polynomial has a root and every positive number has a square root.

We can begin to gain insight to this fact by reaching into the depths of our high-school education. Presented with an equation $ax^2+bx+c=0$ in the integers, we know by the quadratic formula that the solution is $x=\left(-b\pm\sqrt{b^2-4ac}\right)/2a$, and in particular, there is a solution in the real numbers if and only if $b^2-4ac\geq 0$, since otherwise a negative discriminant means the solution is a complex number. In other words,
$$\exists x\,(ax^2+bx+c=0)\quad\text{ if and only if }\quad b^2-4ac\geq 0.$$
The key point is that this an elimination of quantifiers result, since we have eliminated the quantifier $\exists x$.

Tarski’s theorem proves more generally that every assertion in the language of ordered fields is equivalent in real-closed fields to a quantifier-free assertion. Furthermore, there is a computable procedure to find the quantifier-free equivalent, as well as a computable procedure to determine the truth of any quantifier-free assertion in the theory of real-closed fields.

What I find incredible is that it follows from this that there is a computable procedure to determine the truth of any first-order assertion of Cartesian plane geometry, since all such assertions are expressible in the language of $\<\R,+,\cdot,0,1,<>$. Amazing! I view this as an incredible culmination of two thousand years of mathematical investigation: we now have an algorithm to determine by rote procedure the truth of any statement in Cartesian geometry. Meanwhile, a counterpoint is that the decision procedure, unfortunately, is not computationally feasible, however, since it takes more than exponential time, and it is a topic of research to investigate the computational running time of the best algorithms.

The model theory of set-theoretic mereology, Notre Dame Math Logic Seminar, February 2022

This will be a talk for the Mathematical Logic Seminar at the University of Notre Dame on 8 February 2022 at 2 pm in 125 Hayes Healy.

Abstract. Mereology, the study of the relation of part to whole, is often contrasted with set theory and its membership relation, the relation of element to set. Whereas set theory has found comparative success in the foundation of mathematics, since the time of Cantor, Zermelo and Hilbert, mereology is strangely absent. Can a set-theoretic mereology, based upon the set-theoretic inclusion relation ⊆ rather than the element-of relation ∈, serve as a foundation of mathematics? How well is a model of set theory ⟨M,∈⟩ captured by its mereological reduct ⟨M,⊆⟩? In short, how much set theory does set-theoretic mereology know? In this talk, I shall present results on the model theory of set-theoretic mereology that lead broadly to negative answers to these questions and explain why mereology has not been successful as a foundation of mathematics. (Joint work with Makoto Kikuchi)

Handwritten lecture notes

See the research papers:

  • Set-theoretic mereology
    • [DOI] J. D. Hamkins and M. Kikuchi, “Set-theoretic mereology,” Logic and Logical Philosophy, Special issue “Mereology and beyond, part II”, vol. 25, iss. 3, p. 285–308, 2016.
      [Bibtex]
      @ARTICLE{HamkinsKikuchi2016:Set-theoreticMereology,
      author = {Joel David Hamkins and Makoto Kikuchi},
      title = {Set-theoretic mereology},
      journal = {Logic and Logical Philosophy, Special issue ``Mereology and beyond, part II''},
      editor = {A.~C.~Varzi and R.~Gruszczy{\'n}ski},
      year = {2016},
      volume = {25},
      number = {3},
      pages = {285--308},
      month = {},
      doi = {10.12775/LLP.2016.007},
      note = {},
      eprint = {1601.06593},
      archivePrefix = {arXiv},
      primaryClass = {math.LO},
      url = {http://jdh.hamkins.org/set-theoretic-mereology},
      abstract = {},
      keywords = {},
      source = {},
      ISSN = {1425-3305},
      MRCLASS = {03A05 (03E70)},
      MRNUMBER = {3546211},
      }
  • The inclusion relations of the countable models of set theory are all isomorphic
    • J. D. Hamkins and M. Kikuchi, “The inclusion relations of the countable models of set theory are all isomorphic,” ArXiv e-prints, 2017.
      [Bibtex]
      @ARTICLE{HamkinsKikuchi:The-inclusion-relations-of-the-countable-models-of-set-theory-are-all-isomorphic,
      author = {Joel David Hamkins and Makoto Kikuchi},
      title = {The inclusion relations of the countable models of set theory are all isomorphic},
      journal = {ArXiv e-prints},
      editor = {},
      year = {2017},
      volume = {},
      number = {},
      pages = {},
      month = {},
      doi = {},
      note = {Manuscript under review},
      eprint = {1704.04480},
      archivePrefix = {arXiv},
      primaryClass = {math.LO},
      url = {http://jdh.hamkins.org/inclusion-relations-are-all-isomorphic},
      abstract = {},
      keywords = {under-review},
      source = {},
      }
Notre Dame campus in snow

An infinitary-logic-free proof of the Barwise end-extension theorem, with new applications, University of Münster, January 2019

This will be a talk for the Logic Oberseminar at the University of Münster, January 11, 2019.

Abstract. I shall present a new proof, with new applications, of the amazing extension theorem of Barwise (1971), which shows that every countable model of ZF has an end-extension to a model of ZFC + V=L. This theorem is both (i) a technical culmination of Barwise’s pioneering methods in admissible set theory and the admissible cover, but also (ii) one of those rare mathematical results saturated with significance for the philosophy of set theory. The new proof uses only classical methods of descriptive set theory, and makes no mention of infinitary logic. The results are directly connected with recent advances on the universal $\Sigma_1$-definable finite set, a set-theoretic version of Woodin’s universal algorithm.

A new proof of the Barwise extension theorem, without infinitary logic, CUNY Logic Workshop, December 2018

I’ll be back in New York from Oxford, and this will be a talk for the CUNY Logic Workshop, December 14, 2018.

Abstract. I shall present a new proof, with new applications, of the amazing extension theorem of Barwise (1971), which shows that every countable model of ZF has an end-extension to a model of ZFC + V=L. This theorem is both (i) a technical culmination of Barwise’s pioneering methods in admissible set theory and the admissible cover, but also (ii) one of those rare mathematical results saturated with significance for the philosophy of set theory. The new proof uses only classical methods of descriptive set theory, and makes no mention of infinitary logic. The results are directly connected with recent advances on the universal $\Sigma_1$-definable finite set, a set-theoretic version of Woodin’s universal algorithm.

My lecture notes are available.

A new proof of the Barwise extension theorem, without infinitary logic

I have found a new proof of the Barwise extension theorem, that wonderful yet quirky result of classical admissible set theory, which says that every countable model of set theory can be extended to a model of $\text{ZFC}+V=L$.

Barwise Extension Theorem. (Barwise 1971) $\newcommand\ZF{\text{ZF}}\newcommand\ZFC{\text{ZFC}}$ Every countable model of set theory $M\models\ZF$ has an end-extension to a model of $\ZFC+V=L$.

The Barwise extension theorem is both (i) a technical culmination of the pioneering methods of Barwise in admissible set theory and infinitary logic, including the Barwise compactness and completeness theorems and the admissible cover, but also (ii) one of those rare mathematical theorems that is saturated with significance for the philosophy of mathematics and particularly the philosophy of set theory. I discussed the theorem and its philosophical significance at length in my paper, The multiverse perspective on the axiom of constructibility, where I argued that it can change how we look upon the axiom of constructibility and whether this axiom should be considered ‘restrictive,’ as it often is in set theory. Ultimately, the Barwise extension theorem shows how wrong a model of set theory can be, if we should entertain the idea that the set-theoretic universe continues growing beyond it.

Regarding my new proof, below, however, what I find especially interesting about it, if not surprising in light of (i) above, is that it makes no use of Barwise compactness or completeness and indeed, no use of infinitary logic at all! Instead, the new proof uses only classical methods of descriptive set theory concerning the representation of $\Pi^1_1$ sets with well-founded trees, the Levy and Shoenfield absoluteness theorems, the reflection theorem and the Keisler-Morley theorem on elementary extensions via definable ultrapowers. Like the Barwise proof, my proof splits into cases depending on whether the model $M$ is standard or nonstandard, but another interesting thing about it is that with my proof, it is the $\omega$-nonstandard case that is easier, whereas with the Barwise proof, the transitive case was easiest, since one only needed to resort to the admissible cover when $M$ was ill-founded. Barwise splits into cases on well-founded/ill-founded, whereas in my argument, the cases are $\omega$-standard/$\omega$-nonstandard.

To clarify the terms, an end-extension of a model of set theory $\langle M,\in^M\rangle$ is another model $\langle N,\in^N\rangle$, such that the first is a substructure of the second, so that $M\subseteq N$ and $\in^M=\in^N\upharpoonright M$, but further, the new model does not add new elements to sets in $M$. In other words, $M$ is an $\in$-initial segment of $N$, or more precisely: if $a\in^N b\in M$, then $a\in M$ and hence $a\in^M b$.

Set theory, of course, overflows with instances of end-extensions. For example, the rank-initial segments $V_\alpha$ end-extend to their higher instances $V_\beta$, when $\alpha<\beta$; similarly, the hierarchy of the constructible universe $L_\alpha\subseteq L_\beta$ are end-extensions; indeed any transitive set end-extends to all its supersets. The set-theoretic universe $V$ is an end-extension of the constructible universe $L$ and every forcing extension $M[G]$ is an end-extension of its ground model $M$, even when nonstandard. (In particular, one should not confuse end-extensions with rank-extensions, also known as top-extensions, where one insists that all the new sets have higher rank than any ordinal in the smaller model.)

Let’s get into the proof.

Proof. Suppose that $M$ is a model of $\ZF$ set theory. Consider first the case that $M$ is $\omega$-nonstandard. For any particular standard natural number $k$, the reflection theorem ensures that there are arbitrarily high $L_\alpha^M$ satisfying $\ZFC_k+V=L$, where $\ZFC_k$ refers to the first $k$ axioms of $\ZFC$ in a fixed computable enumeration by length. In particular, every countable transitive set $m\in L^M$ has an end-extension to a model of $\ZFC_k+V=L$. By overspill (that is, since the standard cut is not definable), there must be some nonstandard $k$ for which $L^M$ thinks that every countable transitive set $m$ has an end-extension to a model of $\ZFC_k+V=L$, which we may assume is countable. This is a $\Pi^1_2$ statement about $k$, which will therefore also be true in $M$, by the Shoenfield absolutenss theorem. It will also be true in all the elementary extensions of $M$, as well as in their forcing extensions. And indeed, by the Keisler-Morley theorem, the model $M$ has an elementary top extension $M^+$. Let $\theta$ be a new ordinal on top of $M$, and let $m=V_\theta^{M^+}$ be the $\theta$-rank-initial segment of $M^+$, which is a top-extension of $M$. Let $M^+[G]$ be a forcing extension in which $m$ has become countable. Since the $\Pi^1_2$ statement is true in $M^+[G]$, there is an end-extension of $\langle m,\in^{M^+}\rangle$ to a model $\langle N,\in^N\rangle$ that $M^+[G]$ thinks satisfies $\ZFC_k+V=L$. Since $k$ is nonstandard, this theory includes all the $\ZFC$ axioms, and since $m$ end-extends $M$, we have found an end-extension of $M$ to a model of $\ZFC+V=L$, as desired.

It remains to consider the case where $M$ is $\omega$-standard. By the Keisler-Morley theorem, let $M^+$ be an elementary top-extension of $M$. Let $\theta$ be an ordinal of $M^+$ above $M$, and consider the corresponding rank-initial segment $m=V_\theta^{M^+}$, which is a transitive set in $M^+$ that covers $M$. If $\langle m,\in^{M^+}\rangle$ has an end-extension to a model of $\ZFC+V=L$, then we’re done, since such a model would also end-extend $M$. So assume toward contradiction that there is no such end-extension of $m$. Let $M^+[G]$ be a forcing extension in which $m$ has become countable. The assertion that $m$ has no end-extension to a model of $\ZFC+V=L$ is actually true and hence true in $M^+[G]$. This is a $\Pi^1_1$ assertion there about the real coding $m$. Every such assertion has a canonically associated tree, which is well-founded exactly when the statement is true. Since the statement is true in $M^+[G]$, this tree has some countable rank $\lambda$ there. Since these models have the standard $\omega$, the tree associated with the statement is the same for us as inside the model, and since the statement is actually true, the tree is actually well founded. So the rank $\lambda$ must come from the well-founded part of the model.

If $\lambda$ happens to be countable in $L^{M^+}$, then consider the assertion, “there is a countable transitive set, such that the assertion that it has no end-extension to a model of $\ZFC+V=L$ has rank $\lambda$.” This is a $\Sigma_1$ assertion, since it is witnessed by the countable transitive set and the ranking function of the tree associated with the non-extension assertion. Since the parameters are countable, it follows by Levy reflection that the statement is true in $L^{M^+}$. So $L^{M^+}$ has a countable transitive set, such that the assertion that it has no end-extension to a model of $\ZFC+V=L$ has rank $\lambda$. But since $\lambda$ is actually well-founded, the statement would have to be actually true; but it isn’t, since $L^{M^+}$ itself is such an extension, a contradiction.

So we may assume $\lambda$ is uncountable in $M^+$. In this case, since $\lambda$ was actually well-ordered, it follows that $L^M$ is well-founded beyond its $\omega_1$. Consider the statement “there is a countable transitive set having no end-extension to a model of $\ZFC+V=L$.” This is a $\Sigma^1_2$ sentence, which is true in $M^+[G]$ by our assumption about $m$, and so by Shoenfield absoluteness, it is true in $L^{M^+}$ and hence also $L^M$. So $L^M$ thinks there is a countable transitive set $b$ having no end-extension to a model of $\ZFC+V=L$. This is a $\Pi^1_1$ assertion about $b$, whose truth is witnessed in $L^M$ by a ranking of the associated tree. Since this rank would be countable in $L^M$ and this model is well-founded up to its $\omega_1$, the tree must be actually well-founded. But this is impossible, since it is not actually true that $b$ has no such end-extension, since $L^M$ itself is such an end-extension of $b$. Contradiction. $\Box$

One can prove a somewhat stronger version of the theorem, as follows.

Theorem. For any countable model $M$ of $\ZF$, with an inner model $W\models\ZFC$, and any statement $\phi$ true in $W$, there is an end-extension of $M$ to a model of $\ZFC+\phi$. Furthermore, one can arrange that every set of $M$ is countable in the extension model.

In particular, one can find end-extensions of $\ZFC+V=L+\phi$, for any statement $\phi$ true in $L^M$.

Proof. Carry out the same proof as above, except in all the statements, ask for end-extensions of $\ZFC+\phi$, instead of end-extensions of $\ZFC+V=L$, and also ask that the set in question become countable in that extension. The final contradictions are obtained by the fact that the countable transitive sets in $L^M$ do have end-extensions like that, in which they are countable, since $W$ is such an end-extension. $\Box$

For example, we can make the following further examples.

Corollaries.

  1. Every countable model $M$ of $\ZFC$ with a measurable cardinal has an end-extension to a model $N$ of $\ZFC+V=L[\mu]$.
  2. Every countable model $M$ of $\ZFC$ with extender-based large cardinals has an end-extension to a model $N$ satisfying $\ZFC+V=L[\vec E]$.
  3. Every countable model $M$ of $\ZFC$ with infinitely many Woodin cardinals has an end-extension to a model $N$ of $\ZF+\text{AD}+V=L(\mathbb{R})$.

And in each case, we can furthermore arrange that every set of $M$ is countable in the extension model $N$.

This proof grew out of a project on the $\Sigma_1$-definable universal finite set, which I am currently undertaking with Kameryn Williams and Philip Welch.


Jon Barwise. Infinitary methods in the model theory of set theory. In Logic
Colloquium ’69 (Proc. Summer School and Colloq., Manchester, 1969), pages
53–66. North-Holland, Amsterdam, 1971.

Different set theories are never bi-interpretable

I was fascinated recently to discover something I hadn’t realized about relative interpretability in set theory, and I’d like to share it here. Namely,

Different set theories extending ZF are never bi-interpretable!

For example, ZF and ZFC are not bi-interpretable, and neither are ZFC and ZFC+CH, nor ZFC and ZFC+$\neg$CH, despite the fact that all these theories are equiconsistent. The basic fact is that there are no nontrivial instances of bi-interpretation amongst the models of ZF set theory. This is surprising, and could even be seen as shocking, in light of the philosophical remarks one sometimes hears asserted in the philosophy of set theory that what is going on with the various set-theoretic translations from large cardinals to determinacy to inner model theory, to mention a central example, is that we can interpret between these theories and consequently it doesn’t much matter which context is taken as fundamental, since we can translate from one context to another without loss.

The bi-interpretation result shows that these interpretations do not and cannot rise to the level of bi-interpretations of theories — the most robust form of mutual relative interpretability — and consequently, the translations inevitably must involve a loss of information.

To be sure, set theorists classify the various set-theoretic principles and theories into a hierarchy, often organized by consistency strength or by other notions of interpretative power, using forcing or definable inner models. From any model of ZF, for example, we can construct a model of ZFC, and from any model of ZFC, we can construct models of ZFC+CH or ZFC+$\neg$CH and so on. From models with sufficient large cardinals we can construct models with determinacy or inner-model-theoretic fine structure and vice versa. And while we have relative consistency results and equiconsistencies and even mutual interpretations, we will have no nontrivial bi-interpretations.

(I had proved the theorem a few weeks ago in joint work with Alfredo Roque Freire, who is visiting me in New York this year. We subsequently learned, however, that this was a rediscovery of results that have evidently been proved independently by various authors. Albert Visser proves the case of PA in his paper, “Categories of theories and interpretations,” Logic in Tehran, 284–341, Lect. Notes Log., 26, Assoc. Symbol. Logic, La Jolla, CA, 2006, (pdf, see pp. 52-55). Ali Enayat gave a nice model-theoretic argument for showing specifically that ZF and ZFC are not bi-interpretable, using the fact that ZFC models can have no involutions in their automorphism groups, but ZF models can; and he proved the general version of the theorem, for ZF, second-order arithmetic $Z_2$ and second-order set theory KM in his 2016 article, A. Enayat, “Variations on a Visserian theme,” in Liber Amicorum Alberti : a tribute to Albert Visser / Jan van Eijck, Rosalie Iemhoff and Joost J. Joosten (eds.) Pages, 99-110. ISBN, 978-1848902046. College Publications, London. The ZF version was apparently also observed independently by Harvey Friedman, Visser and Fedor Pakhomov.)

Meanwhile, let me explain our argument. Recall from model theory that one theory $S$ is interpreted in another theory $T$, if in any model of the latter theory $M\models T$, we can define (and uniformly so in any such model) a certain domain $N\subset M^k$ and relations and functions on that domain so as to make $N$ a model of $S$. For example, the theory of algebraically closed fields of characteristic zero is interpreted in the theory of real-closed fields, since in any real-closed field $R$, we can consider pairs $(a,b)$, thinking of them as $a+bi$, and define addition and multiplication on those pairs in such a way so as to construct an algebraically closed field of characteristic zero.

Two theories are thus mutually interpretable, if each of them is interpretable in the other. Such theories are necessarily equiconsistent, since from any model of one of them we can produce a model of the other.

Note that mutual interpretability, however, does not insist that the two translations are inverse to each other, even up to isomorphism. One can start with a model of the first theory $M\models T$ and define the interpreted model $N\models S$ of the second theory, which has a subsequent model of the first theory again $\bar M\models T$ inside it. But the definition does not insist on any particular connection between $M$ and $\bar M$, and these models need not be isomorphic nor even elementarily equivalent in general.

By addressing this, one arrives at a stronger and more robust form of mutual interpretability. Namely, two theories $S$ and $T$ are bi-interpretable, if they are mutually interpretable in such a way that the models can see that the interpretations are inverse. That is, for any model $M$ of the theory $T$, if one defines the interpreted model $N\models S$ inside it, and then defines the interpreted model $\bar M$ of $T$ inside $N$, then $M$ is isomorphic to $\bar M$ by a definable isomorphism in $M$, and uniformly so (and the same with the theories in the other direction). Thus, every model of one of the theories can see exactly how it itself arises definably in the interpreted model of the other theory.

For example, the theory of linear orders $\leq$ is bi-interpretable with the theory of strict linear order $<$, since from any linear order $\leq$ we can define the corresponding strict linear order $<$ on the same domain, and from any strict linear order $<$ we can define the corresponding linear order $\leq$, and doing it twice brings us back again to the same order.

For a richer example, the theory PA is bi-interpretable with the finite set theory $\text{ZF}^{\neg\infty}$, where one drops the infinity axiom from ZF and replaces it with the negation of infinity, and where one has the $\in$-induction scheme in place of the foundation axiom. The interpretation is via the Ackerman encoding of hereditary finite sets in arithmetic, so that $n\mathrel{E} m$ just in case the $n^{th}$ binary digit of $m$ is $1$. If one starts with the standard model $\mathbb{N}$, then the resulting structure $\langle\mathbb{N},E\rangle$ is isomorphic to the set $\langle\text{HF},\in\rangle$ of hereditarily finite sets. More generally, by carrying out the Ackermann encoding in any model of PA, one thereby defines a model of $\text{ZF}^{\neg\infty}$, whose natural numbers are isomorphic to the original model of PA, and these translations make a bi-interpretation.

We are now ready to prove that this bi-interpretation situation does not occur with different set theories extending ZF.

Theorem. Distinct set theories extending ZF are never bi-interpretable. Indeed, there is not a single model-theoretic instance of bi-interpretation occurring with models of different set theories extending ZF.

Proof. I mean “distinct” here in the sense that the two theories are not logically equivalent; they do not have all the same theorems. Suppose that we have a bi-interpretation instance of the theories $S$ and $T$ extending ZF. That is, suppose we have a model $\langle M,\in\rangle\models T$ of the one theory, and inside $M$, we can define an interpreted model of the other theory $\langle N,\in^N\rangle\models S$, so the domain of $N$ is a definable class in $M$ and the membership relation $\in^N$ is a definable relation on that class in $M$; and furthermore, inside $\langle N,\in^N\rangle$, we have a definable structure $\langle\bar M,\in^{\bar M}\rangle$ which is a model of $T$ again and isomorphic to $\langle M,\in^M\rangle$ by an isomorphism that is definable in $\langle M,\in^M\rangle$. So $M$ can define the map $a\mapsto \bar a$ that forms an isomorphism of $\langle M,\in^M\rangle$ with $\langle \bar M,\in^{\bar M}\rangle$. Our argument will work whether we allow parameters in any of these definitions or not.

I claim that $N$ must think the ordinals of $\bar M$ are well-founded, for otherwise it would have some bounded cut $A$ in the ordinals of $\bar M$ with no least upper bound, and this set $A$ when pulled back pointwise by the isomorphism of $M$ with $\bar M$ would mean that $M$ has a cut in its own ordinals with no least upper bound; but this cannot happen in ZF.

If the ordinals of $N$ and $\bar M$ are isomorphic in $N$, then all three models have isomorphic ordinals in $M$, and in this case, $\langle M,\in^M\rangle$ thinks that $\langle N,\in^N\rangle$ is a well-founded extensional relation of rank $\text{Ord}$. Such a relation must be set-like (since there can be no least instance where the predecessors form a proper class), and so $M$ can perform the Mostowski collapse of $\in^N$, thereby realizing $N$ as a transitive class $N\subseteq M$ with $\in^N=\in^M\upharpoonright N$. Similarly, by collapsing we may assume $\bar M\subseteq N$ and $\in^{\bar M}=\in^M\upharpoonright\bar M$. So the situation consists of inner models $\bar M\subseteq N\subseteq M$ and $\langle \bar M,\in^M\rangle$ is isomorphic to $\langle M,\in^M\rangle$ in $M$. This is impossible unless all three models are identical, since a simple $\in^M$-induction shows that $\pi(y)=y$ for all $y$, because if this is true for the elements of $y$, then $\pi(y)=\{\pi(x)\mid x\in y\}=\{x\mid x\in y\}=y$. So $\bar M=N=M$ and so $N$ and $M$ satisfy the same theory, contrary to assumption.

If the ordinals of $\bar M$ are isomorphic to a proper initial segment of the ordinals of $N$, then a similar Mostowski collapse argument would show that $\langle\bar M,\in^{\bar M}\rangle$ is isomorphic in $N$ to a transitive set in $N$. Since this structure in $N$ would have a truth predicate in $N$, we would be able to pull this back via the isomorphism to define (from parameters) a truth predicate for $M$ in $M$, contrary to Tarski’s theorem on the non-definability of truth.

The remaining case occurs when the ordinals of $N$ are isomorphic in $N$ to an initial segment of the ordinals of $\bar M$. But this would mean that from the perspective of $M$, the model $\langle N,\in^N\rangle$ has some ordinal rank height, which would mean by the Mostowski collapse argument that $M$ thinks $\langle N,\in^N\rangle$ is isomorphic to a transitive set. But this contradicts the fact that $M$ has an injection of $M$ into $N$. $\Box$

It follows that although ZF and ZFC are equiconsistent, they are not bi-interpretable. Similarly, ZFC and ZFC+CH and ZFC+$\neg$CH are equiconsistent, but no pair of them is bi-interpretable. And again with all the various equiconsistency results concerning large cardinals.

A similar argument works with PA to show that different extensions of PA are never bi-interpretable.

Infinite time computable model theory

  • J. D. Hamkins, R. Miller, D. Seabold, and S. Warner, “Infinite time computable model theory,” in New Computational Paradigms: Changing Conceptions of What is Computable, S. B. Cooper, B. Löwe, and A. Sorbi, Eds., Springer, 2008, p. 521–557.
    [Bibtex]
    @INCOLLECTION{HamkinsMillerSeaboldWarner2007:InfiniteTimeComputableModelTheory,
    AUTHOR = {Hamkins, Joel David and Miller, Russell and Seabold, Daniel
    and Warner, Steve},
    TITLE = {Infinite time computable model theory},
    BOOKTITLE = "{New Computational Paradigms: Changing Conceptions of What is Computable}",
    PAGES = {521--557},
    PUBLISHER = {Springer},
    ADDRESS = {},
    YEAR = {2008},
    MRCLASS = {03C57 (03D10)},
    MRNUMBER = {2762096},
    editor = {S. B. Cooper and Benedikt Löwe and Andrea Sorbi},
    isbn = "0-387-36033-6",
    file = F,
    url = {http://wp.me/p5M0LV-3t},
    }

We introduce infinite time computable model theory, the computable model theory arising with infinite time Turing machines,  which provide infinitary notions of computability for structures built on the reals  $\mathbb{R}$. Much of the finite time theory generalizes to the infinite time context, but several fundamental questions, including the infinite time  computable analogue of the Completeness Theorem, turn out to be  independent of ZFC.