# The pirate treasure division problem

In my logic course this semester, as a part of the section on the logic of games, we considered the pirate treasure division problem.

Imagine a pirate ship with a crew of fearsome, perfectly logical pirates and a treasure of 100 gold coins to be divided amongst them. How shall they do it? They have long agreed upon the pirate treasure division procedure: The pirates are linearly ordered by rank, with the Captain, the first Lieutenant, the second Lieutenant and so on down the line; but let us simply refer to them as Pirate 1, Pirate 2, Pirate 3 and so on. For the division procedure, all the pirates assemble on deck, and the lowest-ranking pirate mounts the plank. Facing the other pirates, she proposes a particular division of the gold — so-and-so many gold pieces to the captain, so-and-so many pieces to Pirate 2 and so on.  The pirates then vote on the plan, including the pirate on the plank, and if a strict majority of the pirates approve of the plan, then it is adopted and that is how the gold is divided. But if the pirate’s plan is not approved by a pirate majority, then regretfully she must walk the plank into the sea (and her death) and the procedure continues with the next-lowest ranking pirate, who of course is now the lowest-ranking pirate.

Suppose that you are Pirate 10: what plan do you propose?  Would you think it is a good idea to propose that you get to keep 94 gold pieces for yourself, with the six remaining pieces given to a few of the other pirates? In fact, you can propose just such a thing, and if you do it correctly, your plan will pass!

Before explaining why, let me tell you a little more about the pirates. I mentioned that the pirates are perfectly logical, and not only that, they have the common knowledge that they are all perfectly logical. In particular, in their reasoning they can rely on the fact that the other pirates are logical, and that the other pirates know that they are all logical and that they know that, and so on.

Furthermore, it is common knowledge amongst the pirates that they all share the same pirate value system, with the following strictly ordered list of priorities:

Pirate value system:

1. Stay alive.
2. Get gold.
3. Cause the death of other pirates.
4. Arrange that others’ gold goes to the most senior pirates.

That is, at all costs, each pirate would prefer to avoid death, and if alive, to get as much gold as possible, but having achieved that, would prefer that as many other pirates die as possible (but not so much as to give up even one gold coin for additional deaths), and if all other things are equal, would prefer that whatever gold she does not get herself go as much as possible to the most senior pirates, for the pirates are, in their hearts, conservative people.

So, what plan should you propose as Pirate 10? Well, naturally, the pirates will consider Pirate 10’s plan in light of the alternative, which will be the plan proposed by Pirate 9, which will be compared with the plan of Pirate 8 and so on. Thus, it seems we should propagate our analysis from the bottom, working backwards from what happens with a very small number of pirates.

One pirate. If there is only one pirate, the captain, then she mounts the plank, and clearly she should propose “Pirate 1 gets all the gold”, and she should vote in favor of this plan, and so Pirate 1 gets all the gold, as anyone would have expected.

Two pirates. If there are exactly two pirates, then Pirate 2 will mount the plank, and what will she propose? She needs a majority of the two pirates, which means she must get the captain to vote for her plan. But no matter what plan she proposes, even if it is that all the gold should go to the captain, the captain will vote against the plan, since if Pirate 2 is killed, then the captain will get all the gold anyway, and because of pirate value 3, she would prefer that Pirate 2 is killed off.  So Pirate 2’s plan will not be approved by the captain, and so unfortunately, Pirate 2 will walk the plank.

Three pirates. If there are three pirates, then what will Pirate 3 propose? Well, she needs only two votes, and one of them will be her own. So she must convince either Pirate 1 or Pirate 2 to vote for her plan. But actually, Pirate 2 will have a strong incentive to vote for the plan regardless, since otherwise Pirate 2 will be in the situation of the two-pirate case, which ended with Pirate 2’s death. So Pirate 3 can count on Pirate 2’s vote regardless, and so Pirate 3 will propose:  Pirate 3 gets all the gold! This will be approved by both Pirate 2 and Pirate 3, a majority, and so with three pirates, Pirate 3 gets all the gold.

Four pirates. Pirate 4 needs to have three votes, so she needs to get two of the others to vote for her plan. She notices that if she is to die, then Pirates 1 and 2 will get no gold, and so she realizes that if she offers them each one gold coin, they will prefer that, because of the pirate value system. So Pirate 4 will propose to give one gold coin each to Pirates 1 and 2, and 98 gold coins to herself. This plan will pass with the votes of 1, 2 and 4.

Five pirates. Pirate 5 needs three votes, including her own. She can effectively buy the vote of Pirate 3 with one gold coin, since Pirate 3 will otherwise get nothing in the case of four pirates. And she needs one additional vote, that of Pirate 1 or 2, which she can get by offering two gold coins. Because of pirate value 4, she would prefer that the coins go to the highest-ranking pirate, so she offers the plan:  two coins to Pirate 1, nothing to Pirate 2, one coin to Pirate 3, nothing to Pirate 4 and 97 coins to herself.  This plan will pass with the votes of 1, 3 and 5.

Six pirates. Pirate 6 needs four votes, and she can buy the votes of Pirates 2 and 4 with one gold coin each, and then two gold coins to Pirate 3, which is cheaper than the alternatives. So she proposes:  one coin each to 2 and 4, two coins to 3 and 96 coins for herself, and this passes with the votes of 2, 3, 4 and 6.

Seven pirates. Pirate 7 needs four votes, and she can buy the votes of Pirates 1 and 5 with only one coin each, since they get nothing in the six-pirate case. By offering two coins to Pirate 2, she can also get another vote (and she prefers to give the extra gold to Pirate 2 than to other pirates in light of the pirate values). So she proposes: one coin each to Pirates 1 and 5, two coins to Pirate 2 and 96 coins for herself, and this passes with the votes of 1, 2, 5 and 7.

Eight pirates. Pirate 8 needs five votes, and she can buy the votes of Pirates 3, 4 and 6 with one coin each, and ensure another vote by giving two coins to Pirate 1, keeping the other 95 coins for herself. With her own vote, this plan will pass.

Nine pirates. Pirate 9 needs five votes, and she can buy the votes of Pirates 2, 5 and 7 with one coin each; with two coins to Pirate 3 and her own vote, the plan will pass, leaving 95 coins for herself.

Ten pirates. In light of the division offered by Pirate 9, we can now see that Pirate 10 can ensure six votes by proposing to give one coin each to Pirates 1, 4, 6 and 8, two coins to Pirate 2, and the remaining 94 coins for herself. This plan will pass with those pirates voting in favor (and herself), because they each get more gold this way than they would under the plan of Pirate 9.

We can summarize the various proposals in a table, where the $n^{\rm th}$ row corresponds to the proposal of Pirate $n$.

| Proposer | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
|---|---|---|---|---|---|---|---|---|---|---|
| One pirate | 100 |  |  |  |  |  |  |  |  |  |
| Two pirates | - | dies |  |  |  |  |  |  |  |  |
| Three pirates | 0 | 0 | 100 |  |  |  |  |  |  |  |
| Four pirates | 1 | 1 | 0 | 98 |  |  |  |  |  |  |
| Five pirates | 2 | 0 | 1 | 0 | 97 |  |  |  |  |  |
| Six pirates | 0 | 1 | 2 | 1 | 0 | 96 |  |  |  |  |
| Seven pirates | 1 | 2 | 0 | 0 | 1 | 0 | 96 |  |  |  |
| Eight pirates | 2 | 0 | 1 | 1 | 0 | 1 | 0 | 95 |  |  |
| Nine pirates | 0 | 1 | 2 | 0 | 1 | 0 | 1 | 0 | 95 |  |
| Ten pirates | 1 | 2 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 94 |

(With two pirates, no proposal can pass, and Pirate 2 walks the plank; Pirate 1 then keeps all the gold.)
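The whole backward-induction analysis can be mechanized. Here is a short Python sketch (my own illustration; the function name and data representation are not from the post) encoding the pirate value system: doomed pirates vote yes for free, a surviving pirate's vote costs one coin more than she would otherwise get, and ties among equally cheap votes are broken in favor of seniority.

```python
def pirate_division(n, coins=100):
    """Backward induction for the strict-majority pirate division puzzle.
    Returns (alloc, dead): alloc maps each pirate to her gold under the
    eventual passing proposal, and dead is the set of doomed proposers."""
    alloc, dead = {1: coins}, set()            # the one-pirate case
    for m in range(2, n + 1):
        needed = m // 2                        # other votes for a strict majority
        # Doomed pirates vote yes for free (value 1).  A surviving pirate's
        # vote costs one coin more than she gets otherwise (values 2 and 3),
        # and ties among equally cheap votes go to seniority (value 4).
        sellers = sorted(alloc, key=lambda i: (alloc[i] + 1, i))
        to_buy = sellers[:max(0, needed - len(dead))]
        cost = sum(alloc[i] + 1 for i in to_buy)
        if cost <= coins:                      # the proposal passes
            new = {i: 0 for i in range(1, m)}
            for i in to_buy:
                new[i] = alloc[i] + 1
            new[m] = coins - cost              # the proposer keeps the rest
            alloc, dead = new, set()
        else:                                  # no affordable majority: she dies
            dead = dead | {m}
    return alloc, dead

print(pirate_division(10)[0])
# → {1: 1, 2: 2, 3: 0, 4: 1, 5: 0, 6: 1, 7: 0, 8: 1, 9: 0, 10: 94}
```

Running `pirate_division(m)` for `m` up to 10 reproduces each row of the table above, including the death of Pirate 2.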

There are a few things to notice, which we can use to deduce how the pattern will continue. Notice that in each row beyond the third, the number of pirates getting no coins is almost half (the largest integer strictly less than half), exactly one pirate gets two coins, and the remainder get one coin each, except for the proposer herself, who gets all the rest. This pattern is sustainable for as long as there is enough gold to implement it: each proposer can buy the votes of the pirates getting $0$ under the alternative plan with one fewer pirate, and these number at most one less than half of the previous total; she can then buy one more vote by giving two coins to one of the pirates who got only one coin under the alternative plan; and with her own vote this makes half plus one, a majority.

We can furthermore observe that by the pirate value system, the two coins will always go to Pirate 1, 2 or 3, since one of these will always be the top-ranked pirate having one coin in the previous proposal; these three cycle through the pattern of zero coins, one coin, two coins as the proposals proceed. At least until the gold becomes limited, the other pirates from Pirate 4 onward alternate between zero coins and one coin with each subsequent proposal, and Pirate $n-1$ always gets zero from Pirate $n$.

For this reason, we can see that the pattern continues upward until at least Pirate 199, whose proposal will follow the pattern:

199 Pirates: 1 2 0 0 1 0 1 0 1 0 1 0 1 $\dots$ 1 0 1 0 0

It is with Pirate 199, specifically, that for the first time it takes all one hundred coins to buy the other votes, since she must give ninety-eight pirates one coin each, and two coins to Pirate 2 in order to have one hundred votes altogether, including her own, leaving no coins left over for herself.

For this reason, Pirate 200 will have a successful proposal, since she no longer needs to spend two coins for a vote, as the proposal of Pirate 199 has one hundred pirates getting zero. So Pirate 200 can propose one coin to everyone who would get zero from Pirate 199, which together with her own vote makes 101 votes, a strict majority of 200.

200 pirates: 0 0 1 1 0 1 0 1 0 1 0 $\dots$ 0 1 0 1 1 0
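As a quick sanity check (my own annotation, not part of the original post), the vote and coin counts claimed in the last two paragraphs can be verified with a few lines of arithmetic:

```python
# Pirate 199 needs a strict majority of 199, i.e. 100 votes.
votes = 1 + 98 + 1              # her own vote, 98 one-coin votes, one two-coin vote
assert votes == 199 // 2 + 1 == 100
assert 98 * 1 + 1 * 2 == 100    # she spends every coin, keeping none for herself

# Pirate 200 needs a strict majority of 200, i.e. 101 votes.  In Pirate 199's
# proposal, 199 - 98 - 1 = 100 pirates (Pirate 199 among them) get zero.
zeros = 199 - 98 - 1
assert zeros == 100
assert 1 + zeros == 200 // 2 + 1 == 101   # her own vote plus 100 bought votes
assert zeros * 1 == 100                   # again every coin is spent
```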

The reader is encouraged to investigate further to see how the pattern continues. It is a fun problem to work out! What emerges is the phenomenon by which longer and longer sequences of pirates in a row find themselves unable to make a winning proposal, and then suddenly a pirate is able to survive by counting on their votes.

It is very interesting also to work out what happens when there is a very small number of coins. For example, if there is only one gold coin, then already Pirate 4 is unable to make a passing proposal, since she can buy only one other vote, and with her own this will make only two votes, falling short of a majority. With only one coin, Pirate 5 will survive by buying a vote from Pirate 1 and counting on the vote of Pirate 4 and her own vote, for a majority.

Even the case of zero coins is interesting to think about! In this case, there is no gold to distribute, and so the voting is really just about whether the pirate should walk the plank or not. If only one pirate, she will live. Pirate 2 will die, since Pirate 1 will vote against. But for that reason, Pirate 2 will vote in favor of Pirate 3, who will live. The pattern that emerges is:

lives, dies, lives, dies, dies, dies, lives, dies, dies, dies, dies, dies, dies, dies, lives, ….

After each successful proposal, where the pirate lives, the subsequent proposers must die in a long run before the next proposal can count on enough votes. So after each “lives” in the pattern, the run of “dies” roughly doubles in length before there are enough doomed voters to support the next pirate who lives.
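The doubling pattern is easy to compute, since with no gold the only yes votes come from pirates who are doomed in the alternative. A short Python sketch (mine, as an illustration) tracks the run of doomed pirates just below each proposer:

```python
def zero_coin_fates(n):
    """With no gold at all, pirate m survives exactly when her own vote plus
    the votes of the doomed pirates just below her form a strict majority."""
    fates, doomed = [], 0           # doomed = consecutive deaths just below m
    for m in range(1, n + 1):
        if doomed >= m // 2:        # free votes + her own: a strict majority
            fates.append("lives")
            doomed = 0
        else:
            fates.append("dies")
            doomed += 1
    return fates

survivors = [m for m, f in enumerate(zero_coin_fates(40), start=1) if f == "lives"]
print(survivors)   # → [1, 3, 7, 15, 31]
```

The survivors fall at the positions $2^k-1$, in agreement with the lives/dies pattern displayed above.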

See also the Pirate Game entry on Wikipedia, which is a slightly different formulation of the puzzle, since tie votes are effectively counted as success in that variation, and for this reason the outcomes are different there. I prefer the strict-majority variation, since I find it interesting that one must sometimes use two gold coins to gain the majority, and also because the death of Pirate 2 arrives right away in an interesting way, rather than having to wait for 200 or more pirates as in the other variation.

Another (inessential) difference in presentation is that in the other version of the puzzle, they have the captain on the plank first, and then always the highest-ranking pirate making the proposal, rather than the lowest-ranking pirate. This corresponds simply to inverting the ranking, and so it doesn’t change the results.

The puzzle appears to have been around for some time, but I am unsure of the exact provenance. Ian Stewart wrote a popular 1999 article for Scientific American analyzing the patterns that arise when the number of pirates is large in comparison with the number of gold pieces.

# How does a slinky fall?

Have you ever observed carefully how a slinky falls? Suspend a slinky from one end, letting it hang freely in the air under its own weight, and then, let go! The slinky begins to fall. The top of the slinky, of course, begins to fall the moment you let go of it. But what happens at the bottom of the slinky? Does it also start to fall at the same moment you release the top? Or perhaps it moves upward, as the slinky contracts as it falls? Or does the bottom of the slinky simply hang motionless in the air for a time?

The surprising fact is that indeed the bottom of the slinky doesn’t move at all when you release the top of the slinky! It hangs momentarily motionless in the air in exactly the same coiled configuration that it had before the drop. This is the surprising slinky drop effect.

My son (age 13, eighth grade) took up the topic for his science project this year at school.  He wanted to establish the basic phenomenon of the slinky drop effect and to investigate some of the subtler aspects of it.  For a variety of different slinky types, he filmed the slinky drops against a graded background with a high-speed camera, and then replayed them in slow motion to watch carefully and take down the data.  Here are a few sample videos. He made about a dozen drops altogether.  For the actual data collection, the close-up videos were more useful. Note the ring markers A, B, C, and so on, in some of the videos.

See more videos here.

For each slinky drop video, he went through the frames and recorded the vertical location of various marked rings (you can see the labels A, B, C and so on in some of the videos above) into a spreadsheet. From this data he then produced graphs such as the following for each slinky drop:

In each case, you can see clearly in the graph the moment when the top of the slinky is released, since this is the point at which the top line begins to descend. The thing to notice next — the main slinky drop effect — is that the lower parts of the slinky do not move at the same time. Rather, the lower lines remain horizontal for some time after the drop point. Basically, they remain horizontal until the bulk of the slinky nearly descends upon them. So the experiments clearly establish the main slinky drop phenomenon: the bottom of the slinky remains motionless for a time hanging in the air unchanged after the top is released.
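The effect can also be reproduced in the simplest model of a slinky: a chain of point masses joined by ideal springs, released from its hanging equilibrium. The following Python sketch (a toy model of my own, not part of the project, with zero-rest-length springs and semi-implicit Euler integration) shows the bottom mass staying essentially motionless while the top falls:

```python
def slinky_drop(n=50, mass=0.01, k=100.0, g=9.8, dt=1e-4, t_end=0.05):
    """Simulate n point masses joined by ideal zero-rest-length springs,
    released from the hanging equilibrium (a crude slinky model)."""
    # Hanging equilibrium: spring i carries the weight of the masses below it.
    y = [0.0]                                  # downward is positive
    for i in range(n - 1):
        y.append(y[-1] + (n - 1 - i) * mass * g / k)
    y0, v = y[:], [0.0] * n
    for _ in range(int(t_end / dt)):
        a = []
        for i in range(n):
            ai = g
            if i > 0:
                ai -= (k / mass) * (y[i] - y[i - 1])   # spring above pulls up
            if i < n - 1:
                ai += (k / mass) * (y[i + 1] - y[i])   # spring below pulls down
            a.append(ai)
        for i in range(n):                     # semi-implicit Euler step
            v[i] += a[i] * dt
            y[i] += v[i] * dt
    return y[0] - y0[0], y[-1] - y0[-1]        # (top drop, bottom drop)

top, bottom = slinky_drop()
print(top, bottom)   # the top has fallen noticeably; the bottom has barely moved
```

At release, only the top mass feels an unbalanced force; every lower mass remains in equilibrium until the disturbance propagates down to it, which is exactly the slinky drop effect.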

In addition to this effect, however, my son was focused on investigating a much more subtle aspect of the slinky drop phenomenon. Namely, when exactly does the bottom of the slinky start to move?  Some have said that the bottom moves only when the top catches up to it; but my son hypothesized, based on observations, as well as discussions with his father and uncles, that the bottom should start to move slightly before the bulk of the slinky meets it. Namely, he thought that when you release the top of the slinky, a wave of motion travels through the slinky, and this wave travels slightly faster than the top of the slinky falls. The bottom moves, he hypothesized, when the wave front first gets to the bottom.

His data contains some confirming evidence for this subtler hypothesis, but for some of the drops, the experiment was inconclusive on this smaller effect. Overall, he had a great time undertaking the science project.

# Jacob Davis, PhD 2016, Carnegie Mellon University

Jacob Davis successfully defended his dissertation, “Universal Graphs at $\aleph_{\omega_1+1}$ and Set-theoretic Geology,” at Carnegie Mellon University on April 29, 2016, under the supervision of James Cummings. I was on the dissertation committee (participating via Google Hangouts), along with Ernest Schimmerling and Clinton Conley.

The thesis consisted of two main parts. In the first half, starting from a model of ZFC with a supercompact cardinal, Jacob constructed a model in which $2^{\aleph_{\omega_1}} = 2^{\aleph_{\omega_1+1}} = \aleph_{\omega_1+3}$ and in which there is a jointly universal family of size $\aleph_{\omega_1+2}$ of graphs on $\aleph_{\omega_1+1}$.  The same technique works with any uncountable cardinal in place of $\omega_1$.  In the second half, Jacob proved a variety of results in the area of set-theoretic geology, including several instances of the downward directed grounds hypothesis, including an analysis of the chain condition of the resulting ground models.

# An equivalent formulation of the GCH

The continuum hypothesis CH is the assertion that the size of the power set of a countably infinite set, a set of size $\aleph_0$, is the next larger cardinal $\aleph_1$, or in other words, that $2^{\aleph_0}=\aleph_1$. The generalized continuum hypothesis GCH makes this same assertion about all infinite cardinals, namely, that the power set of any infinite cardinal $\kappa$ has size the successor cardinal $\kappa^+$, or in other words, $2^\kappa=\kappa^+$.

Yesterday I received an email from Geoffrey Caveney, who proposed to me the following axiom, which I have given a name.   First, for any set $F$ of cardinals, define the $F$-restricted power set operation $P_F(Y)=\{X\subseteq Y\mid |X|\in F\}$ to consist of the subsets of $Y$ having a cardinality allowed by $F$.  The only cardinals of $F$ that matter are those that are at most the cardinality of $Y$.

The Alternative GCH is the assertion that for every cardinal number $\kappa$, there is a set $F$ of cardinals such that the $F$-restricted power set $P_F(\kappa)$ has size $\kappa^+$.

Caveney was excited about his axiom for three reasons. First, a big part of his motivation for considering the axiom was the observation that the equation $2^\kappa=\kappa^+$ is simply not correct for finite cardinals $\kappa$ (other than $0$ and $1$) — and this is why the GCH makes the assertion only for infinite cardinals $\kappa$ — whereas the alternative GCH axiom makes a uniform statement for all cardinals, including the finite cardinals, and it gets the right answer for the finite cardinals. Specifically, for any natural number $n$, we can let $F=\{0,1\}$, and then note that $n$ has exactly $n+1$ many subsets of size in $F$. Second, Caveney had also observed that the GCH implies his axiom, since as we just mentioned, it is true for the finite cardinals and for infinite $\kappa$ we can take $F=\{\kappa\}$, using the fact that every infinite cardinal $\kappa$ has $2^\kappa$ many subsets of size $\kappa$ (we are working in ZFC). Third, Caveney had noticed that his axiom implies the continuum hypothesis, since in the case that $\kappa=\aleph_0$, there would be a family $F$ for which $P_F(\aleph_0)$ has size $\aleph_1$. But since there are only countably many finite subsets of $\aleph_0$, it follows that $F$ must include $\aleph_0$ itself, and so this would mean that $\aleph_0$ has only $\aleph_1$ many infinite subsets, and this implies CH.
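Caveney's observation about the finite cardinals can be checked directly. The following Python snippet (my own illustration; the function name is not from the correspondence) counts the $F$-restricted subsets of a finite set:

```python
from itertools import combinations

def restricted_power_set(Y, F):
    """All subsets of Y whose size lies in F (the F-restricted power set)."""
    return [set(c) for size in F if size <= len(Y)
                   for c in combinations(sorted(Y), size)]

# With F = {0, 1}, a set of size n has exactly n + 1 subsets whose size
# lies in F: the empty set and the n singletons, witnessing 2^n "=" n + 1
# in the restricted sense for every finite n.
for n in range(8):
    assert len(restricted_power_set(range(n), {0, 1})) == n + 1
```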

To my way of thinking, the natural question to consider was whether Caveney’s axiom was actually weaker than GCH or not. At first I noticed that the axiom implies $2^{\aleph_1}=\aleph_2$ and similarly $2^{\aleph_n}=\aleph_{n+1}$, getting us up to $\aleph_\omega$. Then, after a bit I noticed that we can push the argument through all the way.

Theorem. The alternative GCH is equivalent to the GCH.

Proof. We’ve already argued for the converse implication, so it remains only to show that the alternative GCH implies the GCH. Assume that the alternative GCH holds.

We prove the GCH by transfinite induction. For the anchor case, we’ve shown already above that the GCH holds at $\aleph_0$, that is, that CH holds.

For the successor case, assume that the GCH holds at some $\delta$, so that $2^\delta=\delta^+$, and consider the case $\kappa=\delta^+$. By the alternative GCH, there is a family $F$ of cardinals such that $|P_F(\kappa)|=\kappa^+$. If every cardinal in $F$ were less than $\kappa$, then $P_F(\kappa)$ would have size at most $\kappa^{<\kappa}=(\delta^+)^\delta=2^\delta=\delta^+=\kappa$, which is too small. So $\kappa$ itself must be in $F$, and from this it follows that $\kappa$ has at most $\kappa^+$ many subsets of size $\kappa$, which implies $2^\kappa=\kappa^+$. So the GCH holds at $\kappa$, and we’ve handled the successor case.

For the limit case, suppose that $\kappa$ is a limit cardinal and the GCH holds below $\kappa$. So $\kappa$ is a strong limit cardinal. By the alternative GCH, there is a family $F$ of cardinals for which $|P_F(\kappa)|=\kappa^+$. It cannot be that all cardinals in $F$ are less than the cofinality of $\kappa$, since in that case all the subsets of $\kappa$ in $P_F(\kappa)$ would be bounded in $\kappa$, and so $P_F(\kappa)$ would have size at most $\kappa$, since $\kappa$ is a strong limit. So there must be a cardinal $\mu$ in $F$ with $\newcommand\cof{\text{cof}}\cof(\kappa)\leq\mu\leq\kappa$. But in this case, it follows that $\kappa^\mu=\kappa^+$, and this implies $\kappa^{\cof(\kappa)}=\kappa^+$, since by König’s theorem it is always at least $\kappa^+$, and it cannot be bigger if $\kappa^\mu=\kappa^+$. Finally, since $\kappa$ is a strong limit cardinal, it follows easily that $2^\kappa=\kappa^{\cof(\kappa)}$, since every subset of $\kappa$ is determined by its initial segments, and hence by a $\cof(\kappa)$-sequence of bounded subsets of $\kappa$, of which there are only $\kappa$ many. So we have established that $2^\kappa=\kappa^+$ in the limit case, completing the induction.
So we get all instances of the GCH.
QED

# Same structure, different truths, Stanford University CSLI, May 2016

This will be a talk for the Workshop on Logic, Rationality, and Intelligent Interaction at the CSLI, Stanford University, May 27-28, 2016.

Abstract. To what extent does a structure determine its theory of truth? I shall discuss several surprising mathematical results illustrating senses in which it does not, for the satisfaction relation of first-order logic is less absolute than one might have expected. Two models of set theory, for example, can have exactly the same natural numbers and the same arithmetic structure $\langle\mathbb{N},+,\cdot,0,1,<\rangle$, yet disagree on what is true in this structure; they have the same arithmetic, but different theories of arithmetic truth; two models of set theory can have the same natural numbers and a computable linear order in common, yet disagree on whether it is a well-order; two models of set theory can have the same natural numbers and the same reals, yet disagree on projective truth; two models of set theory can have a rank initial segment of the universe $\langle V_\delta,{\in}\rangle$ in common, yet disagree about whether it is a model of ZFC. These theorems and others can be proved with elementary classical model-theoretic methods, which I shall explain. On the basis of these observations, Ruizhi Yang (Fudan University, Shanghai) and I argue that the definiteness of the theory of truth for a structure, even in the case of arithmetic, cannot be seen as arising solely from the definiteness of the structure itself in which that truth resides, but rather is a higher-order ontological commitment.

Slides | Main article: Satisfaction is not absolute | CSLI 2016 | Abstract at CSLI

# On the strength of second-order set theories beyond ZFC, PSC-CUNY Research Award grant, 2016

J. D. Hamkins, On the strength of second-order set theories beyond ZFC, PSC-CUNY Research Award grant #69573-00 47, funded for 2016-2017.

Abstract. Professor Hamkins proposes to undertake research in the area of logic and foundations known as set theory, focused on the comparative strengths of several of the second-order set theories upon which several prominent recent research efforts have been based. These theories span the range from ZFC through GBC, plus various comprehension, transfinite recursion or class determinacy principles, up to KM and through the hierarchy to KM+ and beyond. Hamkins’s recent result with Gitman characterizing the precise strength of clopen determinacy for proper class games is a good start for the project, but many open questions remain, and Hamkins outlines various strategies that might solve them.

# The finite axiom of symmetry

At the conclusion of my talk today for the CUNY Math Graduate Student Colloquium, Freiling’s axiom of symmetry, or throwing darts at the real line, I had assigned an exercise for the audience, and so I’d like to discuss the solution here.

The axiom of symmetry asserts that if you assign to each real number $x$ a countable set $A_x\subset\mathbb{R}$, then there should be two reals $x,y$ for which $x\notin A_y$ and $y\notin A_x$.

Informally, if you have attached to each element $x$ of a large set $\mathbb{R}$ a certain comparatively small subset $A_x$, then there should be two independent points $x,y$, meaning that neither is in the set attached to the other.

The challenge exercise I had posed is to prove a finite version of this:

The finite axiom of symmetry. For each finite number $k$ there is a sufficiently large finite number $n$ such that for any set $X$ of size $n$ and any assignment $x\mapsto A_x$ of elements $x\in X$ to subsets $A_x\subset X$ of size at most $k$, there are elements $x,y\in X$ such that $x\notin A_y$ and $y\notin A_x$.

Proof. Suppose we are given a finite number $k$. Let $n$ be any number larger than $k^2+k$. Consider any set $X$ of size $n$ and any assignment $x\mapsto A_x$ of elements $x\in X$ to subsets $A_x\subset X$ of size at most $k$. Let $x_0,x_1,x_2,\dots,x_k$ be any $k+1$ many elements of $X$. The union $\bigcup_{i\leq k} A_{x_i}$ has size at most $(k+1)k=k^2+k$, and so by the choice of $n$ there is some element $y\in X$ not in any $A_{x_i}$. Since $A_y$ has size at most $k$, there must be some $x_i$ not in $A_y$. So $x_i\notin A_y$ and $y\notin A_{x_i}$, and we have fulfilled the desired conclusion. QED
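The proof is constructive, and so it can be turned into a short program. Here is a Python sketch (the names are my own) that, given an assignment $x\mapsto A_x$ with $|A_x|\leq k$ over a set of size larger than $k^2+k$, finds an independent pair exactly as in the argument:

```python
import random

def find_independent_pair(A):
    """Given a family A[x] of subsets of X = A.keys() with |A[x]| <= k and
    |X| > k^2 + k, find x, y with x not in A[y] and y not in A[x]."""
    X = list(A)
    k = max(len(s) for s in A.values())
    xs = X[:k + 1]                               # any k+1 elements of X
    covered = set().union(*(A[x] for x in xs))   # at most (k+1)k = k^2+k elements
    y = next(z for z in X if z not in covered)   # exists since |X| > k^2 + k
    x = next(x for x in xs if x not in A[y])     # |A[y]| <= k < k+1 candidates
    return x, y

k, n = 3, 13                                     # n > k^2 + k = 12
rng = random.Random(0)
A = {x: set(rng.sample(range(n), k)) for x in range(n)}
x, y = find_independent_pair(A)
assert x not in A[y] and y not in A[x]
```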

Question. What is the optimal size of $n$ as a function of $k$?

It seems unlikely to me that my argument gives the optimal bound, since it finds at least one element of the pair inside any subset of $X$ of size $k+1$, which is a stronger property than requested. So it seems likely to me that the optimal bound will be smaller.

# Every function can be computable!

I’d like to share a simple proof I’ve discovered recently of a surprising fact:  there is a universal algorithm, capable of computing any given function!

Wait, what? What on earth do I mean? Can’t we prove that some functions are not computable?  Yes, of course.

What I mean is that there is a universal algorithm, a Turing machine program capable of computing any desired function, if only one should run the program in the right universe. There is a Turing machine program $p$ with the property that for any function $f:\newcommand\N{\mathbb{N}}\N\to\N$ on the natural numbers, including non-computable functions, there is a model of arithmetic or set theory inside of which the function computed by $p$ agrees exactly with $f$ on all standard finite input. You have to run the program in a different universe in order that it will compute your desired function $f$.

Theorem. There is a Turing machine program $p$, carrying out the specific algorithm described in the proof, such that for any function $f:\N\to\N$, there is a model of arithmetic $M\models\PA$, or indeed a model of set theory $M\models\ZFC$ or more (if consistent), such that the function computed by program $p$ inside $M$ agrees exactly with $f$ on all standard finite input.

The proof is elementary, relying essentially only on the ideas of the classical proof of the Gödel-Rosser theorem. To briefly review, for any computably axiomatized theory $T$ extending $\PA$, there is a corresponding sentence $\rho$, called the Rosser sentence, which asserts, “for any proof of $\rho$ in $T$, there is a smaller proof of $\neg\rho$.” That is, by smaller, I mean that the Gödel-code of the proof is smaller. One constructs the sentence $\rho$ by a simple application of the Gödel fixed-point lemma, just as one constructs the usual Gödel sentence that asserts its own non-provability. The basic classical facts concerning the Rosser sentence include the following:

• If $T$ is consistent, then so are both $T+\rho$ and $T+\neg\rho$.
• $\PA+\Con(T)$ proves $\rho$.
• The theories $T$, $T+\rho$ and $T+\neg\rho$ are equiconsistent.
• If $T$ is consistent, then $T+\rho$ does not prove $\Con(T)$.

The first statement is the essential assertion of the Gödel-Rosser theorem, and it is easy to prove: if $T$ is consistent and $T\proves\rho$, then the proof would be finite in the meta-theory, and so since $T$ would have to prove that there is a smaller proof of $\neg\rho$, that proof would also be finite in the meta-theory and hence an actual proof, contradicting the consistency of $T$. Similarly, if $T\proves\neg\rho$, then the proof would be finite in the meta-theory, and so $T$ would be able to verify that $\rho$ is true, and so $T\proves\rho$, again contradicting consistency. By internalizing the previous arguments to PA, we see that $\PA+\Con(T)$ will prove that neither $\rho$ nor $\neg\rho$ is provable in $T$, making $\rho$ vacuously true in this case and also establishing $\Con(T+\rho)$ and $\Con(T+\neg\rho)$, for the second and third statements. In particular, $T+\Con(T)\proves\Con(T+\rho)$, which implies that $T+\rho$ does not prove $\Con(T)$ by the incompleteness theorem applied to the theory $T+\rho$, for the fourth statement.

Let’s now proceed to the proof of the theorem. To begin, we construct what I call the Rosser tree over a c.e. theory $T$. Namely, we recursively define theories $R_s$ for each finite binary string $s\in 2^{{<}\omega}$, placing the initial theory $R_{\emptyset}=T$ at the root, and then recursively adding either the Rosser sentence $\rho_s$ for the theory $R_s$ or its negation $\neg\rho_s$ at each stage to form the theories at the next level of the tree.
$$R_{s\concat 1}=R_s+\rho_s$$
$$R_{s\concat 0}=R_s+\neg\rho_s$$
Each theory $R_s$ is therefore a finite extension of $T$ by successively adding the appropriate Rosser sentences or their negations in the pattern described by $s$. If the initial theory $T$ is consistent, then it follows by induction using the Gödel-Rosser theorem that all the theories $R_s$ in the Rosser tree are consistent. Extending our notation to the branches through the tree, if $f\in{}^\omega 2$ is an infinite binary sequence, we let $R_f=\bigcup_n R_{f\upharpoonright n}$ be the union of the theories arising along that branch of the Rosser tree. In this way, we have constructed a perfect set of continuum many distinct consistent theories.

I shall now describe a universal algorithm for the case of computing binary functions. Consider the Rosser tree over the theory $T=\PA+\neg\Con(\PA)$. This is a consistent theory that happens to prove its own inconsistency. By considering the Gödel-codes in order, the algorithm should begin by searching for a proof of the Rosser sentence $\rho_{\emptyset}$ or its negation in the initial theory $R_{\emptyset}$. If such a proof is ever found, then the algorithm outputs $0$ or $1$ on input $0$, respectively, depending on whether it was the Rosser sentence or its negation that was found first, and moves to the next theory in the Rosser tree by adding the opposite statement to the current theory. Then, it starts searching for a proof of the Rosser sentence of that theory or its negation. At each stage in the algorithm, there is a current theory $R_s$, depending on which prior proofs have been found, and the algorithm searches for a proof of $\rho_s$ or $\neg\rho_s$. If found, it outputs $0$ or $1$ accordingly (on input $n=|s|$), and moves to the next theory in the Rosser tree by adding the opposite statement to the current theory.
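The control flow of this algorithm can be sketched in code, although of course no actual program can search a model's nonstandard proofs. In the Python sketch below (entirely schematic, and my own), the unbounded proof search over the current theory $R_s$ is replaced by a stub `first_found`, which reports whether a proof of $\rho_s$ or of $\neg\rho_s$ turns up first; the point of the theorem is that inside a suitable model, that search behaves like a stub reading off any desired function $f$.

```python
def universal_program(n, first_found):
    """Schematic version of the universal algorithm.  The call
    first_found(s) stands in for the unbounded search that, inside the
    model, reports whether a proof of the Rosser sentence rho_s ('rho')
    or of its negation ('neg') is found first over the theory R_s."""
    s = []                                   # current position in the Rosser tree
    for stage in range(n + 1):
        if first_found(tuple(s)) == "neg":   # proof of ¬rho_s found first:
            output, s = 1, s + [1]           #   output 1, adopt rho_s
        else:                                # proof of rho_s found first:
            output, s = 0, s + [0]           #   output 0, adopt ¬rho_s
    return output

# Inside a model of R_f, a proof of ¬rho_s turns up first exactly when
# f(n) = 1; mocking the search accordingly makes the program compute f.
f = [1, 0, 0, 1, 1, 0]
oracle = lambda s: "neg" if f[len(s)] == 1 else "rho"
assert [universal_program(n, oracle) for n in range(6)] == f
```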

If $f:\N\to 2=\{0,1\}$ is any binary function on the natural numbers, then let $R_f$ be the theory arising from the corresponding path through the Rosser tree, and let $M\models R_f$ be a model of this theory. I claim that the universal algorithm I just described will compute exactly $f(n)$ on input $n$ inside this model. The thing to notice is that because $\neg\Con(\PA)$ was part of the initial theory, the model $M$ will think that all the theories in the Rosser tree are inconsistent. So the model will have plenty of proofs of every statement and its negation for any theory in the Rosser tree, and so in particular, the function computed by $p$ in $M$ will be a total function. The question is which proofs will come first at each stage, affecting the values of the function. Let $s=f\upharpoonright n$ and notice that $R_s$ is true in $M$. Suppose inductively that the function computed by $p$ has worked correctly below $n$ in $M$, and consider stage $n$ of the procedure. By induction, the current theory will be exactly $R_s$, and the algorithm will be searching for a proof of $\rho_s$ or its negation in $R_s$. Notice that $f(n)=1$ just in case $\rho_s$ is true in $M$, and because of what $\rho_s$ asserts and the fact that $M$ thinks it is provable in $R_s$, it must be that there is a smaller proof of $\neg\rho_s$. So in this case, the algorithm will find the proof of $\neg\rho_s$ first, and therefore, according to the precise instructions of the algorithm, it will output $1$ on input $n$ and add $\rho_s$ (the opposite statement) to the current theory, moving to the theory $R_{s\concat 1}$ in the Rosser tree. Similarly, if $f(n)=0$, then $\neg\rho_s$ will be true in $M$, and the algorithm will therefore first find a proof of $\rho_s$, give output $0$ and add $\neg\rho_s$ to the current theory, moving to $R_{s\concat 0}$.
In this way, the algorithm finds the proofs in exactly the right way so as to have $R_{f\upharpoonright n}$ as the current theory at stage $n$ and thereby compute exactly the function $f$, as desired.

Basically, the theory $R_f$ asserts exactly that the proofs will be found in the right order in such a way that program $p$ will exactly compute $f$ on all standard finite input. So every binary function $f$ is computed by the algorithm in any model of the theory $R_f$.

Let me now explain how to extend the result to handle all functions $g:\N\to\N$, rather than only the binary functions as above. The idea is to modify the binary universal algorithm in a simple way. Any function $g:\N\to \N$ can be coded with a binary function $f:\N\to 2$ in a canonical way, for example, by having successive blocks of $1$s in $f$, separated by $0$s, with the $n^{\rm th}$ block of size $g(n)$. Let $q$ be the algorithm that runs the binary universal algorithm described above, thereby computing a binary sequence, and then extracts from that binary sequence a corresponding function from $\N$ to $\N$ (this may fail, if for example, the binary sequence is finite or if it has only finitely many $0$s). Nevertheless, for any function $g:\N\to \N$ there is a binary function $f:\N\to 2$ coding it in the way we have described, and in any model $M\models R_f$, the binary universal algorithm will compute $f$, causing this adapted algorithm to compute exactly $g$ on all standard finite input, as desired.
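The block coding is easy to make precise. Here is a small Python sketch (my own rendering of the canonical coding just described) of the encoding and the extraction step:

```python
def encode(values):
    """Code a finite initial segment of g as a binary string:
    the n-th block of 1s has length g(n), blocks separated by 0s."""
    return "".join("1" * v + "0" for v in values)

def decode(bits):
    """Extract the coded function values: the lengths of the
    maximal blocks of 1s between the separating 0s.  (On an
    infinite binary sequence this can fail, e.g. if there are
    only finitely many 0s; here we decode a finite string that
    ends with a separator.)"""
    return [len(block) for block in bits.split("0")[:-1]]
```

For example, the list of values $2,0,3$ is coded by the string `11001110`, and decoding recovers the original values.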

Finally, let me describe how to extend the result to work with models of set theory, rather than models of arithmetic. Suppose that $\ZFC^+$ is a consistent c.e. extension of ZFC; perhaps it is ZFC itself, or ZFC plus some large cardinal axioms. Let $T=\ZFC^++\neg\Con(\ZFC^+)$ be a slightly stronger theory, which is also consistent, by the incompleteness theorem. Since $T$ interprets arithmetic, the theory of Rosser sentences applies, and so we may build the corresponding Rosser tree over $T$, and also we may undertake the binary universal algorithm using $T$ as the initial theory. If $f:\N\to 2$ is any binary function, then let $R_f$ be the theory arising on the corresponding branch through the Rosser tree, and suppose $M\models R_f$. This is a model of $\ZFC^+$, which also thinks that $\ZFC^+$ is inconsistent. So again, the universal algorithm will find plenty of proofs in this model, and as before, it will find the proofs in just the right order that the binary universal algorithm will compute exactly the function $f$. From this binary universal algorithm, one may again design an algorithm universal for all functions $g:\N\to\N$, as desired.

One can also get another kind of universality. Namely, there is a program $r$, such that for any finite set $s\subset\N$, there is a model $M$ of $\PA$ (or $\ZFC$, etc.) such that inside the model $M$, the program $r$ will enumerate the set $s$ and nothing more. One can obtain such a program $r$ from the program $p$ of the theorem: just let $r$ run the universal binary program $p$ until a double $0$ is produced, and then interpret the finite binary string up to that point as the set $s$ to output.
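Under one natural reading of the coding (the text leaves it implicit, so this is an assumption of the sketch), the bits before the first double $0$ give the characteristic function of $s$:

```python
def extract_set(bits):
    """Scan the binary stream until two consecutive 0s appear,
    then output the set of positions holding a 1 among the bits
    seen so far.  (One possible way, assumed here for
    illustration, of reading a finite set off the stream produced
    by the universal binary program.)"""
    prefix = []
    for b in bits:
        if prefix and prefix[-1] == 0 and b == 0:
            break
        prefix.append(b)
    return {i for i, b in enumerate(prefix) if b == 1}
```

For instance, the stream beginning $1,0,1,1,0,0,\dots$ stops at the double $0$ and yields the set $\{0,2,3\}$.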

Let me now also discuss another form of universality.

Corollary
There is a program $p$, such that for any model $M\models\PA+\Con(\PA)$ and any function $f:M\to M$ that is definable in $M$, there is an end-extension of $M$ to a taller model $N\models\PA$ such that in $N$, the function computed by program $p$ agrees exactly with $f$ on input in $M$.

Proof
We simply apply the main theorem inside $M$. The point is that if $M$ thinks $\Con(\PA)$, then it can build what it thinks is the tree of Rosser extensions, and it will think that each step maintains consistency. So the theory $R_f$ that it constructs will be consistent in $M$ and therefore have a model (the Henkin model) definable in $M$, which will therefore be an end-extension of $M$.
QED

This last application has a clear affinity with a theorem of Woodin’s, recently extended by Rasmus Blanck and Ali Enayat. See Victoria Gitman’s posts about her seminar talk on those theorems: Computable processes can produce arbitrary outputs in nonstandard models, continuation.

Alternative proof.  Here is an alternative elegant proof of the theorem, based on the comments below of Vadim Kosoy. Let $T$ be any consistent computably axiomatizable theory interpreting PA, such as PA itself or ZFC or what have you. For any Turing machine program $e$, let $q(e)$ be a program carrying out the following procedure: on input $n$, search systematically for a finite function $h:X\to\mathbb{N}$, with $X$ finite and $n\in X$, and for a proof of the statement “program $e$ does not agree with $h$ on all inputs in $X$,” using the function $h$ simply as a list of values for this assertion. For the first such function and proof that is found, if any, give as output the value $h(n)$.

Since the function $e\mapsto q(e)$ is computable, there is by Kleene’s recursion theorem a program $p$ for which $p$ and $q(p)$ compute the same function, and furthermore, $T$ proves this.  So the program $p$ is searching for proofs that $p$ itself does not behave in a certain way, and then it is behaving in that way when such a proof is found.

I claim that the theory $T$ does not actually prove any of those statements, “program $p$ does not agree with $h$ on inputs in $X$,” for any particular finite function $h:X\to\mathbb{N}$. If it did prove such a statement, then for the smallest such function and proof, the output of $p$ would indeed agree with $h$ on all inputs in $X$, by design. Thus, there would also be a proof that the program did agree with this particular $h$, and so $T$ would prove a contradiction, contrary to our assumption that it was consistent. So $T$ actually proves none of those statements. In particular, the program $p$ computes the empty function in the standard model of arithmetic. But also, for any particular finite function $h:X\to\mathbb{N}$, we may consistently add the assertion “program $p$ agrees with $h$ on inputs in $X$” to $T$, since $T$ did not refute this assertion.

For any function $f:\mathbb{N}\to\mathbb{N}$, let $T_f$ be the theory $T$ together with all assertions of the form “program $p$ halts on input $n$ with value $k$”, for the particular value $k=f(n)$.  I claim that this theory is consistent, for if it is not, then by compactness there would be finitely many of the assertions that enable the inconsistency, and so there would be a finite function $h:X\to\mathbb{N}$, with $h=f\upharpoonright X$, such that $T$ proved the program $p$ does not agree with $h$ on inputs in $X$. But in the previous paragraph, we proved that this doesn’t happen. And so the theory $T_f$ is consistent.

Finally, note that in any model of $T_f$, the program $p$ computes the function $f$ on standard input, because these assertions are exactly made in the theory. QED

# Burak Kaya, Ph.D. March 2016

Burak Kaya successfully defended his dissertation, “Cantor minimal systems from a descriptive perspective,” on March 24, 2016, earning his Ph.D. degree at Rutgers University under the supervision of Simon Thomas. The dissertation committee consisted of Simon Thomas, Gregory Cherlin, Grigor Sargsyan and myself, as the outside member.

The defense was very nice, with an extremely clear account of the main results, and the question session included a philosophical discussion of various matters connected with the dissertation, including the principle, attributed to Gao, that any collection of mathematical structures with a natural Borel representation has a unique such representation up to Borel isomorphism, a principle presented as a Borel-equivalence-relation-theoretic analogue of the Church-Turing thesis.

Abstract.  In recent years, the study of the Borel complexity of naturally occurring classification problems has been a major focus in descriptive set theory. This thesis is a contribution to the project of analyzing the Borel complexity of the topological conjugacy relation on various Cantor minimal systems.

We prove that the topological conjugacy relation on pointed Cantor minimal systems is Borel bireducible with the Borel equivalence relation $\newcommand\d{\Delta^+_{\mathbb{R}}}\d$. As a byproduct of our analysis, we also show that $\d$ is a lower bound for the topological conjugacy relation on Cantor minimal systems.

The other main result of this thesis concerns the topological conjugacy relation on Toeplitz subshifts. We prove that the topological conjugacy relation on Toeplitz subshifts with separated holes is a hyperfinite Borel equivalence relation. This result provides a partial affirmative answer to a question asked by Sabok and Tsankov.

As pointed Cantor minimal systems are represented by properly ordered Bratteli diagrams, we also establish that the Borel complexity of equivalence of properly ordered Bratteli diagrams is $\d$.

# Famous quotations in their original first-order language

Historians everywhere are shocked by the recent discovery that many of our greatest thinkers and poets had first expressed their thoughts and ideas in the language of first-order predicate logic, and sometimes modal logic, rather than in natural language. Some early indications of this were revealed in the pioneering historical research of Henle, Garfield and Tymoczko, in their work Sweet Reason.

We now know that the phenomenon is widespread!  As shown below, virtually all of our cultural leaders have first expressed themselves in the language of first-order predicate logic, before having been compromised by translations into the vernacular.

$\neg\lozenge\neg\exists s\ G(i,s)$

$(\exists x\ x=i)\vee\neg(\exists x\ x=i)$

$\left(\strut\neg\exists t\ \exists d\ \strut D(d)\wedge F(d)\wedge S_t(i,d)\right)\wedge\left(\strut\neg\exists t\ w\in_t \text{Ro}\right)\wedge\left(\strut \text{Ru}(i,y)\to \lozenge\text{C}(y,i,qb)\wedge \text{Ru}(i)\wedge\text{Ru}(i)\wedge\text{Ru}(i)\wedge\text{Ru}(i)\right)$

$\neg B_i \exists g\ G(g)$

$\forall b\ \left(\strut G(b)\wedge B(b)\to \exists x\ (D(b,x)\wedge F(x))\right)$

$(\exists!w\ W_1(w)\wedge W_2(w)), \ \ \exists w\ W_1(w)\wedge W_2(w)\wedge S(y,w)$?

$\exists s\ Y(s)\wedge S(s)\wedge \forall x\ L(x,s)$

$\exists p\ \left[\forall c\ (c\neq p\to G(c))\right]\wedge\neg G(p)$

$\exists l\ \left[L(l)\wedge \boxdot_l\left({}^\ulcorner\,\forall g\ \text{Gl}(g)\to \text{Gd}(g){}^\urcorner\right)\wedge\exists s\ \left(SH(s)\wedge B(l,s)\right)\right]$

$(\forall p\in P\ \exists c\in\text{Ch}\ c\in p)\wedge(\forall g\in G\ \exists c\in\text{Cr}\ c\in g)$

$\forall x (F(w,x)\to x=F)$

$B\wedge \forall x\ \left[S(x)\wedge T(x)\to \exists!w\ W(w)\wedge\text{Gy}(x,w)\wedge\text{Gi}(x,w)\right]$

$\exists!x\ D(x)\wedge D(\ {}^\ulcorner G(i){}^\urcorner\ )$

$\forall f\ \forall g\ \left(\strut H(f)\wedge H(g)\to f\sim g\right)\wedge\forall f\ \forall g\ \left(\strut\neg H(f)\wedge \neg H(g)\to \neg\ f\sim g\right)$

$\exists w\ \left(\strut O(w)\wedge W(w)\wedge\exists s\ (S(s)\wedge L(w,s))\right)$

$C(i)\to \exists x\ x=i$

$\neg\neg\left(\strut H(y)\wedge D(y)\right)$

$\neg (d\in K)\wedge\neg (t\in K)$

$W(i,y)\wedge N(i,y)\wedge\neg\neg\lozenge L(i,y)\wedge \left(\strut \neg\ \frac23<0\to\neg S(y)\right)$

$\lozenge \text{CL}(i)\wedge\lozenge C(i)\wedge \lozenge (\exists x\ x=i)\wedge B(i)$

$\forall x\ K_x({}^\ulcorner \forall m\ \left[M(m)\wedge S(m)\wedge F(m)\to\boxdot\ \exists w\ M(m,w)\right]{}^\urcorner)$

$\forall e\forall h\ \left(\strut G(e)\wedge E(e)\wedge H(h)\to \neg L(i,e,h)\right)$

$\forall p\ \boxdot\text{St}(p)$

$\lozenge^w_i\ \forall g\in G\ \lozenge (g\in C)$

$\forall m\ (a\leq_C m)$

$\forall t\ (p\geq t)\wedge \forall t\ (p\leq t)$

$\forall x\ (F(x)\iff x=h)$

$(\forall x\ \forall y\ x=y)\wedge(\exists x\ \exists y (x=x>y=y))$

$\forall p\ \left(\strut\neg W(p)\to \neg S(p)\right)$

$\forall p \left(\strut E(p)\to \forall h\in H\ A(p,h)\right)$

Dear readers, in order to assist with this important historical work, please provide translations into ordinary English in the comment section below of any or all of the assertions listed above. We are interested to make sure that all our assertions and translations are accurate.

In addition, any readers who have any knowledge of additional instances of famous quotations that were actually first made in the language of first-order predicate logic (or similar) are encouraged to post comments below detailing their knowledge. I will endeavor to add such additional examples to the list.

Thanks to Philip Welch, to my brother Jonathan, and to Ali Sadegh Daghighi (in the comments) for providing some of the examples, and to Timothy Gowers for some improvements.

# Giorgio Audrito, PhD 2016, University of Torino

Dr. Giorgio Audrito has successfully defended his dissertation, “Generic large cardinals and absoluteness,” at the University of Torino under the supervision of Matteo Viale.

The dissertation Examining Board consisted of myself (serving as Presidente), Alessandro Andretta and Sean Cox.  The defense took place on March 2, 2016.

The dissertation was impressive, introducing (in joint work with Matteo Viale) the iterated resurrection axioms $\text{RA}_\alpha(\Gamma)$ for a forcing class $\Gamma$, which extend the idea of the resurrection axioms from my work with Thomas Johnstone, The resurrection axioms and uplifting cardinals, by making successive extensions of the same type, forming the resurrection game, and insisting that the resurrection player have a winning strategy with game value $\alpha$. A similar iterative game idea underlies the $(\alpha)$-uplifting cardinals, from which the consistency of the iterated resurrection axioms can be proved. A final chapter of the dissertation (joint with Silvia Steila) develops the notion of $C$-systems of filters, generalizing the more familiar concepts of extenders and towers.

# Open determinacy for games on the ordinals, Torino, March 2016

This will be a seminar talk I shall give on March 3, 2016 at the University of Torino, Italy, in the same department where Giuseppe Peano had his position.  I shall be in Italy for the dissertation defense of Giorgio Audrito, on whose dissertation committee I am serving as president.

Abstract. The principle of open determinacy for class games — two-player games of perfect information with plays of length $\omega$, where the moves are chosen from a possibly proper class, such as games on the ordinals — is not provable in Zermelo-Fraenkel set theory ZFC or Gödel-Bernays set theory GBC, if these theories are consistent, because provably in ZFC there is a definable open proper class game with no definable winning strategy. In fact, the principle of open determinacy and even merely clopen determinacy for class games implies Con(ZFC) and iterated instances Con(Con(ZFC)) and more, because it implies that there is a satisfaction class for first-order truth, and indeed a transfinite tower of truth predicates $\text{Tr}_\alpha$ for iterated truth-about-truth, relative to any class parameter. This is perhaps explained, in light of the Tarskian recursive definition of truth, by the more general fact that the principle of clopen determinacy is exactly equivalent over GBC to the principle of elementary transfinite recursion ETR over well-founded class relations. Meanwhile, the principle of open determinacy for class games is provable in the stronger theory GBC+$\Pi^1_1$-comprehension, a proper fragment of Kelley-Morse set theory KM.

This is joint work with Victoria Gitman. See our article, Open determinacy for class games, which is currently under review.

# Math for nine-year-olds: fold, punch and cut for symmetry!

Today I had the pleasure to visit my daughter’s fourth-grade classroom for some fun mathematical activities. The topic was Symmetry!   I planned some paper-folding activities, involving hole-punching and cutting, aiming to display the dynamism that is present in the concept of symmetry. Symmetry occurs when a figure can leap up, transforming itself through space, and land again exactly upon itself in different ways.  I sought to have the students experience this dynamic action not only in their minds, as they imagined various symmetries for the figures we considered, but also physically, as they folded a paper along a line of symmetry, checking that this fold brought the figure exactly to itself.
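The fold test the students performed has a direct computational analogue: a vertical line is a line of symmetry for a dot pattern just in case reflecting every dot across the line gives back exactly the same pattern. Here is a small Python sketch (my own illustration, not part of the classroom project):

```python
def reflect(points, axis_x):
    """Reflect each point (x, y) across the vertical line x = axis_x."""
    return {(2 * axis_x - x, y) for (x, y) in points}

def is_line_of_symmetry(points, axis_x):
    """The fold along x = axis_x carries the figure exactly onto
    itself just in case reflecting leaves the point set unchanged."""
    return reflect(points, axis_x) == points

# four dots at the corners of a square
square_dots = {(1, 1), (3, 1), (1, 3), (3, 3)}
centered = is_line_of_symmetry(square_dots, 2)   # the vertical fold through the center
off_center = is_line_of_symmetry(square_dots, 1)  # an off-center fold fails
```

The same check works for any candidate fold line after a suitable change of coordinates.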

The exercises were good plain fun, and some of the kids wanted to do them again and again, the very same ones.  Here is the pdf file for the entire project.

To get started, we began with a very simple case, a square with two dots on it, which the girls folded and then punched through with a hole punch, so that one punch would punch through both holes at once.

Next, we handled a few patterns that required two folds. I told the kids to look for the lines of symmetry and fold on them.  Fold the paper in such a way that with ONE punch, you can make exactly the whole pattern.

Don’t worry if the holes you punch do not line up exactly perfectly with the dots — if the holes are all touching their intended target and there are the right number of holes, then I told the kids it is great!

The three-fold patterns are a bit more challenging, but almost all of the kids were able to do it.  They did need some help punching through, however, since it sometimes requires some strength to punch through many layers.

With these further patterns, some of the folds don’t completely cover the paper. So double-check and make sure that you won’t end up with unwanted extra holes!

The second half of the project involved several cutting challenges.  To begin, let’s suppose that you wanted to cut a square out of the middle of a piece of paper. How would you do it?  Perhaps you might want to poke your scissors through and then cut around the square in four cuts.  But can you do it in just ONE cut?  Yes!  If you fold the paper on the diagonals of the square, then you can make one quick snip to cut out exactly the square, leaving the frame completely intact.

Similarly, one can cut out many other shapes in just one cut.  The rectangle and triangle are a little trickier than you might think at first, since at a middle stage of folding, you’ll find that you end up folding a shorter line onto a longer one, but the point is that this is completely fine, since one cut will go through both.  Give it a try!

Here are a few harder shapes, requiring more folds.

It is an amazing mathematical fact, the fold-and-cut theorem, that ANY shape consisting of finitely many straight line segments can be made with just one cut after folding.  Here are a few challenges, which many of the fourth-graders were able to do during my visit.

What a lot of fun the visit was!  I shall look forward to returning to the school next time.

In case anyone is interested, I have made available a pdf file of this project.  I would like to thank Mike Lawler, whose tweet put me onto the topic.  And see also the awesome Numberphile video on the fold-and-cut theorem, featuring mathematician Katie Steckles.

See more of my Math for Kids!

# Set-theoretic mereology

• J. D. Hamkins and M. Kikuchi, “Set-theoretic mereology.” (manuscript under review)
@ARTICLE{HamkinsKikuchi:Set-theoreticMereology,
author = {Joel David Hamkins and Makoto Kikuchi},
title = {Set-theoretic mereology},
note = {manuscript under review},
eprint = {1601.06593},
url = {http://jdh.hamkins.org/set-theoretic-mereology},
}

Abstract. We consider a set-theoretic version of mereology based on the inclusion relation $\newcommand\of{\subseteq}\of$ and analyze how well it might serve as a foundation of mathematics. After establishing the non-definability of $\in$ from $\of$, we identify the natural axioms for $\of$-based mereology, which constitute a finitely axiomatizable, complete, decidable theory. Ultimately, for these reasons, we conclude that this form of set-theoretic mereology cannot by itself serve as a foundation of mathematics. Meanwhile, augmented forms of set-theoretic mereology, such as that obtained by adding the singleton operator, are foundationally robust.

In light of the comparative success of membership-based set theory in the foundations of mathematics, since the time of Cantor, Zermelo and Hilbert, a mathematical philosopher naturally wonders whether one might find a similar success for mereology, based upon a mathematical or set-theoretic parthood relation rather than the element-of relation $\in$. Can set-theoretic mereology serve as a foundation of mathematics? And what should be the central axioms of set-theoretic mereology?

We should like therefore to consider a mereological perspective in set theory, analyzing how well it might serve as a foundation while identifying the central axioms. Although set theory and mereology, of course, are often seen as being in conflict, what we take as the project here is to develop and investigate, within set theory, a set-theoretic interpretation of mereological ideas. Mereology, by placing its focus on the parthood relation, seems naturally interpreted in set theory by means of the inclusion relation $\of$, so that one set $x$ is a part of another $y$, just in case $x$ is a subset of $y$, written $x\of y$. This interpretation agrees with David Lewis’s Parts of Classes (1991) interpretation of set-theoretic mereology in the context of sets and classes, but we restrict our attention to the universe of sets. So in this article we shall consider the formulation of set-theoretic mereology as the study of the structure $\langle V,\of\rangle$, which we shall take as the canonical fundamental structure of set-theoretic mereology, where $V$ is the universe of all sets; this is in contrast to the structure $\langle V,{\in}\rangle$, usually taken as central in set theory. The questions are: How well does this mereological structure serve as a foundation of mathematics? Can we faithfully interpret the rest of mathematics as taking place in $\langle V,\of\rangle$ to the same extent that set theorists have argued (with whatever degree of success) that one may find faithful representations in $\langle V,{\in}\rangle$? Can we get by with merely the subset relation $\of$ in place of the membership relation $\in$?

Ultimately, we shall identify grounds supporting generally negative answers to these questions. On the basis of various mathematical results, our main philosophical thesis will be that the particular understanding of set-theoretic mereology via the inclusion relation $\of$ cannot adequately serve by itself as a foundation of mathematics. Specifically, the following theorem and corollary show that $\in$ is not definable from $\of$, and we take this to show that one may not interpret membership-based set theory itself within set-theoretic mereology in any straightforward, direct manner.

Theorem. In any universe of set theory $\langle V,{\in}\rangle$, there is a definable relation $\in^*$, different from $\in$, such that $\langle V,{\in^*}\rangle$ is a model of set theory, in fact isomorphic to the original universe $\langle V,{\in}\rangle$, for which the corresponding inclusion relation $$u\subseteq^* v\quad\longleftrightarrow\quad \forall a\, (a\in^* u\to a\in^* v)$$ is identical to the usual inclusion relation $u\subseteq v$.

Corollary. One cannot define $\in$ from $\subseteq$ in any model of set theory, even allowing parameters in the definition.

A counterpoint to this is provided by the following theorem, however, which identifies a weak sense in which $\of$ may identify $\in$ up to definable automorphism of the universe.

Theorem. Assume ZFC in the universe $\langle V,\in\rangle$. Suppose that $\in^*$ is a definable class relation in $\langle V,{\in}\rangle$ for which $\langle V,\in^*\rangle$ is a model of set theory (a weak set theory suffices), such that the corresponding inclusion relation $$u\subseteq^* v\quad\iff\quad\forall a\,(a\in^* u\to a\in^* v)$$is the same as the usual inclusion relation $u\subseteq v$. Then the two membership relations are isomorphic $$\langle V,\in\rangle\cong\langle V,\in^*\rangle.$$

That counterpoint is not decisive, however, in light of the question whether we really need $\in^*$ to be a class with respect to $\in$, a question resolved by the following theorem, which shows that set-theoretic mereology does not actually determine the $\in$-isomorphism class or even the $\in$-theory of the $\in$-model in which it arises.

Theorem. For any two consistent theories extending ZFC, there are models $\langle W,{\in}\rangle$ and $\langle W,{\in^*}\rangle$ of those theories, respectively, with the same underlying set $W$ and the same induced inclusion relation $\of=\of^*$.

For example, we cannot determine in $\of$-based set-theoretic mereology whether the continuum hypothesis holds or fails, whether the axiom of choice holds or fails or whether there are large cardinals or not. Initially, the following central theorem may appear to be a positive result for mereology, since it identifies precisely what are the principles of set-theoretic mereology, considered as the theory of $\langle V,{\of}\rangle$. Namely, $\of$ is an atomic unbounded relatively complemented distributive lattice, and this is a finitely axiomatizable complete theory. So in a sense, this theory simply is the theory of $\of$-based set-theoretic mereology.

Theorem. Set-theoretic mereology, considered as the theory of $\langle V,\of\rangle$, is precisely the theory of an atomic unbounded relatively complemented distributive lattice, and furthermore, this theory is finitely axiomatizable, complete and decidable.

But upon reflection, since every finitely axiomatizable complete theory is decidable, the result actually appears to be devastating for set-theoretic mereology as a foundation of mathematics, because a decidable theory is much too simple to serve as a foundational theory for all mathematics. The full spectrum and complexity of mathematics naturally includes all the instances of many undecidable decision problems and so cannot be founded upon a decidable theory. Finally, it follows as a corollary that the structure consisting of the hereditarily finite sets under inclusion forms an elementary substructure of the full set-theoretic mereological universe $$\langle \text{HF},\of\rangle\prec\langle V,\of\rangle.$$ Consequently set-theoretic mereology cannot properly treat or even express the various concepts of infinity that arise in mathematics.
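One can see a finite fragment of this lattice theory concretely. The following Python sketch (my own illustration) checks distributivity, atomicity, and relative complementation in the inclusion lattice of the subsets of a three-element set; unboundedness, of course, is exactly the axiom that fails in any finite sample:

```python
from itertools import combinations

def powerset(base):
    """All subsets of base, as frozensets, ordered by inclusion."""
    s = list(base)
    return [frozenset(c) for r in range(len(s) + 1)
            for c in combinations(s, r)]

L = powerset({0, 1, 2})  # the inclusion lattice on a 3-element set

# distributivity: a meet (b join c) = (a meet b) join (a meet c)
distributive = all((a & (b | c)) == ((a & b) | (a & c))
                   for a in L for b in L for c in L)

# atomicity: every nonzero element lies above an atom (a singleton)
atoms = [x for x in L if len(x) == 1]
atomic = all(any(a <= x for a in atoms) for x in L if x)

# relative complementation: for a included in b, the difference
# b \ a joins with a to give b and meets a in the bottom element
rel_complemented = all(((b - a) | a) == b and ((b - a) & a) == frozenset()
                       for a in L for b in L if a <= b)
```

All three checks pass, in line with the theorem's axiomatization.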


# Freiling’s axiom of symmetry, or throwing darts at the real line, Graduate Student Colloquium, April 2016

This will be a talk I’ll give at the CUNY Graduate Center Graduate Student Colloquium on Monday, April 11 (new date!), 2016, 4-4:45 pm.  The talk will be aimed at a general audience of mathematics graduate students.

Abstract. I shall give an elementary presentation of Freiling’s axiom of symmetry, which is the principle asserting that if $x\mapsto A_x$ is a function mapping every real $x\in[0,1]$ in the unit interval to a countable set of such reals $A_x\subset[0,1]$, then there are two reals $x$ and $y$ for which $x\notin A_y$ and $y\notin A_x$.  To argue for the truth of this principle, Freiling imagined throwing two darts at the real number line, landing at $x$ and $y$ respectively: almost surely, the location $y$ of the second dart is not in the set $A_x$ arising from that of the first dart, since that set is countable; and by symmetry, it shouldn’t matter which dart we imagine as being first. So it may seem that almost every pair must fulfill the principle. Nevertheless, the principle is independent of the axioms of ZFC and in fact it is provably equivalent to the failure of the continuum hypothesis.  I’ll introduce the continuum hypothesis in a general way and discuss these foundational matters, before providing a proof of the equivalence of $\neg$CH with the axiom of symmetry. The axiom of symmetry admits natural higher dimensional analogues, such as the case of maps $(x,y)\mapsto A_{x,y}$, where one seeks a triple $(x,y,z)$ for which no member is in the set arising from the other two, and these principles also have an equivalent formulation in terms of the size of the continuum.
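Freiling's dart-throwing heuristic is easy to simulate numerically. In the sketch below (my own illustration), each real $x$ is assigned a finite stand-in $A_x$ for the countable set, generated pseudo-randomly from $x$; since the darts land at points drawn from a continuous distribution, each of the containments $y\in A_x$ and $x\in A_y$ occurs with probability zero, and the symmetric conclusion holds in the trials:

```python
import random

def freiling_trial(rng, set_size=1000):
    """One round of the dart heuristic: attach to each real x a
    (finite, standing in for countable) set A(x) of reals, throw
    two darts x and y, and check that y is not in A(x) and x is
    not in A(y).  Almost surely both checks succeed."""
    def A(x):
        r = random.Random(x)   # A(x) depends only on x
        return {r.random() for _ in range(set_size)}
    x, y = rng.random(), rng.random()
    return y not in A(x) and x not in A(y)

rng = random.Random(2016)      # fixed seed for reproducibility
all_pass = all(freiling_trial(rng) for _ in range(100))
```

Of course, no simulation bears on the independence phenomenon itself; it merely illustrates the measure-theoretic intuition behind the axiom.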