# On number sense — Would one day of NYC coffee fill the Statue of Liberty? And other fun questions…

Today I gave a lecture on what I call number sense, using a process of estimation and approximation in order to calculate various unknown quantities, including a few fantastical ones. How much coffee is made per day in New York City? Would it fill up the Statue of Liberty? Approximately how many babies are born in New York City each day? If you made a stack of quarters to reach the distance to the moon, what would the dollar-value be? If you piled those quarters in a heap, would it fit in Central Park? How much does the Empire State Building weigh?

These kinds of back-of-the-envelope calculations, in my view, have at their heart the idea that one can solve a difficult or seemingly impossible problem by breaking it into more manageable pieces. We don’t just pull a final answer out of the air, but rather make simplifying assumptions and informed estimates for related quantities that we are more familiar with or have knowledge about, and then use that information to derive a better estimate for the main quantity. For most of these questions, at the outset we may have little idea what would be a reasonable answer, but by the end, we gain some insight and find ourselves a little closer to the truth.

These kind of calculations are also known as Fermi estimations, in light of Fermi’s remarkable ability to make surprisingly accurate estimations on the basis of little or no hard data. The wikipedia page (thanks to Artie Prendergrast-Smith for mentioning this link in the comments) emphasizes that even in a case where the estimate is significantly off the true value, nevertheless we may still find value in the Fermi calculation, because it focusses our attention to the reasons for the divergence. In discovering which of the assumptions underlying our calculations was wrong, we come to a deeper understanding of the true situation.

In the lecture, I began with some very easy cases. For example, how many seats were in the auditorium? The students estimated that there were approximately 12 seats per row and about 10 rows, so 120 seats in all. How old was one of the students, in seconds? Well, he was 18 years old, and so we could simply multiply each year by 365 days, times 24 hours per day, times 60 minutes per hour, times 60 seconds per minute, to get
$$18\times 365\times 24\times 60\times 60\approx 600,000,000 \text{ seconds}.$$
One student objected about leap days, since 365 should be 365.25 or so. But I pointed out that this difference was not as important as it might seem, since already we had made far larger rounding assumptions. For example, the student was not exactly 18 years old, but 18 years old and some several months; by using 18 years only, we made a bigger difference in the answer than caused by the leap-day issue, which would be a difference of only five days or so over 18 years. For the same reason, we should feel free to round the numbers to make the calculation easier. We are aiming at a ballpark estimate rather than an exact answer.

Let’s now do some more interesting cases.

Coffee in New York. How much coffee is made each day in New York? Would it fill the Statue of Liberty? First, let me say that I really don’t have any definite information about how much coffee is made each day in New York, and I fear that my own coffee-obsessed perspective will lead me to over-estimate the amount, but let’s give it a try. New York City has a population of approximately 10 million people. Some of those people, like myself, drink a large amount of coffee each day, but many of the others do not drink coffee at all. I would think that a sizable percentage of the NYC population does drink coffee, perhaps as much as a third or even half consumes coffee daily. Many of those coffee-drinkers have more than one cup per day, and also surely more coffee is made than consumed. So it seems reasonable to me to estimate that we have approximately one medium cup of coffee per person on average per day in New York. Basically, we’re saying that the heavy coffee drinkers and the made-but-not-sold coffee approximately makes up for those who abstain, making the average about one cup per person. So we are talking about 10 million cups of coffee per day. A medium cup of brewed coffee at Starbucks is I think about 12-16 ounces, a little less than a pint, and so let’s say about 3 cups per liter. This amounts to roughly 3 million liters of coffee.

Would it fill the Statue of Liberty? The statue itself is, I estimate, about twenty stories tall, counting the base, and if each story is 15 ft, or 5 meters, that would mean 100 meters tall, counting the base. But I think that the base is about half the height, so let’s say 50 meters for the actual statue itself. I’ve never been inside the statue, but my students say that it is about 10 meters across inside, a little more at the bottom than near the top. If we approximated it as a rectangular solid, that would give a volume of $10\times 10\times 50$ cubic meters, or 5000 cubic meters. But since the statue tapers as you go up, particularly in the arm holding the torch, it really is more like a cone than a rectangular solid, and so we should divide by three. But let’s divide just by two, because she isn’t quite as tapered as a cone. So the Statue of Liberty has a volume of approximately 2500 cubic meters. One cubic meter can be thought of as a 10 by 10 by 10 array of little 10cm cubes, and each of those is exactly one liter. So a cubic meter is 1000 liters, and therefore the Statue of Liberty has a volume of $2500\times 1000=2.5$ million liters. But since we had 3 million liters of coffee, the answer our calculation arrives at is: Yes, one day’s worth of New York coffee would fill up the Statue of Liberty!

Well, we do not have perfect confidence in our estimates and assumptions — for example, perhaps there are many fewer coffee drinkers in New York than we estimated or perhaps we underestimated the volume of the Statue of Liberty. Since the estimated volumes were of basically similar magnitudes, we aren’t really entitled to say that definitely the coffee would fill up the Statue of Liberty. Rather, what we have come to know is that those two volumes are comparably similar in size; they are in the same ballpark.

Elevator trips. While riding downtown last weekend with my son on the subway, a crowded 4 train, we overheard the group standing next to us talking about elevators. One lady said, “My elevator company serves as many elevator trips in New York City in five days as the population of the entire world,” and the rest of her group, impressed, nodded affirmatively in reply. But my thoughts, upon hearing that, were to make a quick calculation. Suppose all 10 million NYC residents rode an elevator 10 times every day, which is way too high (probably one trip per person per day is more reasonable, since many people live and work in buildings without elevators). Even in this extreme case of ten trips per person per day, it would mean only 100 million trips total per day, or 500 million trips over 5 days. This is much less than the world population, and so no way is that person’s claim true, especially since there are also many elevator companies. I thought of mentioning my calculation to those people on the subway, but decided against it. Walking out of the subway in the East Village, however, I asked my son (14 years old) whether he heard those people talking about elevators, and he replied, “Oh, yes, and when they said that, I calculated it in my head: no way is that true.” He then proceeded to explain his calculation, similar to mine. Yay, Horatio!

The Chicago marathon. In the run-up to the Chicago marathon this year, on a route that would wind through the windy city streets, Newsweek magazine reported, “Chicago Marathon organizers expect 1.7 million fans to line the route this year.” (Thanks to the critical math commentary of Mark Iris for bringing this example to my attention.) The organizers had emphasized the economic impact of these spectators, many of whom would presumably be eating in Chicago restaurants and staying at Chicago hotels. But is this a reasonable number?

Let’s calculate. A marathon is approximately 26 miles, and the route has two sides for spectators, so let’s round it to 50 miles of spectator viewing spots. Each mile is about 1800 yards, so we have $50\times 1800=90000$ yards of viewing spots. Each spectator, standing shoulder-to-shoulder, with all their stuff, takes up about a yard of space. So if the marathon was lined on both sides for the entire route with spectators standing shoulder-to-shoulder, this would amount to about 90,000 spectators. In order to have 1.7 million spectators, therefore, they would have to be lined up behind each other. But even if the spectators were 10 people deep on each side for the entire route, which is a vast crowd, it would still amount only to 900,000 people. We would have basically to double this to get to 1.7 million. So, if the live event really had 1.7 million spectators lining the route, then it would mean that the race was lined 20 people deep on each side for the entire route. No way is this number correct! I have never had the chance to attend the Chicago marathon, but at the New York marathon, which I assume is comparable, I know that while there are thick crowds at the finish line in Central Park and at some of the other prominent or especially interesting race locations, most of the rest of the route is much thinner, and at the typical nothing-special location, one could expect easily to have a front-row spot.

How many bricks on the college campus? Our campus, covered with some lovely woods and green meadows, has a number of brick Georgian buildings. Most of these are the same standard size as the mathematics department, but there are also some larger buildings such as the library, the performing arts center, the campus center and the gymnasium. At my lecture, the students and I agreed that altogether it amounted to about 30 buildings of the size of the math building. This building is approximately 30 meters by 70 meters, which would make a perimeter of 200 meters, but because the building has wings and is not purely rectangular, let’s add another 100 meters for the folds in the walls, so about 300 meters around the base of the building. The building is two stories tall, about 10 meters tall, making the area of the walls about 3000 square meters. Of course, there are windows and doors that cut out of these walls, but let’s say that they are approximately accounted for by the extra bricks used in the trimming flourishes at the corners and top of the building. So we have 3000 square meters of brick. Each brick is approximately 10 cm by 30 cm, and so one square meter of brick would have about ten rows of a little more than three bricks across, or about 20 bricks. So each building has about $3000\times 20=60,000$ bricks. And with 30 buildings, this amounts to $30\times 60,000=1,800,000$ bricks in the buildings on campus. There are also some bricks in the central fountain, a few small brick walls and some bricks lining some of the walkways. So let’s add another ten percent for those extra bricks, arriving at about 2,000,000 bricks on campus in all.

Positive test result for a rare disease. Suppose that as part of your check-up, your doctor randomly administers a clinical test for a certain rare medical condition. The test is 99 percent accurate, in terms of false positives and false negatives, in the sense that 99 percent of people taking the test have an accurate result with the test. Suppose also that the disease itself is rare, held by only one in a million people. If your test comes back positive, what is the likelihood that you actually have the disease?

This is a subtle question. It might seem to make sense initially that there is a 99 percent chance that you have the disease, since that is the accuracy of the test. But this isn’t actually right, because it doesn’t account for the fact that the disease itself is extremely rare, and so the total number of false positives will actually far outweigh the true positive results. For example, suppose that one million people take the test. About one of these people actually has the disease, and that person is 99 percent likely to test positive. So we expect about one true positive result. And for the others, who don’t have the disease, 99 percent of them will test negative. In other words, about 990,000 people will test negative. The remaining 10,000 people, however, one percent of the total, will have false positive results! So you are in this group of 10,000 people who tested positive, with only one of them actually having the disease. So the odds that you actually have the disease are only about one in ten thousand.

Quarters to the moon. How many quarters would you have to stack to reach the moon? How much would it be worth? how much would it weigh? More than the earth? More than the Empire state building? If you put all those quarters into a pile, would it fit in Central Park?

Well, OK, the question is a little absurd, and there are all kinds of problematic issues: we couldn’t ever actually make such a tower of quarters, as it would topple over; it doesn’t make sense to build a tower to the moon, since the moon is in orbit around the earth and it is moving; the quarters would begin to orbit the earth themselves, or fall back down to the earth or to the moon. But let’s just try to ignore all those problematic issues, and just try to answer how many quarters it would take to make a pile that high.

How far is the moon? I don’t really know, and we could look it up to see what the astronomers say about it, but that would go against the back-of-the-envelope spirit of these problems. So I am just going to use two facts that I picked up years ago: first, I know that the speed of light is about 300,000 km per second; and second, I happen to know that it takes radio signals traveling at the speed of light about one second to get to the moon (eight minutes to the sun), and so the moon is about one light-second away. I remember this fact from having learned it in my childhood, because it was important in a science fiction story I had read, in which radio communication from earth to the moon base had a two-second delay: you send a message, it takes a second to get there, and then a second for the answer to come back. So the moon is one light second away, and light travels 300,000 km in one second. So the moon is about 300,000 kilometers distant (actual value: 387400 km).

Let’s now stack up the quarters. By eyeballing a quarter I had in my pocket, I’d say each quarter is about 2 mm thick, which means 500 quarters per meter, or 500,000 quarters per km. So altogether, we have $300,000\times500,000=150,000,000,000$ many quarters in the stack. A hundred and fifty billion quarters to the moon! It takes four quarters to make a dollar, so that is worth about $40 billion dollars, which might be less than you would have expected initially. Let’s put those quarters in a big pile, packed as tightly as possible. Each quarter is a little over 2 cm in diameter, and so the volume of the rectangular solid bounding the quarter is a little more than 2 cm by 2 cm by 2 mm, or about 800 cubic mm, which we can round up to 1 cubic centimeter. That makes sense, because we can imagine folding a quarter up into a little 1 cm cube. So 1000 quarters takes up about one liter. Thus, our quarters-to-the-moon have a volume of about 150 million liters. There are 1000 liters in a cubic meter, and so this is 150,000 cubic meters. Since a cube 10 meters on a side, has a volume of 1000 cubic meters, we can fit all those quarters into 150 such cubes. I can easily imagine a giant art installation, with 150 such 10 meter cubes placed in Sheep Meadow in Central Park. They would definitely fit. How much would the quarters weigh? Some of the students in my lecture had worked as cashiers, and so they were familiar with quarter rolls from the bank, which fit in your palm and have ten dollars worth of quarters, or forty quarters. I can imagine a quarter roll in one hand balancing the weight of a half-pound of gouda cheese at the deli in my other hand. So 40 quarters is .5 pound, which makes 100 quarters about 2.2 pounds, which is about 1 kg. We had 150 billion quarters, which is 1.5 billion times 100 quarters, so 1.5 billion kilograms. Since we already imagined the quarters in Sheep Meadow in those 150 large boxes, clearly they weigh far less than the earth and perhaps comparable to the Empire State Building. How much does the Empire State Building weigh? After posing this question, I found out that it is also evidently a famous Google interview question, and you can easily find other blog posts about it. But let me tell you how I thought about it. The Empire State Building is about 100 stories tall, and if we assume 12-15 feet per floor, or about 4 meters, this makes the total height (without tower) about 400 meters. The CUNY Graduate Center where I work is across the corner on Fifth Avenue and 34th street, and I have walked past the Empire State Building many times. I estimate that the base is about 75 meters on Fifth Avenue by 150 meters on 34th Street. But because of the step-backs, the middle part of the tower is considerably less than that, probably 30 by 80 near the top. Let’s say the average cross section of the building is 50 by 100 meters. Now, how much does that weigh? Of course, there is a lot of steel and masonary in the floors and walls, and concrete floors, and also all the contents of the building, with desks and paper files and books and interior walls and doors. Those things are heavy. I really have no idea how much those things weigh, but let me imagine that we build a virtual copy of a complete floor from the Empire State Building, without any of those floor beams or concrete or desks or walls or anything, and instead we flood that virtual room with water, until it has the same weight. Of course, the metal and concrete and masonary is much heavier than water, but the actual space is mostly empty air. How much water would it take to have the same weight? Well, let me just guess that we’d have to flood the virtual room about one-quarter deep with water to have the same weight as the actual materials. At 50 by 100 meters by 4 meters tall for each floor, this would mean a pool of water 50 by 100 meters, one meter deep, for a volume of 5000 cubic meters of water. We already said that one cubic meter of water is 1000 liters, each of which weighs one kilogram, so we are talking about$5000\times 1000$kilograms or 5 million kilograms per floor. With 100 floors, this is 500 million kilograms, or 500,000 metric tons. So according to this estimate, the Empire State Building with all its contents weighs about 500,000 metric tons. So it weighs less than the quarters to the moon (1.5 billion kilograms) —the Empire state building weighs about one-third as much as the quarters to the moon — but these two quantities are in the same ball park. How many babies are born each day in New York City? The population of New York is approximately ten million people. And those people live about 70 years. Let’s imagine spreading those births out uniformly over the 70 year period. That means one million people born in 7 years. This is about 150,000 people born in one year, which is about 3000 births per week, or about 400 births per day. This estimate would be accurate if the city population were in a steady state, with births balancing deaths, but of course the population is increasing. On the other hand, most of that increase is from people moving here, not just being born here. But then again, the people moving to New York I expect are mainly young adults looking for a career, and so they will contribute to a heightened birth rate as they have children. Shall we add 25 percent more to account for the fact that the city is growing and not in a steady state? In that case, our estimate is that there are about 500 births each day in New York City. There is an endless supply of such questions that can be calculated. I have been talking about them in the lectures for my course Math for Liberal Arts, an undergraduate course aimed at non-math students at my college. In this course, we are fairly free to cover some imaginative topics, and I’ve covered some game theory, some elementary graph theory, a little bit logic, and now this number sense topic, which I use to try to develop the students ability to attack a technical problem by breaking it down into more managable problems. But meanwhile, I also look upon it just as plain fun. So here are a few more questions that you might enjoy thinking about. And please make your own. • What is the total volume of air that you have breathed in your lifetime? If you filled a balloon with all your hot air, how big would it be? • If you used that balloon as a hot-air balloon (with the hot air at your body temperature), would it have enough bouyency to lift you and all your possessions? See my post A lifetime of hot air for a detailed answer. • How many NYC metro cards exist? (Each lasts about a year or two; they get thrown out, but still exist in the landfill; the system has used metro cards for about twenty years; there is a large stock of new not-yet-used cards.) • How many doorknobs are there in your building? • How many subway tiles are there in the NYC subway system? • How many riders were on my crowded 4 train downtown this morning? • How many lightswitches are there on your university campus? • How much water does NYC use each day? • What are the odds that you drank a molecule of water that was once also drank by Julius Caesar? • How many people have there been? What fraction are currently alive? Please try to figure them out, and post your solutions in the comments below! # My very first lemma, which also happened to involve a philosophical dispute Let me recall the time of my very first lemma, which also happened to involve a philosophical dispute. It was about 35 years ago; I was a kid in ninth grade at McKinley Junior High School, taking a class in geometry, taught by a charismatic math teacher. We were learning how to do proofs, which in that class always consisted of a numbered list of geometrical assertions, with a specific reason given for each assertion, either stating that it was “given” or that it followed from previous assertions by a theorem that we had come to know. Only certain types of reasons were allowed. My instructor habitually used the overhead projector, writing on a kind of infinite scroll of transparency film, which he could wind up on one end of the projector, so as never to run out of room. During the semester, he had filled enough spools, it seemed, to fill the library of Alexandria. One day, it came to be my turn to present to the rest of the class my proof of a certain geometrical theorem I had been assigned. I took the black marker and drew out my diagram and theorem statement. In my proof, I had found it convenient to first establish a certain critical fact, that two particular line segments in my diagram were congruent$\vec{PQ}\cong\vec{RS}$. In order to do so, I had added various construction lines to my diagram and reasoned with side-angle-side and so on. Having established the congruency, I had then wanted to continue with my proof of the theorem. Since the previous construction lines were cluttering up my diagram, however, I simply erased them, leaving only my original diagram. The class erupted with objection! How could I sensibly continue now with my proof, claiming that$\vec{PQ}\cong\vec{RS}$, after I had erased the construction lines? After all, are those lines segments still congruent once we erase the construction lines that provided the reason we first knew this to be true? Many of the students believed that my having erased the construction lines invalidated my proof. So there I was, in a ninth-grade math class, making a philosophical argument to my fellow students that the truth of the congruence$\vec{PQ}\cong\vec{RS}$was independent of my having drawn the construction lines, and that we could rely on the truth of that fact later on in my proof, even if I were to erase those construction lines. After coming to an uneasy, tentative resolution of this philosophical dispute, I was then allowed to continue with the rest of my proof, establishing the main theorem. I realized only much later that this had been my very first lemma, since I had isolated a mathematically central fact about a certain situation, proving it with a separate argument, and then I had used that fact in the course of proving a more general theorem. # The pirate treasure division problem In my logic course this semester, as a part of the section on the logic of games, we considered the pirate treasure division problem. Imagine a pirate ship with a crew of fearsome, perfectly logical pirates and a treasure of 100 gold coins to be divided amongst them. How shall they do it? They have long agreed upon the pirate treasure division procedure: The pirates are linearly ordered by rank, with the Captain, the first Lieutenant, the second Lieutenant and so on down the line; but let us simply refer to them as Pirate 1, Pirate 2, Pirate 3 and so on. Pirate 9 is swabbing the decks in preparation. For the division procedure, all the pirates assemble on deck, and the lowest-ranking pirate mounts the plank. Facing the other pirates, she proposes a particular division of the gold — so-and-so many gold pieces to the captain, so-and-so many pieces to Pirate 2 and so on. The pirates then vote on the plan, including the pirate on the plank, and if a strict majority of the pirates approve of the plan, then it is adopted and that is how the gold is divided. But if the pirate’s plan is not approved by a pirate majority, then regretfully she must walk the plank into the sea (and her death) and the procedure continues with the next-lowest ranking pirate, who of course is now the lowest-ranking pirate. Suppose that you are pirate 10: what plan do you propose? Would you think it is a good idea to propose that you get to keep 94 gold pieces for yourself, with the six remaining given to a few of the other pirates? In fact, you can propose just such a thing, and if you do it correctly, your plan will pass! Before explaining why, let me tell you a little more about the pirates. I mentioned that the pirates are perfectly logical, and not only that, they have the common knowledge that they are all perfectly logical. In particular, in their reasoning they can rely on the fact that the other pirates are logical, and that the other pirates know that they are all logical and that they know that, and so on. Furthermore, it is common knowledge amongst the pirates that they all share the same pirate value system, with the following strictly ordered list of priorities: Pirate value system: 1. Stay alive. 2. Get gold. 3. Cause the death of other pirates. 4. Arrange that other’s gold goes to the most senior pirates. That is, at all costs, each pirate would prefer to avoid death, and if alive, to get as much gold as possible, but having achieved that, would prefer that as many other pirates die as possible (but not so much as to give up even one gold coin for additional deaths), and if all other things are equal, would prefer that whatever gold was not gotten for herself, that it goes as much as possible to the most senior pirates, for the pirates are, in their hearts, conservative people. So, what plan should you propose as Pirate 10? Well, naturally, the pirates will consider Pirate 10’s plan in light of the alternative, which will be the plan proposed by Pirate 9, which will be compared with the plan of Pirate 8 and so on. Thus, it seems we should propagate our analysis from the bottom, working backwards from what happens with a very small number of pirates. One pirate. If there is only one pirate, the captain, then she mounts the plank, and clearly she should propose “Pirate 1 gets all the gold”, and she should vote in favor of this plan, and so Pirate 1 gets all the gold, as anyone would have expected. Two pirates. If there are exactly two pirates, then Pirate 2 will mount the plank, and what will she propose? She needs a majority of the two pirates, which means she must get the captain to vote for her plan. But no matter what plan she proposes, even if it is that all the gold should go to the captain, the captain will vote against the plan, since if Pirate 2 is killed, then the captain will get all the gold anyway, and because of pirate value 3, she would prefer that Pirate 2 is killed off. So Pirate 2’s plan will not be approved by the captain, and so unfortunately, Pirate 2 will walk the plank. Three pirates. If there are three pirates, then what will Pirate 3 propose? Well, she needs only two votes, and one of them will be her own. So she must convince either Pirate 1 or Pirate 2 to vote for her plan. But actually, Pirate 2 will have a strong incentive to vote for the plan regardless, since otherwise Pirate 2 will be in the situation of the two-pirate case, which ended with Pirate 2’s death. So Pirate 3 can count on Pirate 2’s vote regardless, and so Pirate 3 will propose: Pirate 3 gets all the gold! This will be approved by both Pirate 2 and Pirate 3, a majority, and so with three pirates, Pirate 3 gets all the gold. Four pirates. Pirate 4 needs to have three votes, so she needs to get two of the others to vote for her plan. She notices that if she is to die, then Pirates 1 and 2 will get no gold, and so she realizes that if she offers them each one gold coin, they will prefer that, because of the pirate value system. So Pirate 4 will propose to give one gold coin each to Pirates 1 and 2, and 98 gold coins to herself. This plan will pass with the votes of 1, 2 and 4. Five pirates. Pirate 5 needs three votes, including her own. She can effectively buy the vote of Pirate 3 with one gold coin, since Pirate 3 will otherwise get nothing in the case of four pirates. And she needs one additional vote, that of Pirate 1 or 2, which she can get by offering two gold coins. Because of pirate value 4, she would prefer that the coins go to the highest ranking pirate, so she offers the plan: two coins to Pirate 1, nothing to pirate 2, one coin to pirate 3, nothing to Pirate 4 and 97 coins to herself. This plan will pass with the votes of 1, 3 and 5. Six pirates. Pirate 6 needs four votes, and she can buy the votes of Pirates 2 and 4 with one gold coin each, and then two gold coins to Pirate 3, which is cheaper than the alternatives. So she proposes: one coin each to 2 and 4, two coins to 3 and 96 coins for herself, and this passes with the votes of 2, 3, 4 and 6. Seven pirates. Pirate 7 needs four votes, and she can buy the votes of Pirates 1 and 5 with only one coin each, since they get nothing in the six-pirate case. By offering two coins to Pirate 2, she can also get another vote (and she prefers to give the extra gold to Pirate 2 than to other pirates in light of the pirate values). Eight pirates. Pirate 8 needs five votes, and she can buy the votes of Pirates 3, 4 and 6 with one coin each, and ensure another vote by giving two coins to Pirate 1, keeping the other 95 coins for herself. With her own vote, this plan will pass. Nine pirates. Pirate 9 needs five votes, and she can buy the votes of Pirates 2, 5 and 7 with one coin each, with two coins to Pirate 3 and her own vote, the plan will pass. Ten pirates. In light of the division offered by Pirate 9, we can now see that Pirate 10 can ensure six votes by proposing to give one coin each to Pirates 1, 4, 6 and 8, two coins to Pirate 2, and the remaining 94 coins for herself. This plan will pass with those pirates voting in favor (and herself), because they each get more gold this way than they would under the plan of Pirate 9. We can summarize the various proposals in a table, where the$n^{\rm th}$row corresponds to the proposal of Pirate$n$. 1 2 3 4 5 6 7 8 9 10 One pirate 100 Two pirates * X Three pirates 0 0 100 Four pirates 1 1 0 98 Five pirates 2 0 1 0 97 Six pirates 0 1 2 1 0 96 Seven pirates 1 2 0 0 1 0 96 Eight pirates 2 0 1 1 0 1 0 95 Nine pirates 0 1 2 0 1 0 1 0 95 Ten pirates 1 2 0 1 0 1 0 1 0 94 There are a few things to notice, which we can use to deduce how the pattern will continue. Notice that in each row beyond the third row, the number of pirates that get no coins is almost half (the largest integer strictly less than half), exactly one pirate gets two coins, and the remainder get one coin, except for the proposer herself, who gets all the rest. This pattern is sustainable for as long as there is enough gold to implement it, because each pirate can effectively buy the votes of the pirates getting$0$under the alternative plan with one fewer pirate, and this will be at most one less than half of the previous number; then, she can buy one more vote by giving two coins to one of the pirates who got only one coin in the alternative plan; and with her own vote this will be half plus one, which is a majority. We can furthermore observe that by the pirate value system, the two coins will always go to either Pirate 1, 2 or 3, since one of these will always be the top-ranked pirate having one coin on the previous round. They each cycle with the pattern of 0 coins, one coin, two coins in the various proposals. At least until the gold becomes limited, all the other pirates from Pirate 4 onwards will alternate between zero coins and one coin with each subsequent proposal, and Pirate$n-1$will always get zero from Pirate$n$. For this reason, we can see that the pattern continues upward until at least Pirate 199, whose proposal will follow the pattern: 199 Pirates: 1 2 0 0 1 0 1 0 1 0 1 0 1$\dots$1 0 1 0 0 It is with Pirate 199, specifically, that for the first time it takes all one hundred coins to buy the other votes, since she must give ninety-eight pirates one coin each, and two coins to Pirate 2 in order to have one hundred votes altogether, including her own, leaving no coins left over for herself. For this reason, Pirate 200 will have a successful proposal, since she no longer needs to spend two coins for one vote, as the proposal of Pirate 199 has one hundred pirates getting zero. So Pirate 200 can get 100 votes by proposing one coin to everyone who would get zero from 199, plus her own vote, for a majority of 101 votes. 200 pirates: 0 0 1 1 0 1 0 1 0 1 0$\dots$0 1 0 1 1 0 Pirate 201 also needs 101 votes, which she can get by giving all the zeros of the 200 case one coin each, plus her own vote. The unfortunate Pirate 202, however, needs 102 votes, and this will not be possible, since she has only 100 coins, and so Pirate 202 will die. The interesting thing to notice next is that Pirate 203 will therefore be able to count on the vote of Pirate 202 without paying any gold for it, and so since she needs only 100 additional votes (after her own vote and Pirate 202’s vote), she will be able to buy 100 votes for one coin each. Pirate 204 will again be one coin short, and so she will die. Although Pirate 205 will be able to count on that one additional free vote, this will be insufficient to gain a passing proposal, because she will be able to buy one hundred votes with the coins, plus her own vote and the free vote of Pirate 204, making 102 votes altogether, which is not a majority. Similarly, Pirate 206 will fall short, because even with her vote and the free votes of 204 and 205, she will be able to get at most 103 votes, which is not a majority. Thus, Pirate 207 will be able to count on the votes of Pirates 204, 205, and 206, which with her own vote and 100 more votes gotten by giving one coin each to the pirates who would otherwise get nothing, we can obtain 104 votes, which is a majority. The reader is encouraged to investigate further to see how the pattern continues. It is a fun problem to work out! What emerges is the phenomenon by which longer and longer sequences of pirates in a row find themselves unable to make a winning proposal, and then suddenly a pirate is able to survive by counting on their votes. It is very interesting also to work out what happens when there is a very small number of coins. For example, if there is only one gold coin, then already Pirate 4 is unable to make a passing proposal, since she can buy only one other vote, and with her own this will make only two votes, falling short of a majority. With only one coin, Pirate 5 will survive by buying a vote from Pirate 1 and counting on the vote of Pirate 4 and her own vote, for a majority. Even the case of zero coins is interesting to think about! In this case, there is no gold to distribute, and so the voting is really just about whether the pirate should walk the plank or not. If only one pirate, she will live. Pirate 2 will die, since Pirate 1 will vote against. But for that reason, Pirate 2 will vote in favor of Pirate 3, who will live. The pattern that emerges is: lives, dies, lives, dies, dies, dies, lives, dies, dies, dies, dies, dies, dies, dies, lives, …. After each successful proposal, where the pirates lives, for subsequently larger numbers of pirates, there must be many deaths in a row in order for the proposal to count on enough votes. So after each “lives” in the pattern, you have to double the length with many “dies” in a row, before there will be enough votes to support the next pirate who lives. See also the Pirate Game entry on Wikipedia, which is a slightly different formulation of the puzzle, since tie-votes are effectively counted as success in that variation. For this reason, the outcomes are different in that variation. I prefer the strict-majority variation, since I find it interesting that one must sometimes use two gold coins to gain the majority, and also because the death of Pirate 2 arrives right away in an interesting way, rather than having to wait for 200 or more pirates as with the plurality version. Another (inessential) difference in presentation is that in the other version of the puzzle, they have the captain on the plank first, and then always the highest-ranking pirate making the proposal, rather than the lowest-ranking pirate. This corresponds simply to inverting the ranking, and so it doesn’t change the results. The puzzle appears to have been around for some time, but I am unsure of the exact provenance. Ian Stewart wrote a popular 1998 article for Scientific American analyzing the patterns that arise when the number of pirates is large in comparison with the number of gold pieces. # How does a slinky fall? Have you ever observed carefully how a slinky falls? Suspend a slinky from one end, letting it hang freely in the air under its own weight, and then, let go! The slinky begins to fall. The top of the slinky, of course, begins to fall the moment you let go of it. But what happens at the bottom of the slinky? Does it also start to fall at the same moment you release the top? Or perhaps it moves upward, as the slinky contracts as it falls? Or does the bottom of the slinky simply hang motionless in the air for a time? The surprising fact is that indeed the bottom of the slinky doesn’t move at all when you release the top of the slinky! It hangs momentarily motionless in the air in exactly the same coiled configuration that it had before the drop. This is the surprising slinky drop effect. My son (age 13, eighth grade) took up the topic for his science project this year at school. He wanted to establish the basic phenomenon of the slinky drop effect and to investigate some of the subtler aspects of it. For a variety of different slinky types, he filmed the slinky drops against a graded background with high-speed camera, and then replayed them in slow motion to watch carefully and take down the data. Here are a few sample videos. He made about a dozen drops altogether. For the actual data collection, the close-up videos were more useful. Note the ring markers A, B, C, and so on, in some of the videos. See more videos here. For each slinky drop video, he went through the frames and recorded the vertical location of various marked rings (you can see the labels A, B, C and so on in some of the videos above) into a spreadsheet. From this data he then produced graphs such as the following for each slinky drop: In each case, you can see clearly in the graph the moment when the top of the slinky is released, since this is the point at which the top line begins to descend. The thing to notice next — the main slinky drop effect — is that the lower parts of the slinky do not move at the same time. Rather, the lower lines remain horizontal for some time after the drop point. Basically, they remain horizontal until the bulk of the slinky nearly descends upon them. So the experiments clearly establish the main slinky drop phenomenon: the bottom of the slinky remains motionless for a time hanging in the air unchanged after the top is released. In addition to this effect, however, my son was focused on investigating a much more subtle aspect of the slinky drop phenomenon. Namely, when exactly does the bottom of the slinky start to move? Some have said that the bottom moves only when the top catches up to it; but my son hypothesized, based on observations, as well as discussions with his father and uncles, that the bottom should start to move slightly before the bulk of the slinky meets it. Namely, he thought that when you release the top of the slinky, a wave of motion travels through the slinky, and this wave travels slightly fast than the top of the slinky falls. The bottom moves, he hypothesized, when the wave front first gets to the bottom. His data contains some confirming evidence for this subtler hypothesis, but for some of the drops, the experiment was inconclusive on this smaller effect. Overall, he had a great time undertaking the science project. June 2016 Update: On the basis of his science fair poster and presentation, my son was selected as nominee to the Broadcom Masters national science fair competition! He is now competing against other nominees (top 10% of participating science fairs) for a chance to present his research in Washington at the final national competition next October. September 2016 Update: My son has now been selected as a Broadcom Masters semi-finalist, placing him in the top 300 amongst more than 6000 nominees. The finalists will be chosen in a few weeks, with the chance to present in Washington, D.C. # Math for nine-year-olds: fold, punch and cut for symmetry! Today I had the pleasure to visit my daughter’s fourth-grade classroom for some fun mathematical activities. The topic was Symmetry! I planned some paper-folding activities, involving hole-punching and cutting, aiming to display the dynamism that is present in the concept of symmetry. Symmetry occurs when a figure can leap up, transforming itself through space, and land again exactly upon itself in different ways. I sought to have the students experience this dynamic action not only in their minds, as they imagined various symmetries for the figures we considered, but also physically, as they folded a paper along a line of symmetry, checking that this fold brought the figure exactly to itself. The exercises were good plain fun, and some of the kids wanted to do them again and again, the very same ones. Here is the pdf file for the entire project. To get started, we began with a very simple case, a square with two dots on it, which the girls folded and then punched through with a hole punch, so that one punch would punch through both holes at once. Next, we handled a few patterns that required two folds. I told the kids to look for the lines of symmetry and fold on them. Fold the paper in such a way that with ONE punch, you can make exactly the whole pattern. Don’t worry if the holes you punch do not line up exactly perfectly with the dots — if the holes are all touching their intended target and there are the right number of holes, then I told the kids it is great! The three-fold patterns are a bit more challenging, but almost all of the kids were able to do it. They did need some help punching through, however, since it sometimes requires some strength to punch through many layers. With these further patterns, some of the folds don’t completely cover the paper. So double-check and make sure that you won’t end up with unwanted extra holes! The second half of the project involved several cutting challenges. To begin, let’s suppose that you wanted to cut a square out of the middle of a piece of paper. How would you do it? Perhaps you might want to poke your scissors through and then cut around the square in four cuts. But can you do it in just ONE cut? Yes! If you fold the paper on the diagonals of the square, then you can make one quick snip to cut out exactly the square, leaving the frame completely intact. Similarly, one can cut out many other shapes in just one cut. The rectangle and triangle are a little trickier than you might think at first, since at a middle stage of folding, you’ll find that you end up folding a shorter line onto a longer one, but the point is that this is completely fine, since one cut will go through both. Give it a try! Here are a few harder shapes, requiring more folds. It is an amazing mathematical fact, the fold-and-cut theorem, that ANY shape consisting of finitely many straight line segments can be made with just one cut after folding. Here are a few challenges, which many of the fourth-graders were able to do during my visit. What a lot of fun the visit was! I shall look forward to returning to the school next time. In case anyone is interested, I have made available a pdf file of this project. I would like to thank Mike Lawler, whose tweet put me onto the topic. And see also the awesome Numberphile video on the fold-and-cut theorem, featuring mathematician Katie Steckles. See more of my Math for Kids! # Win at Nim! The secret mathematical strategy for kids (with challange problems in transfinite Nim for the rest of us) Welcome to my latest instance of Math for Kids! Today I had the pleasure to make an interactive mathematical presentation at my son’s school to the 7th / 8th grade Math Team, about 30 math-enthusiastic kids (twelve and thirteen years old) along with their math teachers and the chair of the school math department. The topic was the game of Nim! This game has a secret mathematical strategy enabling anyone with that secret knowledge to win against those without it. It is a great game for kids, because with the strategy they can realistically expect to beat their parents, friends, siblings and parent’s friends almost every single time! To play Nim, one player sets up a number of piles of blocks, and the opponent chooses whether to go first or second. The players take turns removing blocks — each player may remove any number of blocks (at least one) from any one pile, and it is fine to take a whole pile — whichever player takes the last block wins. For the math team, we played a few demonstration games, in which I was able to beat all the brave challengers, and then the kids paired off to play each other and gain familiarity with the game. Then, it was time for the first strategy discussion. What could the secret winning strategy be? I explained to the kids a trick that mathematicians often use when approaching a difficult problem, namely, to consider in detail some very simple special cases or boundary instances of the problem. It often happens that these special cases reveal a way of thinking about the problem that applies much more generally. Perhaps one of the easiest special cases of Nim occurs when there is only one pile. If there is only one pile, then clearly one wants to go first, in order to make the winning move: take the entire pile! Two balanced piles A slightly less trivial and probably more informative case arises when there are exactly two piles. If the stacks have the same height, then the kids realized that the second player could make copying moves so as to preserve this balanced situation. The key insight now is that this copying strategy is a winning strategy, because if one can always copy, then in particular one will have a move whenever the opponent did, and so the opponent will never take the last block. With two piles, therefore, one wants always to make them balanced. If they are initially unbalanced, then choose to go first and follow the balancing strategy. If they are initially balanced, then choose to go second, and copy whatever moves your opponent makes to rebalance them. A balanced position With that insight, it is not difficult to see that it is winning to leave a position with any number of pairs of balanced piles. One can in effect play on each pair separately, because whenever the opponent makes a move on one of the piles, one can copy the move with the corresponding partner pile. In this way, we may count such a position overall as balanced. The more fundamental game-theoretic observation to make is that balanced piles in effect cancel each other out in any position, and one can ignore them when analyzing a position. When two balanced piles are present in a possibly more complicated position, one can pretend that they aren’t there, precisely because whenever your opponent plays on one of them, you can copy the move on the other, and so any winning strategy for the position in which those piles are absent can be converted into a winning strategy in which the balanced piles are present. This idea now provides a complete winning strategy in the case that all piles have height one or two at most. One wants to leave a position with an even number of piles of each height. If only one height has an odd number of piles, then take a whole pile of that height. And if there are odd numbers of piles both of height one and two, then turn a height-two pile into a pile of height one, and this will make them both even. So any unbalanced position can be balanced, and any move on a balanced position will unbalance it. 1+2+3 counts as balanced Let’s now consider that there may be piles of height three. For example, consider the basic position with piles of height one, two and three. The observation to make here is that any move on this position can be replied to with a move that leaves it balanced (check it yourself to be sure!). It follows that this position is winning to leave for the other player (and so one should go second on$1+2+3$). It would be nice if we could consider this position itself as already balanced in some sense. Indeed, we may incorporate this situation into the balancing idea if we think of the pile of height three as really consisting of two subpiles, one of height two and one of height one. In this way, the Nim position 1+2+3 counts as balanced, since the 3 counts as 2+1, which balances the other stacks. The 1+2+3 position has two stacks of height two and two of height one, when one regards the stack of height three as having a substack of height two and a substack of height one. This way of thinking produces a complete winning strategy for Nim positions involving piles of height at most three. (And this is a strategy that can be mastered even by very young children — a few years ago I had talked about Nim with much younger children, Math for six-year-olds: Win at Nim!, first-graders at my daughter’s school, and at that time we concentrated on posititions with piles of height at most three. Older kids, however, can handle the full strategy.) Namely, the winning strategy in this case is to strive to balance the position, to make an even number overall of piles of height one and two, where we count piles of height three as one each of one and two. If you always give your opponent a balanced position, then you will win! Faced with an unbalanced position, it is a fact that you can always find a balancing move, and any move on an balanced position will unbalance it. If the game is just starting, and you are deciding whether to go first or second, you should determine whether it is balanced yet or not. If it unbalanced, then you should go first and make the balancing move; if it is already balanced, then you should go second and adopt the copying strategy, in which you re-balance the position with each move. The general winning strategy, of course, goes beyond three. The key idea is to realize that what is really going on when we represent$3$as$2+1$is that we are using the binary representation of the number$3$. To explain, I wrote the following numbers on the chalkboard $$1,\ 2,\ 4,\ 8,\ 16,\ 32,\ 64,\ \cdots$$ and was very pleased when the kids immediately shouted out, “The powers of two!” I explained that any natural number can be expressed uniquely as a sum of distinct powers of two. Asked for a favorite number less than one hundred, one student suggested$88$, and together we calculated $$88=64+16+8,$$ which means that the binary representation of$88$is$1011000$, which I read off as, “one$64$, no$32$s, one$16$, one$8$, no$4$s, no$2$s and no$1$s. This is just the same as thinking of$9572$as 9 thousands, 5 hundreds, 7 tens and 2 ones, using the powers of ten. It is interesting to learn that one may easily count very high on one hand using binary, up to 1023 on two hands! The general strategy is to view every Nim pile as consisting of subpiles whose height is a power of two, and to make sure that one leaves a position that is balanced in the sense that every power of two has an even number of such instances in the position. So we think of$3$as really$2+1$for the purposes of balancing;$4$counts as itself because it is a power of two, but$5$counts as$4+1$and$6$counts as$4+2$and$7$as$4+2+1$. Another way to describe the strategy is that we express all the pile heights in binary, and we want an even number of$1$s in each binary place position. The mathematical facts to verify are (1) any move on a balanced position in this powers-of-two sense will cause it to become unbalanced, and (2) any unbalanced position can be balanced in one move. It follows that leaving balanced positions is a winning strategy, because the winning move of taking the last block is a balancing move rather than an unbalancing move. One can prove statement (1) by realizing that when you move a single stack, the binary representation changes, and so whichever binary digits changed will now become unbalanced. For statement (2), consider the largest unbalanced power of two$2^k$and move on any stack that contains a$2^k$size substack. Since$2^k-1=111\cdots11$in binary, one can attain any binary pattern for the smaller height stacks by removing between$1$and$2^k$many blocks. So one can balance the position. As a practical matter, the proof of (2) also shows how one can find a (winning) balancing move, which can otherwise be difficult in some cases: look for the largest unbalanced power of two, and move on any pile containing such a subpile, making sure to leave a balanced position. In most actual instances of Nim, the pile heights are rarely very tall, and so one is usually considering just$1$,$2$and$4$as the powers of two that arise. A traditional starting configuration has piles of height 1, 3, 5, and 7, and this position is balanced, because one may view it as:$1, 2+1, 4+1, 4+2+1$, and there are an even number of 1s, 2s and 4s. It is interesting to consider also the Misère form of Nim, where one wants NOT to take the last block. This version of the game also has a secret mathematical strategy, which I shall reveal later on. Challenge 1. What is the winning strategy in Misère Nim? If you figure it out, please post a comment! I’ll post the solution later. One might naively expect that the winning strategy of Misère Nim is somehow totally opposite to the winning strategy of regular Nim, but in fact, the positions$1,2,3$and$1,3,5,7$are winning for the second player both in Nim and also in Misère Nim. Indeed, I claim that all nontrivial Nim positions that are winning for regular Nim (with a suitable meaning of “nontrivial”) are also winning for Misère Nim. Can you prove it? Another interesting generalization, for the set-theorists, is to consider transfinite Nim, where the piles can have transfinite ordinal height. So we have finitely many piles of ordinal height, perhaps infinite, and a move consists of making any one pile strictly shorter. Since there are no infinite descending sequence of ordinals, the game will terminate in finitely many moves, and the winner is whowever removes the last block. Challenge 2. Who wins the transfinite Nim game with piles of heights: $$1\qquad \omega+3\qquad \omega^\omega+5\qquad \omega^{\omega+3}+\omega^\omega\cdot3+\omega\cdot 5+7\qquad \epsilon_0\qquad \omega_1$$ and what are the winning moves? What is the general winning strategy for transfinite Nim? Post your solutions! You can also see my solution and further discussion. # Math for eight-year-olds: graph theory for kids! This morning I had the pleasure to be a mathematical guest in my daughter’s third-grade class, full of inquisitive eight- and nine-year-old girls, and we had a wonderful interaction. Following up on my visit last year (math for seven-year-olds), I wanted to explore with them some elementary ideas in graph theory, which I view as mathematically rich, yet accessible to children. My specific aim was for them to discover on their own the delightful surprise of the Euler characteristic for connected planar graphs. We began with a simple example, counting together the number of vertices, edges and regions. For counting the regions, I emphasized that we count the “outside” region as one of the regions. Then, I injected a little mystery by mentioning that Euler had discovered something peculiar about calculating$V-E+R$. Could they find out what it was that he had noticed? Each student had her own booklet and calculated the Euler characteristic for various small graphs, as I moved about the room helping out. Eventually, the girls noticed the peculiar thing — they kept getting the number two as the outcome! I heard them exclaim, why do we keep getting two? They had found Euler’s delightful surprise! The teachers also were very curious about it, and one of them said to me in amazement, “I really want to know why we always get two!” With the students, I suggested that we try a few somewhat more unusual cases, to see how robust the always-two situation really was. But in these cases, we still got two as the result. The girls made their own graphs and tested the hypothesis. Eventually, of course, I managed to suggest some examples that do in fact test the always-two phenomenon, first by looking at disconnected graphs, and then by considering graphs drawn with crossing edges. In this way, we were led to refine the$V-E+R=2$hypothesis to the case of connected planar graphs. Now, it was time for proof. I was initially unsure whether I should actually give a proof, since these were just third-graders, after all, and I thought it might be too difficult for the students. But when the teachers had expressed their desire to know why, they had specifically encouraged me to give the argument, saying that even if some students didn’t follow it, there was still value merely in seeing that one can mount an argument like that. Excellent teachers! The idea of the proof is that$V-E+R=2$is true at the start, in the case of a graph consisting of one vertex and no edges. Furthermore, it remains true when one adds one new vertex connected by one new edge, since the new vertex and new edge cancel out. Also, it remains true when one carves out a new region from part of an old region with the addition of a single new edge, since in this case there is one new edge and one new region, and these also balance each other. Since any connected planar graph can be built up in this way by gradually adding new vertices and edges, this argument shows$V-E+R=2$for any connected planar graph. This is a proof by mathematical induction on the size of the graph. Next, we moved on to consider some three-dimensional solids and their surfaces. With various polyhedra, and the girls were able to verify further instances of$V-E+R=2$. The girls then drew their own solids and calculated the Euler characteristic. I taught them how to draw a cube and several other solids pictured here; when the shape is more than just a simple cube, this can be a challenge for a child, but some of the children made some lovely solids: In the end, each child had a nice little booklet to take home. The images above are taken from one of the students in the class. What a great day! You can find a kit of the booklet here: See also the account of my previous visit: Math for seven-year-olds: graph coloring and Eulerian paths. # The rule-making game Let me tell you about a new game that we’ve been playing in our family, the rule-making game. It is a talking game, requiring no pieces or objects of any kind, and it can easily be played whilst walking or traveling. My children and I recently played several rounds of it walking around London on a recent visit there. The game has no rules, initially, nor even any definite procedure — it is different every time — but things usually become clear soon enough. It usually makes a better game to cooperate on the first several turns to lay the groundwork. Let me explain how to play simply by example: Papa: The first rule is that the players shall take turns making rules, and that every rule shall have a rule number, which is incremented on each turn. Horatio: The second rule is that the players must state their rules in the form, “The first rule is…” or “the second rule is…” and so on, and that players are not allowed to ask what is the current rule number, or they lose. Hypatia: The third rule is that the other players must say, “thank you” after another player makes a rule. (… “thank you”…. “thank you”….) Papa: The fourth rule is that the rules must not contradict each other, and no rule is allowed that abrogates an earlier rule. (… “thank you”…. “thank you”….) Horatio: The fifth rule is that after making an odd-numbered rule, the player must stomp on the ground. (STOMP… “thank you”…. “thank you”….) Hypatia: The sixth rule is that no player may win immediately after their own rule. (… “thank you”…. “thank you”….) Papa: The seventh rule is that right after a player stomps according to rule five, the other two players must hop. (STOMP … “thank you”…. “thank you”….HOP….HOP…) Horatio: The eighth rule is that if a player loses, then the game continues without that person. (… “thank you”…. “thank you”….) Hypatia: The ninth rule is that after stating a rule, the other two players must state a different color. (STOMP … “thank you”…. “thank you”….HOP…HOP… “blue”… “green”…) Papa: The tenth rule is that furthermore, those colors must never repeat, and they must be stated simultaneously, on the count of 1-2-3. (… “thank you”…. “thank you”…. “1-2-3: neon green / violet”) Horatio: The eleventh rule is that if there is only one player left, then that player wins. (STOMP … “thank you”…. “thank you”….HOP…HOP… “1-2-3: red/orange”) Hypatia: The twelfth rule is that every player must jump up and down (…jump…) while stating their rule. (….jump jump jump…) (… “thank you”…. “thank you”…. “1-2-3: pink/turquoise”) Papa: (jump jump…) The thirteenth rule is that (…jump…) in the case of dispute (…jump…), the question of whether or not someone has violated or followed a rule shall be decided by majority vote (…jump…). (STOMP … “thank you”…. “thank you”….HOP…HOP… “1-2-3: yellow/brown”) Horatio: (jump….) The fourteenth rule is that (…jump…) before stating their rule, the players must state a country, and that whoever repeats a country loses (…jump…) (… “thank you”…. “thank you”…. “1-2-3: black/gray”) Hypatia: (jump…) Germany. The fifteenth rule is that (…jump…) there can be at most twenty-five rules. (STOMP … “thank you”…. “thank you”….HOP…HOP… “1-2-3: sky blue / peach”) Papa: (jump…) United States. The sixteenth rule is that (…jump…) if all current players lose at the same time after a rule, then the player previous to that rule-maker is declared the “honorary winner”. (…jump…) (… “thank you”…. “thank you”…. “1-2-3: white / white”) Oh no! Since both Horatio and Hypatia said “white”, they both lose. And then Papa also loses in light of rule six. So we’ve all lost! But then, in light of rule sixteen, Hypatia is declared the honorary winner! Hooray for Hypatia! I hope you all get the idea. Please enjoy! And report your crazy or interesting rules in the comments below. # Math for seven-year-olds: graph coloring, chromatic numbers, and Eulerian paths and circuits As a guest today in my daughter’s second-grade classroom, full of math-enthusiastic seven-and-eight-year-old girls, I led a mathematical investigation of graph coloring, chromatic numbers, map coloring and Eulerian paths and circuits. I brought in a pile of sample graphs I had prepared, and all the girls made up their own graphs and maps to challenge each other. By the end each child had compiled a mathematical “coloring book” containing the results of their explorations. Let me tell you a little about what we did. We began with vertex coloring, where one colors the vertices of a graph in such a way that adjacent vertices get different colors. We started with some easy examples, and then moved on to more complicated graphs, which they attacked. The aim is to use the fewest number of colors, and the chromatic number of a graph is the smallest number of colors that suffice for a coloring. The girls colored the graphs, and indicated the number of colors they used, and we talked as a group in several instances about why one needed to use that many colors. Next, the girls paired off, each making a challenge graph for her partner, who colored it, and vice versa. Map coloring, where one colors the countries on a map in such a way that adjacent countries get different colors, is of course closely related to graph coloring. The girls made their own maps to challenge each other, and then undertook to color those maps. We discussed the remarkable fact that four colors suffice to color any map. Next, we considered Eulerian paths and circuits, where one traces through all the edges of a graph without lifting one’s pencil and without retracing any edge more than once. We started off with some easy examples, but then considered more difficult cases. An Eulerian circuit starts and ends at the same vertex, but an Eulerian path can start and end at different vertices. We discussed the fact that some graphs have no Eulerian path or circuit. If there is a circuit, then every time you enter a vertex, you leave it on a fresh edge; and so there must be an even number of edges at each vertex. With an Eulerian path, the starting and ending vertices (if distinct) will have odd degree, while all the other vertices will have even degree. It is a remarkable fact that amongst connected finite graphs, those necessary conditions are also sufficient. One can prove this by building up an Eulerian path or circuit (starting and ending at the two odd-degree nodes, if there are such); every time one enters a new vertex, there will be an edge to leave on, and so one will not get stuck. If some edges are missed, simply insert suitable detours to pick them up, and again it will all match up into a single path or circuit as desired. (But we didn’t dwell much on this proof in the second-grade class.) Meanwhile, this was an excellent opportunity to talk about The Seven Bridges of Königsberg. Is it possible to tour the city, while crossing each bridge exactly once? The final result: a booklet of fun graph problems! The high point of the day occurred in the midst of our graph-coloring activity when one little girl came up to me and said, “I want to be a mathematician!” What a delight! Andrej Bauer has helped me to make available a kit of my original uncolored pages (see also post at Google+), if anyone should want to use them to make their own booklets. Just print them out, copy double-sided (with correct orientation), and fold the pages to assemble into a booklet; a few staples can serve as a binding. See also my followup question on MathOverflow, for the grown-ups, concerning the computational difficulty of producing hard-to-color maps and graphs. Finally, check out my new book: # Doubled, squared, cubed: a math game for kids or anyone ## The number that must not be named Doubled, squared, cubed is a great math game to play with kids or anyone interested in math. It is a talking game, requiring no pieces or physical objects, played by a group of two or more people at almost any level of mathematical difficulty, while sitting, walking, boating or whatever. We play it in our family (two kids, ages 7 and 11) when we are sitting around a table or when walking somewhere or when traveling by train. I fondly recall playing the game with my brothers and sisters in my own childhood. The game proceeds by first agreeing on an allowed number range. For youngsters, perhaps one wants to allow the integers from 0 to 100, inclusive, but one will want to have negative numbers soon enough, and of course much more sophisticated play is possible. Eventually, one lessens or even abandons the restriction altogether. The first player offers a number, and each subsequent player in turn offers a mathematical operation, which is to be applied to the current number, which must not be mentioned explicitly. The resulting number must be in the allowed number range. The goal of the game is successfully to keep track of the number as it changes, and to offer an operation that makes sense with that number, while staying within the range of allowed numbers. The point is to have some style, to offer an operation that proves that you know what the number is, without stating the number explicitly. Perhaps your operation makes the new number a nice round number, or perhaps your operation can seldom be legally applied, and so applying it indicates that you know it is allowed to do so. You must offer only operations that you yourself can compute, and which do not rely on hidden information (for example, “times the number of grapes I ate at breakfast” is not really permissible). A losing move is one that doesn’t make sense or that results in a number outside the allowed range. In this case, the game can continue without that person, and the last person left wins. It is not allowed to offer an operation that can always be applied, such as “times zero” or “minus itself“, or which can always be applied immediately after the previous operation, such as saying “times two” right after someone said, “cut in half”. But in truth, the main point is to have some fun, rather than to win. Part of the game is surely simply to talk about new mathematical operations, and we usually take time out to discuss or explain any mathematical issue that may come up. So this is an enjoyable way for the kids to encounter new mathematical ideas. Let me simply illustrate a typical progression of the game, as it might be played in my family: Hypatia: one Barbara: doubled Horatio: squared Joel: cubed Hypatia: plus 36 Barbara: square root Horatio: divided by 5 Joel: times 50 Hypatia: minus 100 Barbara: times 6 billion Horatio: plus 99 Joel: divided by 11 Hypatia: plus 1 Barbara: to the power of two Horatio: minus 99 Joel: times itself 6 billion times Hypatia: minus one Barbara: divided by ten thousand Horatio: plus 50 Joel: plus half of itself Hypatia: plus 25 Barbara: minus 99 Horatio: cube root Joel: next prime number above Hypatia: ten’s complement Barbara: second square number above Horatio: reverse the digits Joel: plus 3 more than six squared Hypatia: minus 100 and so on! As the kids get older, we gradually incorporate more sophisticated elements into the game, and take a little time out to explain to young Hypatia, for example, what it means to cube a number, to take a number to the power two, or what a prime number is. I remember playing the game with my math-savvy siblings when I was a kid, and the running number was sometimes something like$\sqrt{29}$or$2+3i$, and a correspondingly full range of numbers and operations. It is fine to let the youngest drop out after a while, and continue with the older kids with more sophisticated operations; the youngsters will rejoin in the next round. In my childhood, we had a “challenge” rule, used when someone suspects that someone else doesn’t know the number: when challenged, the person should say the number; if incorrect, they are out, and otherwise the challenger is out. Last weekend, I played the game with Horatio and Hypatia as we walked through Central Park to the Natural History Museum, and they conspired in whispering tones to mess me up, until finally I lost track of the number and they won… # Largest-number contest: what is the largest number that you can describe on an index card? My recent talk, Playful Paradox with large numbers, infinity and logic, was a romp through various paradoxical topics in mathematics and logic for an audience of about 150 mathematics and philosophy undergraduate students at Fudan University in Shanghai, and since Fudan is reportedly a top-three university in China, you can expect that the audience was sharp. For a bit of fun, and in keeping with the theme, we decided to hold a Largest-Number Contest: what is the largest number that one can describe on an ordinary index card? My host Ruizhi Yang made and distributed announcement flyers for the contest a week before the talk, with a poster of the talk on one side, and a description of the contest rules on the other, and including a small card to be used for submissions. The rules were as follows: 1. A submission entry consists of the description of a positive integer written on an ordinary index card, using common mathematical operations and notation or English words and phrases. 2. Write legibly, and use at most 100 symbols from the ordinary ASCII character set. Bring your submission to the talk. 3. Descriptions that fail to describe a number are disqualified. 4. The submission with the largest number wins. 5. The prize will be$1 million USD, divided by the winning number itself, rounded to the nearest cent, plus another small token prize.

Example submissions:

$99999$.

$10*(10*99)+5$

The population of Shanghai at this moment.

The submissions were collected at the beginning of my talk, which I began with a nod to Douglas Hofstadter, who once held a similar contest in his column at Scientific American magazine.

To win, I began, it seemed clear that the submitted number must be very large, and since the prize was to be one million dollars divided by the winning number, then this would mean that the prize should become small.  So the goal is just to win, and not necessarily to win any money.  The real prize is Pride.

So, what kind of numbers should one submit?  Of course, anyone could think to write $99999$, or one million ($10^6$), one billion ($10^9$) or one trillion ($10^{12}$).  Perhaps someone would want to write a googol, which is a common name for $10^{100}$.  Written in decimal, this is $$10000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000,$$ which is one character too long for the rules, so one should use the more compact formulations.  But let’s just write it as googol.

But we can easily describe much larger numbers.  A googol plex is the number $10^{\text{googol}}$, written in decimal with a $1$ followed by a googol many $0$s. A googol plex is also describable as $10^{10^{100}}$.  An iterated exponential $a^{b^c}$ could be seen as ambiguous, referring either to $(a^b)^c$ or to $a^{(b^c)}$. These are not always the same, and so exponentiation is not an associative operation.  But which is bigger?  If the aim is to make big numbers, then generally one wants to have as big an exponent as possible, and so $a^{(b^c)}$ will be much larger than $(a^b)^c$, which is just $a^{bc}$, and so we will always associate our exponentials upward. (Nevertheless, the upward associated form is not always larger, which you can see in the case of $2^{1^{100}}$.) In general, we can apply the plex operation to any number, so $x$ plex just means $10^x$.

A googol bang means googol !, or googol factorial, the number obtained by multiplying googol $\cdot$ (googol -1) $\cdots 2\cdot 1$, and in general, $x$ bang means $x!$.  Which is bigger, a googol plex or a googol bang? If you think about it, a googol plex is 10 times 10 times 10, and so on, with a googol number of terms. Meanwhile, the factorial googol bang also has a googol terms in it, but most of them (except for 10) are far larger than 10, so it is clear that a googol bang is much larger than a googol plex.

Continuing down this road, how about a googol plex bang, or a googol bang plex? Which of these is larger? Please turn away and think about it for a bit……..Have you got it?  I think like this:  A googol plex bang has the form $(10^g)!$, which is smaller than $(10^g)^{10^g}$, which is equal to $10^{g10^g}$, and this is smaller than $10^{g!}$ when $g$ is large. So a googol plex bang is smaller than a googol bang plex.

There is an entire hierarchy of such numbers, such as googol bang plex bang bang bang and googol bang bang plex plex plex.  Which of these is bigger?  Do you have a general algorithm for determining which of two such expressions is larger?

A googol stack is the number

$$10^{10^{10^{\cdot^{\cdot^{10}}}}}{\large\rbrace} \text{ a stack of height googol},$$

where the height of the exponential stack is googol.  Similarly, we may speak of $x$ stack for any $x$, and form such expressions as:

googol stack bang plex

googol bang plex stack

googol plex stack bang

Which of these is larger?  If you think about it, you will see that the stack operator is far more powerful than bang or plex, and so we want to apply it to the biggest possible argument.  So the second number is largest, and then the third and then the first, simply because the stack operator will overrule the other effects.  Let us call this hierarchy, of all expressions formed by starting with googol and then appending any number of plex, bang or stack operators, as the googol plex bang stack hierarchy.

Question. Is there a feasible computational procedure to determine, of any two expressions in the googol stack bang plex hierarchy, which is larger?

This is an open question in the general case.  I had asked this question at math.stackexchange, Help me put these enormous numbers in order, and the best currently known (due to user Deedlit) is that we have a simple algorithm that works for all expressions of length at most googol-2, and also works for all expressions having the same number of operators.  But this algorithm breaks down for large expressions of much different lengths, since it seems that one must come to know the exact threshold where an enormous iteration of bangs and plexes can ultimately capture a stack.  Of course, in principle one can compute these numbers exactly, given sufficient time and space resources, but the point of the question is to have a feasible algorithm, such as one that takes only polynomial time in the length of the input expressions.  And a feasible such general algorithm is not known.

The fact that we seem to have difficulty comparing the relative sizes of expressions even in the simple googol plex bang stack hierarchy suggests that we may find it difficult to select a winning entry in the largest-number contest.  And indeed, I’ll explain a bit later why there is no computable procedure to determine the winner of arbitrary size largest-number contests.

The numbers we’ve seen so far are actually rather small, so let’s begin to describe some much larger numbers.  The most basic operation on natural numbers is the successor operation $a\mapsto a+1$, and one can view addition as repeated succession:

$$a+b=a+\overbrace{1+1+\cdots+1}^{b\text{ times}}.$$

Similarly, multiplication is repeated addition

$$a\cdot b=\overbrace{a+a+\cdots+a}^{b\text{ times}}$$

and exponentiation is repeated multiplication

$$a^b=\overbrace{a\cdot a\cdots a}^{b\text{ times}}.$$

Similarly, stacks are repeated exponentiation.  To simplify the expressions, Knuth introduced his uparrow notation, writing $a\uparrow b$ for $a^b$, and then repeating it as

$$a\uparrow\uparrow b=\overbrace{a\uparrow a\uparrow\cdots\uparrow a}^{b\text{ times}}= a^{a^{a^{.^{.^a}}}}{\large\rbrace}{ b\text{ times}}.$$

So googol stack is the same as $10\uparrow\uparrow$ googol. But why stop here, we can just as easily define the triple repeated uparrow

$$a\uparrow\uparrow\uparrow b=\overbrace{a\uparrow\uparrow a\uparrow\uparrow\ \cdots\ \uparrow\uparrow a}^{b\text{ times}}.$$

In this notation, $10\uparrow\uparrow\uparrow$ googol means

$$\overbrace{10^{10^{.^{.^{10}}}}{\large\rbrace}10^{10^{.^{.^{10}}}}{\large\rbrace}10^{10^{.^{.^{10}}}}{\large\rbrace}\cdots\cdots{\large\rbrace} 10^{10^{.^{.^{10}}}}{\large\rbrace}10}^{\text{iterated googol times}},$$

where the number of times the stacking process is iterated is googol. This number is a vast number, whose description $10\uparrow\uparrow\uparrow$ googol fits easily on an index card.

But why stop there? We could of course continue with $10\uparrow\uparrow\uparrow\uparrow$ googol and so on, but I’d rather define

$$10\Uparrow x=10\ \overbrace{\uparrow\uparrow\cdots\uparrow}^{x\text{ times}}\ 10.$$

So that $10\Uparrow$ googol is vaster than anything we’ve mentioned above. But similarly, we will have $10\Uparrow\Uparrow$ googol and so on, leading eventually to a triple uparrow
$$10 \uparrow\hskip-.55em\Uparrow\text{ googol } = 10 \overbrace{\Uparrow\Uparrow\cdots\Uparrow}^{\text{googol iterations}} 10,$$
and so on. These ideas lead ultimately to the main line of the Ackerman function.

So we’ve described some truly vast numbers.  But what is the winning entry?  After all, there are only finitely many possible things to write on an index card in accordance with the limitation on the number of symbols.  Many of these possible entries do not actually describe a number–perhaps someone might submit a piece of poetry or confess their secret love–but nevertheless among the entries that do describe numbers, there are only finitely many. Thus, it would seem that there must be a largest one. Thus, since the rules expressly allow descriptions of numbers in English, it would seem that we have found our winning entry:

The largest number describable on an index card according to the rules of this contest

which uses exactly 86 symbols. We seem to have argued that this expression describes a particular number, and accords with the rules, so we conclude that this is indeed provably the winning entry.

But wait a minute! If the expression above describes a number, then it would seem that the following expression also describes a number:

The largest number describable on an index card according to the rules of this contest, plus one

which uses 96 symbols.  And this number would seem to be larger, by exactly one, of the number which we had previously argued must be the largest number that is possible to describe on an index card. Indeed, how are we able to write this description at all? We seem to have described a number, while following certain rules, that is strictly larger than any number which it is possible to describe while following those rules. Contradiction!  This conundrum is known as Berry’s paradox, the paradox of “the smallest positive integer not definable in under eleven words.”

Thus, we can see that this formulation of the largest-number contest is really about the Berry paradox, which I shall leave you to ponder further on your own.

I concluded this part of my talk by discussing the fact that even when one doesn’t allow natural language descriptions, but stays close to a formal language, such as the formal language of arithmetic, then the contest ultimately wades into some difficult waters of mathematical logic.  It is a mathematical theorem that there is no computable procedure that will correctly determine which of two descriptions, expressed in that formal language, describes the larger natural number.  For example, even if we allow descriptions of the form, “the size of the smallest integer solution to this integer polynomial equation $p(x_1,\ldots,x_k)=0$,” where $p$ is provided as a specific integer polynomial in several variables. The reason is that the halting problem of computability theory is undecidable, and it is reducible to the problem of determining whether such a polynomial equation has a solution, and when it does, the size of the smallest solution is related to the halting time of the given instance of the halting problem.  Worse, the independence phenomenon shows that not only is the comparison problem undecidable, but actually there can be specific instances of descriptions of numbers in these formal languages, such that both definitely describe a specific number, but the question of which of the two numbers described is larger is simply independent of our axioms, whatever they may be. That is, the question of whether one description describes a larger number than another or not may simply be neither provable nor refutable in whatever axiomatic system we have chosen to work in, no matter how strong. In such a case, I would find it to be at least a debatable philosophical question whether there is indeed a fact of the matter about whether one of the numbers is larger than the other or not. (See my explanation here). It is my further considered belief that every largest-number contest, taken seriously, unless severely limited in the scope of the descriptions that entries may employ, will founder on precisely this kind of issue.

For further reading on this topic, there are a number of online resources:

Meanwhile, lets return to our actual contest, for which we collected a number of interesting submissions, on which I’ll comment below.  Several students had appreciated the connection between the contest and Berry’s paradox.

Your number is quite large, since we can surely describe an enormous number of digits of $\pi$.  Researchers have announced, for example, that they have now found 10 trillion digits of $\pi$.  If by “describe physically” you meant that we should say the digits aloud, however, then supposing that we speak very quickly and say five digits per second (which is quite a lot if sustained for any length of time), then one person would be able to pronounce a little over 4 million digits this way in 10 days. If all humanity were to speak digits in parallel, then we could rise up to the order of $10^{16}$ digits in ten days.  This is a big number indeed, although not nearly as big as a googol or many of the other numbers we have discussed.

Surely $\infty$ is a large number, but it is not ordinarily considered to be a positive integer or a natural number. And so I think that your submission does not accord with the rules.

This is charming, and I’m sure that your girlfriend will be very happy to see this. Depending on the units we use to measure your affection, I find it very likely that this number is extremely large indeed, and perhaps you win the contest. (But in order to stay within the rules, we should pick the units so that your affection comes out exactly as a positive integer; in particular, if your affection is actually infinite, as you seem to indicate, then I am afraid that you would be disqualified.)  One concern that I have, however, about your delightful entry—please correct me if I am wrong—is this:  did you write at first about your affection for your “girlfriends” (plural), and then cross out the extra “s” in order to refer to only one girlfriend? And when you refer to your “girlfriend(s), which values at least $\omega$,” do you mean to say that your affection for each girlfriend separately has value $\omega$, or are you instead saying that you have affection for $\omega$ many girlfriends? I’m not exactly sure. But either interpretation would surely lead to a large number, and it does seem quite reasonable that one might indeed have more affection, if one had many girlfriends rather than only one! I suppose we could accept this entry as a separate submission about your affection, one value for each girlfriend separately, but perhaps it may be best to combine all your affection together, and consider this as one entry. In any case, I am sure that you have plenty of affection.

You’ve submitted Graham’s number, an extremely large number that is also popularly known as the largest definite number ever to be used in a published mathematical proof. This number is certainly in the upper vicinity of the numbers we discussed above, obtained by iterating the Knuth uparrow. It is much larger than anything we can easily write in the googol plex bang stack hierarchy.  This is essentially in line with iterating the $\Uparrow$ operator $64$ times, using base $3$. This number is extremely large, and I find it to be the largest number submitted by means of a definite description whose numerical value does not depend on which other numbers have been submitted to the contest. Meanwhile, we discussed some even larger numbers above, such as the triple uparrow $10 \uparrow\hskip-.55em\Uparrow$ googol.

Your clever submission shows that you realize that it isn’t necessary to actually mention the largest possible number to describe, as we discussed in Berry’s paradox above, but it suffices rather to exceed only the largest actually occurring numbers that are described by others in the contest.  What are we to do, however, when two people have this same clever idea?  If either of them succeeds in describing a number, then it would seem that they both must succeed, but in this case, they must both have described numbers larger than the other. This is the same contradiction at the heart of Berry’s paradox.  Another risky feature of this entry is that it presumes that some numbers were in fact described by others.  How are we to interpret it if all other submissions are disqualified?

Your description has the very nice feature that it avoids the self-reference needed in the previous description to refer to all the “other” entries (that is, all the entries that are not “this” one).  It avoids the self-reference by speaking of “second-largest” entry, which of course will have to be the largest of the other entries, since this entry is to be one larger.  Although it avoids self-reference, nevertheless this description is what would be called impredicative, because it quantifies over a class of objects (the collected entries) of which it is itself a member.  Once one interprets things as you seem to intend, however, this entry appears to amount to essentially the same description as the previous entry.  Thus, the problematic and paradoxical situation has now actually occurred, making it problematic to conclude that either of these entries has succeeded in describing a definite number. So how are we to resolve it?

Your description is very clever, trying to have it both ways.  Of course, every participant in the contest wants to win, but you also to win with a very low winning number, so that you can maximize the prize payout. With this submission, of course you hope that all other submissions might be disqualified, in which case you will win with the number 1, achieving the million dollar payout!  In the case of our actual contest, however, there are other cards that are qualified, such as the Graham’s number submission, and so your submission seems to be in case 2, making it similar in effect to the previous submission.  Like the previous two entries, this card is problematic when two or more players submit the same description.  And indeed, even with the other descriptions submitted above, we are already placed into the paradoxical situation that if this card describes a number in its case 2, then it will be larger than some numbers that are larger than it.

Your submission is very ambitious, attempting both to win and to win the most money.   (I take the description as the lower part of this card, and the upper note is a remark about it.)  You seem to realize that there will be ties in the game, with multiple people submitting a winning entry.  And indeed, if everyone had submitted this entry, then the winning number would indeed seem to be 1, which is the smallest number that would make them all be one of the members who win the game, so the million dollars would be split amongst that group. Some clumsy person, however, might submit a googol plex bang, however, which would bump up the value of this card to that same value, which although still winning, would win much less money. Your line of reasoning about rationality is very similar to the concept of superrationality defended by Douglas Hofstadter in the context of the prisoner’s dilemma.  Perhaps one might also be tempted to say something like:

the smallest number that makes me win, with the largest possible prize payout to me.

with the idea that you’d get a bigger prize solely for yourself if you avoided a tie situation by adding one more, so as to get the whole prize for yourself. But if this reasoning is correct, then one would seem to land again in the clutches of the Berry paradox.

To my readers:  please make your submissions or comments below!

# More math for six-year-olds: Win at Nim!

The latest installment of math for six-year-olds

Win at Nim!
Fold up the bottom flap to prevent parents from learning the super-secret strategy.

This morning once again I went into my daughter’s first-grade classroom, full of inquisitive six-and-seven-year-old girls, and made a mathematical presentation on the game of Nim.

Win at Nim!

The game of Nim, I explained, begins with one player setting up a number of stacks of blocks,while the opponent chooses whether to go first or second.  Taking turns, each player removes one or more blocks from a stack of their choosing. (It is fine to take a whole stack on your turn.) The player who takes the last block wins.

We demonstrated the game by playing a number of exhibition rounds, and then the girls divided into pairs to play each other and also me.  They were surprised that I was able to win against them every single time.  In explanation, I told them that this was because in the game of Nim, there is a super-secret mathematical strategy!  Did they want to learn?  Yes!  I took as my goal that they would all learn the Nim strategy, so that they could go home and confound their parents by beating them again and again at the game.

Since this was a first-grade class, we concentrated at first on games with stacks of heights 1, 2 and 3 only, a special case of the game which can still challenge adults, but for which six-year-olds can easily learn the winning strategy.

Two balanced stacks

After gaining some familiarity with the game by playing several rounds amongst each other, we gathered again for the secret strategy session. We began by thinking about the case of a game of Nim with only two stacks. They had noticed that sometimes when I played them, I had made copying moves; and indeed I had purposely said, “I copy you,” each time this had occurred.  The copying idea is surely appealing when there are only two stacks.  After some discussion, the girls realized that with just two stacks, if one played so as to equalize them, then one would always be able to copy the opponent’s move.  In particular, this copying strategy would ensure that one had a move to make whenever the opponent did, and so one would win the game.

A balanced position

In short order, the girls also realized that if one had any number of pairs of such balanced stacks—so that every stack had a partner—then the whole position was also winning (for one to give to the other player), since one could copy a move on any stack by making the corresponding move on the partner stack.  Thus, we deduced that if we could match up stacks of equal height in pairs, then we had a winning strategy, the strategy to copy any move on a partner stack.

In particular, this balancing idea provides a complete winning strategy in the case of Nim games for which all stacks have height one or two.  One should play so as to give a balanced position to one’s opponent, namely, a position with an even number of stacks of height one and an even number of stacks of height two.  Any unbalanced position can always be balanced in this way, and any move on a balanced position will unbalance it.

1+2+3 counts as balanced

To handle positions with stacks of height three, the super-secret trick is that one can balance a stack of height three either with another stack of height three, of course, but also with two stacks:  one of height one and one of height two.   Thus, one should regard a stack of height three as consisting of two sub-stacks, one of height one and one of height two, for the purposes of balancing. Thus, the Nim position 1+2+3 counts as balanced, since the 3 counts as 2+1, which balances the other stacks.  The 1+2+3 position has two stacks of height two and two of height one, when one regards the stack of height three as having a substack of height two and a substack of height one.

In this way, one arrives at a complete winning strategy for Nim positions involving stacks of height at most three, and furthermore, this is a strategy that can be mastered by first-graders. The strategy is to strive to balance the position.  If you always give your opponent a balanced position, then  you will win!  Faced with an unbalanced position, you can always find a balancing move, and any move on an balanced position will unbalance it.  If the game is just starting, and you are deciding whether to go first or second, you should determine whether it is balanced yet or not.  If it unbalanced, then you should go first and make the balancing move; if it is already balanced, then you should go second and adopt the copying strategy, in which you re-balance the position with each move.

More advanced players will want to consider Nim positions with taller stacks than three, and we talked about this a little in the classroom.  Some of the girls realized that the copying strategy and the idea of balanced positions still worked with taller stacks.  One can balanced stacks of height four against other stacks of height four, and so one, but the trick for these taller stacks is that one may balance 5 with 4+1; balance 6 with 4+2; and 7 with 4+2+1. Mathematicians will recognize here the powers of two.

To teach the strategy to children, it is a great opportunity to talk about the powers of two. Any child knows how to count 1, 2, 3, 4 and so on, and most can count by twos 2, 4, 6, 8, 10, …; by fives 5, 10, 15, 20, …; by tens, by threes; by sevens; and so on.  , The powers of two are the numbers 1, 2, 4, 8, 16, 32, 64, 128, and so on, doubling each time.  Climbing this exponential growth, children are often amazed at how quickly one reaches very large numbers:

One plus one is two;

two plus two is four;

four plus four is eight;

eight plus eight is sixteen;

sixteen plus sixteen is thirty-two;

thirty-two plus thirty-two is sixty-four;

sixty-four plus sixty-four is one hundred twenty-eight.

For Nim, we don’t in practice need such big powers of two, since one doesn’t usually encounter stacks of height eight or larger, and usually just 1s, 2s and 4s suffice. The relevant fact for us here is that every natural number is uniquely expressible as a sum of distinct powers of two, which of course is just another way of talking about binary representation of a number in base two.  We regard a Nim stack as consisting of its power-of-two substacks.  Thus, a stack of height 3 counts as 2+1; a stack of height 5 counts as 4+1; a stack of height 6 counts as 4+2; and a stack of height 7 counts as 4+2+1.

Ultimately, the winning general strategy for Nim is always to play so as to balance the position, where one regards every stack as being composed of its power-of-two sub-stacks, and a position counts as balanced when these stacks and sub-stacks can be matched up in pairs. This is a winning strategy, since every unbalanced position can be balanced, and any move on a balanced position will unbalance it.  To balance an unbalanced stack, play on any stack containing the largest size unbalanced power of two substack, and reduce it so as to balance the parity of all the stacks.  If one thinks about it, at bottom what we are doing is ensuring that if we represent the stack heights in their binary representation, then we should play so as to ensure that the position has a even number of one digits in each place.

# Math for six-year-olds

Today I went into my daughter’s first-grade classroom, full of six-year-old girls, and gave a presentation about Möbius bands.

We cut strips of paper and at first curled them into simple bands, cylinders, which we proved had two sides by coloring them one color on the outside and another color on the inside.  Next, we cut strips and curled them around, but added a twist, to make a true Möbius band.

A Möbius band

These, of course, have only one side, a fact that the children proved by coloring it one color all the way around. And we observed that a Möbius band has only one edge.

A Möbius-like band, with two twists

We explored what happens with two twists, or more twists, and also what happens when you cut a Möbius band down the center, all the way around.

Möbius band cut down the center

It is very interesting to cut a Möbius band on a line that is one-third of the way in from an edge, all the way around. What happens? Make your prediction before unraveling the pieces–how many pieces will there be? Will they be all the same size? How many twists will they have?

Overall, the whole presentation was a lot of fun. The girls were extremely curious about everything, and experimented with additional twists and additional ways of cutting.  It seemed to be just the right amount of mathematical thinking, cutting and coloring for a first-grade class.  To be sure, without prompting the girls made various Möbius earrings, headbands and bracelets, which I had to admit were fairly cool. One girl asked, “is this really mathematics?”

It seems I may be back in the first-grade classroom this spring, and I have in mind to teach them all how to beat their parents at Nim.