Synthese (2014) 191:629–660 DOI 10.1007/s11229-013-0379-9
Satan, Saint Peter and Saint Petersburg Decision theory and discontinuity at infinity Paul Bartha · John Barker · Alan Hájek
Received: 8 October 2013 / Accepted: 29 November 2013 / Published online: 17 December 2013 © Springer Science+Business Media Dordrecht 2013
Abstract  We examine a distinctive kind of problem for decision theory, involving what we call discontinuity at infinity. Roughly, it arises when an infinite sequence of choices, each apparently sanctioned by plausible principles, converges to a 'limit choice' whose utility is much lower than the limit approached by the utilities of the choices in the sequence. We give examples of this phenomenon, focusing on Arntzenius et al.'s Satan's apple, and give a general characterization of it. In these examples, repeated dominance reasoning (a paradigm of rationality) apparently gives rise to a situation closely analogous to having intransitive preferences (a paradigm of irrationality). Indeed, the agents in these examples are vulnerable to a money pump set-up despite having preferences that exhibit no obvious defect of rationality. We explore several putative solutions to such problems, particularly those that appeal to binding and to deliberative dynamics. We consider the prospects for these solutions, concluding that if they fail, the examples show that money pump arguments are invalid.

Keywords  Infinite decision theory · Money-pump arguments · Decision-theoretic paradoxes
P. Bartha (B), Department of Philosophy, University of British Columbia, 1866 Main Mall, E-370, Vancouver, BC V6T 1Z1, Canada; e-mail: [email protected]
J. Barker, Department of Philosophy, University of Illinois at Springfield, University Hall Building 3033, One University Plaza, Springfield, IL 62703-5407, USA; e-mail: [email protected]
A. Hájek, School of Philosophy, Research School of Social Sciences, Australian National University, Canberra, ACT 0200, Australia; e-mail: [email protected]
1 Introduction

Standard decision theory is finitistic in various ways. The decision problems that it addresses involve finite utilities, a finite number of possible states, and a finite set of (pure) acts from which to choose. To deliberate about a problem involving infinite utilities, infinitely many possible states or infinitely many acts is to enter the realm of infinite decision theory, which gives rise to many (if not infinitely many) headaches. Indeed, except in special circumstances, it is not obvious how to extend the basic principles of finite decision theory to infinite cases. Infinite decision theory is presently best regarded as a class of rules of thumb rather than as a well-developed theoretical framework.

Yet infinite decision theory has attracted growing attention as an arena for probing general principles of decision theory. Like others in the philosophical community, we welcome the headaches produced by venerable decision problems such as Pascal's Wager and the St. Petersburg paradox, as well as by recent puzzles such as those posed by Barrett and Arntzenius (1999), Arntzenius et al. (2004), and Nover and Hájek (2004).

Accordingly, our objective in this paper is to characterize a new type of headache, a phenomenon that we shall call discontinuity at infinity. A discontinuity at infinity occurs when an infinite sequence of choices, each apparently sanctioned by plausible principles, converges (in a sense to be defined precisely) to a 'limit choice' whose utility is distinct from, and typically much lower than, the limit approached by the utilities of the choices in the sequence.1 This type of situation provides an important test case for the extension of finite decision principles to infinite decision theory. In particular, it poses a challenge for the application, in infinite settings, of money pump and Dutch Book arguments, which are commonly used to diagnose irrational preferences or credences.
Our examples also exhibit an interesting type of infinite regress.2 At each stage, the agent has a unique best choice: an act which will produce a better overall outcome than any of its competitors, regardless of how the subsequent choices get made. That is, at each stage n there is a unique A_n such that every sequence of subsequent choices A_{n+1}, A_{n+2}, … can be extended in an optimal way to the sequence A_n, A_{n+1}, A_{n+2}, …. As long as the former represents a good solution, so does the latter. This gives the agent a relative justification for each of her choices A_n. But unfortunately, these choices are collectively very bad. The trouble is that while the sequence of actions A_n, A_{n+1}, … is justified if the subsequence A_{n+1}, A_{n+2}, … is, this sequence of subsequences never terminates in anything that is independently justified. This is analogous to the familiar case of an infinite regress in justification: you have a belief B1 which is justified by a belief B2, which is justified by a belief B3, and so on. Even though each belief in the sequence is relatively justified (i.e., justified by another belief in the sequence), it may be that none is absolutely justified.

1 The expression "discontinuity at infinity" is not common, although Atkinson and Johnson (2010) use it in much the way we intend in a discussion of the physics of large versus actually infinite ensembles. They write: "mathematically there is no paradox: there is simply a difference between the infinite limit and the value at infinity—we might call the phenomenon a discontinuity at infinity." Similar phenomena are identified in Batterman's discussion of asymptotic reasoning, critical points and infinite idealizations; see Batterman (2002, 2005). Norton (2012, p. 212) speaks of "the diverging of limit properties and limit systems."

2 The following observation applies specifically to one of the examples discussed below: Satan's apple.

Our paper proceeds as follows. Section 2 introduces three puzzling cases that will be the focus of our attention.3 Section 3 discusses how principles based on finite decision theory might apply to these cases. Section 4 characterizes discontinuity at infinity and a related phenomenon, an analogue of intransitive preferences that we call 'transfinite intransitivity'. We explain how this phenomenon makes the examples even more challenging, and explore its implications for decision theory. Section 5 examines a number of strategies for solving the puzzles, the two chief ones being the idea that a rational agent will bind herself to a sequence of choices that avoids disaster, and the idea that she will take the probabilities of her future actions into account in deliberating about present choices. Neither strategy is an obvious success, although we shall argue that the second is superior to the first: it offers a plausible and potentially powerful technique for making headway on at least some puzzles within infinite decision theory. If neither of these two strategies works, we offer a negative conclusion: the puzzles constitute important counterexamples to the venerable money pump argument.

3 The examples are not entirely new. Similar examples have been considered in papers such as Pollock (1983), Barrett and Arntzenius (1999), and especially Arntzenius et al. (2004), from which our third example is taken without alteration.

2 Three puzzles

2.1 Puzzle #1: Saint Peter's offer

You are met by Saint Peter as you arrive at the gates of Heaven. Unfortunately, Saint Peter tells you, it has not been settled how long you can stay. In acknowledgment of a few acts of kindness that you once performed, he invites you to write down a positive integer as large as you please, stay in Heaven for that many days, and then return to Hell—as your initial period of thinking and writing is also to take place in Hell. To keep things simple, Saint Peter asks you to use tally notation (1 stroke = 1 day in Heaven). During each 1-second interval, your choice is simply to Write a stroke or Stop writing.

Problem: When should you stop writing? Whenever m > n, the strategy W_n of writing exactly n strokes is inferior to the strategy W_m of writing exactly m strokes. Correspondingly, at each moment n, the choice Stop writing appears inferior to the choice Write a stroke: it shuts the door on all of the superior strategies, whereas going on is compatible with (for example) writing one more stroke and then stopping. Yet the sequence {W_n} 'converges' to a bad limit. If you never stop adding strokes, you will be stuck in Hell writing strokes forever. We can characterize the problem as follows:

N = never stop writing
W_n = write n strokes and then stop
N < W_0 < W_1 < W_2 < · · · → N,
where → denotes 'converges to' and < is your preference relation on strategies.4 There is no optimal strategy, but that is only the beginning of your troubles. You can see that you ought to stop writing at some point. But for each n, it seems that it is rationally impermissible to stop at n, because there is a better choice. Thus, you are disposed to make choices that reject each W_n and lead to N 'in the limit'. Taken one at a time, your choices are compatible with ever-better strategies, but taken together, they implement the worst possible strategy.

2.2 Puzzle #2: Saint Petersburg redux

In the familiar version of the St. Petersburg game, a fair coin is tossed repeatedly. If the first Heads appears on toss n, you win $2^n. The expected value of this game is infinite. Saint Petersburg redux works in the same way, except that just before the action starts, the coin-tosser approaches you. Wonderful news! He is a magician, and he is able to make the coin land however he wishes. Before each toss, you will have the option of paying him a small bribe (50 cents) to guarantee a result of Tails. You may continue the bribes for as long as you please. In each time interval, you face a single choice: Bribe or No bribe. Once you stop bribing, you are done (there are no further bribes).

Problem: When should you stop bribing? If you never stop, you win nothing (and pay an infinite amount in bribes). But the strategy B_n of offering exactly n bribes appears inferior to the strategy B_m, for every m > n. Correspondingly, it seems that at each choice point, No bribe is inferior to Bribe. We can characterize the problem in terms similar to the first puzzle:

N = never stop bribing
B_n = bribe exactly n times (do not bribe at n + 1)
N < B_0 < B_1 < B_2 < · · · → N

As before, there is no optimal strategy, and you are once again set up for an infinite sequence of seemingly rational choices that leads to disaster.
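To see why each B_n seems worse than B_{n+1} even though both strategies have infinite expectation, one can compare the chance that each gives of winning at least $2^k, a comparison the paper takes up below. The following sketch is our own illustration, not from the paper; it computes those chances in closed form and checks the doubling claim in exact arithmetic:

```python
from fractions import Fraction

def p_win_at_least(k: int, n: int) -> Fraction:
    """Under strategy B_n (bribe the first n tosses, then play fair),
    the first Heads lands on toss m > n with probability (1/2)^(m - n),
    paying $2^m.  Return the chance that the prize is at least $2^k."""
    if k <= n + 1:
        return Fraction(1)  # every attainable prize is already >= 2^(n+1)
    return Fraction(1, 2) ** (k - n - 1)  # tail sum over m >= k of (1/2)^(m-n)

# For every prize level strictly above $2^(n+1), bribing once more
# exactly doubles the chance of winning at least that much.
for n in range(6):
    for k in range(n + 2, n + 12):
        assert p_win_at_least(k, n + 1) == 2 * p_win_at_least(k, n)
```

It is this uniform improvement in chances, rather than any difference in expectation, that seems to support the ordering B_0 < B_1 < · · ·.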
Although this puzzle has a structure similar to Saint Peter's offer, there are two important differences. First, in this puzzle, the preference ordering < is based on expected outcomes.5 The choice of B_n could lead to a better outcome than the choice of B_{n+1}, since it is a matter of chance exactly when the first Heads appears. Second, Saint Petersburg redux makes sense either as a supertask or stretched out forever. As a supertask, the tosses could take place at half a minute, three-quarters of a minute, and so forth, so that the entire game is finished within one minute.

4 In this puzzle, the outcome is determined entirely by your (ultimate) choice of strategy. So we identify your preference ordering on strategies with your preference ordering on outcomes.

5 In fact, each B_n has infinite expectation. However, for each prize of $2^k with k > n + 1, the strategy B_{n+1} guarantees twice the chance that B_n does of winning at least that much money. Thus, although both B_n and B_{n+1} have equal infinite expectation, the fact that the latter offers uniformly better chances seems to justify the preference ordering B_0 < B_1 < · · ·. We acknowledge that this preference ordering rests on less secure ground than that in the other two examples; accordingly, in the remainder of our paper, we concentrate on Puzzle #1 and Puzzle #3.

In the stretched-out
version, the game continues for as long as it takes to produce a result of Heads, and potentially forever (if Heads never appears). It makes no difference. By contrast, Saint Peter's offer ceases to be paradoxical if implemented as a supertask: if the task were to be completely carried out in one minute, there would be no puzzle. By writing a stroke in every time interval, you could guarantee an infinite stay in Heaven, the best possible outcome.

2.3 Puzzle #3: Satan's apple [as formulated by Arntzenius et al. (2004)]

Satan cuts an apple into a countable infinity of slices and offers it to Eve, one piece at a time. Each slice has positive utility for Eve. If Eve eats only finitely many pieces, there is no difficulty; she simply enjoys her snack. If she eats infinitely many pieces, however, she is banished from Paradise. To keep things simple, we may assume that the pieces are numbered: in each time interval, the choice is Take piece n or Don't take piece n. Furthermore, Eve can reject piece n, but take later pieces. Taking any countably infinite set leads to the bad outcome (banishment). Finally, regardless of whether or not she is banished, Eve gets to keep (and eat) her pieces of apple. Call this the original version of Satan's apple.

We shall sometimes discuss a simplified version of Satan's apple, different from the original version in two respects. First, Eve is banished only if she takes all the pieces. Second, once Eve refuses a piece, she cannot take any more pieces. These restrictions make Satan's apple a close analogue to the two earlier puzzles.

Problem: When should Eve stop taking pieces? The situation is now familiar; the devil is in the details. If Eve never stops, she is ejected from Paradise. Yet the strategy T_n of taking exactly the first n pieces is inferior to the strategy T_m, for each m > n. Correspondingly, for n ≥ 1, the momentary choice Don't take piece n appears to be inferior to Take piece n, since the latter is part of superior strategies.
Formally, in the original version of the puzzle, we have the following:

C = take a countable infinity of pieces
T_n = take exactly the first n pieces
C < T_0 < T_1 < T_2 < · · · → C

In the simplified version, we have the same structure with C replaced by ALL, where ALL = take every piece.

A few comparisons between Satan's apple and the other puzzles are in order. First, notice that Satan's apple is a kind of dual to Saint Peter's offer: the apple problem has bite only if it is set up as a supertask. If Eve's choices were stretched out forever, there would be no puzzle, since Eve could then take every slice without fear of the consequences. Second, unlike the two earlier puzzles, the relevant utilities in Satan's apple are bounded (above and below). For definiteness, we make the following assumptions about Eve's utilities:

• Eve's overall utility U is the sum of the utility derived from the portion of apple taken and the utility derived from staying in or being banished from Paradise.
• The utility of staying in Paradise is 0; the utility of banishment is −1,000.
• The utility of any portion of apple is the sum of the utilities u_n of the pieces n belonging to that portion. Specifically, u_1 = 5 and u_n = 5/(n − 1) − 5/n for n > 1.

These assumptions imply the following:

• The utility derived from eating the whole apple is 10.
• The utility of the status quo in which Eve takes no pieces, U(T_0), is 0.
• U(T_n) = 10 − 5/n for n > 0.
• U(ALL) = 10 − 1,000 = −990.
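These figures can be checked mechanically: the piece utilities telescope, which is what yields U(T_n) = 10 − 5/n. A quick sketch of ours, using exact fractions and the numbers assumed above:

```python
from fractions import Fraction

def u(n: int) -> Fraction:
    """Utility of apple piece n: u_1 = 5, u_n = 5/(n-1) - 5/n for n > 1."""
    return Fraction(5) if n == 1 else Fraction(5, n - 1) - Fraction(5, n)

def U_T(n: int) -> Fraction:
    """Utility of strategy T_n (take exactly the first n pieces):
    the piece utilities telescope to 10 - 5/n."""
    return sum((u(i) for i in range(1, n + 1)), Fraction(0))

# The telescoping sum approaches 10 but never reaches it.
for n in range(1, 100):
    assert U_T(n) == 10 - Fraction(5, n)

# The whole apple is worth 10; banishment costs 1,000, so
# U(ALL) = 10 - 1000 = -990, far below every U(T_n).
assert all(10 - 1000 < U_T(n) for n in range(1, 100))
```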
Note that there is no worst outcome. The preference ordering can be pictured as a ladder (the higher the rung, the better). There is a top section that proceeds upwards from T_0, getting better as Eve adds pieces. There is a gap between T_0 and ALL, representing the drop in utility from 0 to −990. Finally, there is a bottom section that proceeds downwards from ALL, getting worse as Eve omits pieces (though still taking infinitely many).
To summarize: in each of our puzzles, the agent faces an infinite sequence of choice points with the following structure:

1. At each point in the sequence, the number of options is finite and decision theory offers a clear prescription of what to do.
2. There is a countably infinite set of strategies A_0 < A_1 < A_2 < A_3 < … that is totally ordered (in the agent's preference ordering), but no optimal strategy.
3. There is a 'discontinuity at infinity': the sequence of strategies A_0, A_1, … 'converges' to a limit strategy, A, whose utility is distinct from—in fact, worse than—the limit of the sequence of utilities U(A_0), U(A_1), …. Symbolically:

A_0 < A_1 < A_2 < · · · → A, BUT lim_{n→∞} U(A_n) > U(A).
4. For any finite n, the first n prescribed choices are part of A_n (and of A_{n+1}, A_{n+2}, …), but the strategy comprised of all of the prescribed choices is A.

Although our focus will remain on the sequential (diachronic) version of the three puzzles, each of them also has a non-sequential (synchronic) version. In the non-sequential version, the agent makes a single decision by selecting a strategy, i.e., a complete sequence of choices. Of course, that decision is problematic because there is no optimal strategy.

A crucial point in all three examples is the claim (point 1 of our summary) that decision theory offers a clear prescription at each choice point. There is an inference from the total ordering on strategies to the conclusion that at each choice point, one act (stopping/rejecting) is inferior to the other (going on/accepting). Because so much turns on this inference, we are about to give it careful scrutiny (Sect. 3). First, however, we pause briefly to articulate why philosophers should be concerned about such far-fetched cases, rather than dismissing them as 'don't cares'.

We might ask the same of the St. Petersburg paradox—or for that matter, Twin Earth or the Chinese Room, or the case of being hooked up to a famous violinist for nine months to save his life, or any number of other fanciful philosophical thought experiments. In fact, appealing to far-fetched cases is especially apposite in decision theory, since its own foundations are far-fetched! Our cases are unrealistic in virtue of their involving infinitude; but the same could be said of the countable additivity axiom of probability theory, and various 'richness' preference axioms of decision theory, such as the Jeffrey (1983) 'splitting condition'. Furthermore, one of the main justifications for maximizing expected utility comes from the strong and weak laws of large numbers, which wear their involvement with infinitude on their sleeve.

Another foundational assumption is that Bayesian agents are supposed to be logically omniscient; but don't expect to find any such agents around here! In this regard, decision theory is in good company. Computability theory is founded on the notion of a Turing machine; but don't expect to find any infinite tapes around here! If decision theory were a purely practical enterprise—to provide guidelines for daily life, say—perhaps we could rightly ignore problems generated by discontinuities at infinity. For that matter, if computability were a purely practical enterprise—to provide guidelines for computer technology, say—perhaps we could rightly ignore the halting problem, or uncomputable functions. While we're at it, we could also turn in our philosopher's badges. As philosophers interested in the coherence of our concepts, the commitments of the norms to which we subscribe, and the consequences of our theories, it is part of our job to take seriously foundational theoretical problems—even when they reside at infinity.

3 Infinite decision theory

In this section, we review decision principles for finite decision problems and the plausibility of their extension to infinite decision theory. We then consider how these principles apply to our examples. For the sake of simplicity, we set aside Saint Petersburg redux and focus on the other two puzzles, which are deterministic: the outcomes depend entirely on choices made by the agent. We take for granted the concepts of possible
acts, possible states, decision tables and decision trees.6 We assume that the agent has a complete weak preference ordering ('at least as good as') on outcomes.

3.1 Finite decision problems

3.1.1 Non-sequential decisions: dominance reasoning

Suppose that there are only finitely many (pure) acts available to the agent. In the absence of probabilistic information, dominance reasoning provides the paradigm for making rational decisions. Given a set of possible acts and a set of possible states, act B dominates act A (written A < B)7 if and only if: (i) the outcome of performing B is at least as good as that of performing A in each possible state; and (ii) the outcome of performing B is strictly better in at least one possible state. An act A is dominated if there is some act B such that A < B. An act D is dominant if and only if D dominates every other available act. The following two principles of dominance reasoning apply in finite decision problems:

(REQ. DOMINANT) It is rationally required to choose a dominant act, if there is one.

(IMPERM. DOMINATED) It is rationally impermissible to choose a dominated act.

Notice that (IMPERM. DOMINATED) implies (REQ. DOMINANT); (REQ. DOMINANT) is the weaker principle. There is a familiar but important caveat: the set of possible states must be appropriate in order to apply either dominance principle. Specifically, the possible states must be causally independent of the agent's choice.

In a deterministic decision problem, of course, things are simpler. There is just one possible state (and no issue about independence). We have a trivial dominance ordering on available acts: it coincides with the agent's preference ordering on the outcomes produced by those acts (given the single possible state).

6 See Luce and Raiffa (1957) or Resnik (1987).

7 While we previously used the '<' symbol to denote the operative preference relation on strategies, its usage here is appropriate, since the preference ordering on strategies is also a dominance ordering (given that there is only one state).

3.1.2 Finite sequential decision problems: backwards induction

A finite decision tree consists of a finite set of choice points or choice nodes.8 A strategy or profile is a complete sequence of choices, i.e., a complete branch of the tree.9 Suppose that the agent assigns a utility (or expected utility) to each strategy, based on the outcomes from some or all of the prescribed acts. Strategy B is superior to strategy A (written A < B) if and only if the expected utility of B is greater than the expected utility of A. In non-probabilistic cases, we can also say that B dominates A, since we may regard < as the (trivial) dominance ordering in the non-sequential problem of selecting a strategy.

8 We continue to focus on the deterministic case, so there is no need for chance nodes.

9 The term 'strategy' is sometimes used for a function that maps each choice point in the tree to some act available at that choice point, while 'profile' is reserved for a complete sequence of choices. Here, we use the terms 'strategy' and 'profile' interchangeably, in the latter sense.

In the finite setting, backwards induction provides clear guidance on what to do at each choice point. Working backwards from the ultimate nodes and assuming the agent seeks to maximize utility with each choice, we have the following principles:

(REQ. BELONGS OPTIMAL) It is rationally required to choose an act that belongs to (is part of) an optimal strategy, if one is available (i.e., consistent with an available act).

(IMPERM. BELONGS DOMINATED) It is rationally impermissible to choose an act that belongs only to strategies all of which are dominated by some other available strategy.

The argument for (REQ. BELONGS OPTIMAL) is obvious: to violate it is to ensure the selection of an inferior strategy even though the agent has the power to guarantee the selection of an optimal strategy. (IMPERM. BELONGS DOMINATED) is supported by similar reasoning: if act b is part of an available strategy that dominates all of the strategies to which a belongs, then to choose a is to ensure the selection of a strategy that is worse than one that the agent has the power to secure by choosing b.

In a finite sequential decision problem there is always an optimal strategy, so (REQ. BELONGS OPTIMAL) and (IMPERM. BELONGS DOMINATED) yield the same verdict as to which acts are rational. (IMPERM. BELONGS DOMINATED) is, however, more generally applicable: even if the agent somehow makes a wrong choice that rules out optimal strategies, she can still be guided by (IMPERM. BELONGS DOMINATED). (IMPERM. BELONGS DOMINATED) ensures that each of the agent's decisions is locally optimal.
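In the finite deterministic case, the backwards-induction reasoning behind these principles can be sketched directly. The toy tree below is our own invention (the paper fixes no particular utilities); internal nodes map act labels to subtrees, and leaves carry the utility of the completed strategy:

```python
# A minimal backwards-induction sketch for a finite, deterministic
# decision tree.  Internal nodes are dicts from act labels to subtrees;
# a leaf is the utility of the completed strategy.

def backward_induction(node):
    """Return (optimal utility, optimal act sequence) for a subtree,
    assuming the agent maximizes utility at every choice node."""
    if not isinstance(node, dict):  # leaf: the strategy is complete
        return node, []
    best_act, (best_u, best_rest) = max(
        ((act, backward_induction(sub)) for act, sub in node.items()),
        key=lambda pair: pair[1][0],
    )
    return best_u, [best_act] + best_rest

# A three-stage toy tree: stopping early forgoes the better branch.
tree = {"stop": 3, "go": {"stop": 5, "go": {"stop": 8, "go": 1}}}
assert backward_induction(tree) == (8, ["go", "go", "stop"])
```

In this finite setting an optimal strategy always exists, so the two principles agree; the puzzles below arise precisely because the infinite analogue of this recursion never bottoms out.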
3.2 Infinite decision problems

3.2.1 Infinite non-sequential decision problems

Consider a non-sequential version of Saint Peter's offer: pick a finite number and you can stay in Heaven for that many days. The default is 0 (if you fail to designate a finite number). Here, the possible acts H_0, H_1, H_2, … are the designation of 0 days in Heaven, 1 day, 2 days, and so forth. In this version, there is no problematic sequence of decisions and no discontinuity at infinity. But we still have a dominance ordering with no optimal act:

H_0 < H_1 < H_2 < · · ·

Clearly, (REQ. DOMINANT) and (IMPERM. DOMINATED) come apart. (REQ. DOMINANT) is vacuously satisfied no matter what you do; (IMPERM. DOMINATED) is violated no matter what you do. For infinite decision problems, we should retain (REQ. DOMINANT) and drop (IMPERM. DOMINATED). Faced with a forced choice among acts every one of which is dominated, you should not be convicted of irrationality no matter what you do. The following principle, intermediate between (REQ. DOMINANT) and (IMPERM. DOMINATED), seems plausible:

(REQ. UNDOMINATED) It is rationally required to choose an undominated act, if there is one.

(REQ. UNDOMINATED) may be helpful in some decision problems, but unfortunately it provides no guidance for Saint Peter's offer or our other puzzles, where there is no undominated profile.

Note that giving up (IMPERM. DOMINATED) need not imply that decision theory must be silent, or that every choice of profile is permissible in Saint Peter's offer. One might still maintain that it is irrational to pick 0 days in Heaven. And in Satan's apple, one might maintain that it is irrational for Eve to take all the pieces. These verdicts appear ad hoc, however, unless they can be backed up by some plausible substitute for (IMPERM. DOMINATED). For instance, we might defend some form of satisficing (Slote 1985, 1989; Meacham 2010). As our focus is on sequential decision problems, however, we pass on.

3.2.2 Infinite sequential decision problems

Consider a decision tree consisting of an infinite set of choice points, but crucially with a finite number of acts available at each choice point. Assume that the agent can assign a value (a utility or expected utility) to each strategy, and hence has an ordering on the available strategies. In contrast to what we saw for non-sequential decision problems, it is plausible to extend both of our decision principles from the finite to the infinite case. The justification for (REQ. BELONGS OPTIMAL) is unproblematic: we want to make individual choices consistent with an optimal strategy, if there is one. What about (IMPERM. BELONGS DOMINATED)? Here is a re-statement:

(IMPERM. BELONGS DOMINATED) It is rationally impermissible to choose act a at a choice point if there is some other act b, available at that choice point, that is part of a better strategy than any of the strategies containing a.

(IMPERM. BELONGS DOMINATED) defines a partial ordering on the acts available at a choice point. Given that there are only finitely many such acts, there will be maximal elements with respect to this partial ordering. Just as in the finite case, the appeal of (IMPERM. BELONGS DOMINATED) is that it keeps our better strategic options open, even when there is no best option. In deterministic cases, (IMPERM. BELONGS DOMINATED) keeps the better options open in a very strong sense: they can be guaranteed by later choices of the agent alone, with no help from other agents or chance occurrences. Although (IMPERM. BELONGS DOMINATED) seems unimpeachable, later in the paper we raise the question of whether there are grounds for rejecting it.
3.3 Application to the puzzles

Our puzzles involve decision trees consisting of an infinite set of choice points, with a binary decision at each choice point. In each case, the agent has no difficulty assigning a value to each strategy and providing an ordering on the available strategies. We first consider Saint Peter's offer and then the two versions of Satan's apple.

In Saint Peter's offer, (IMPERM. BELONGS DOMINATED) leads directly to the conclusion that at each stage, you should write another stroke. Suppose you are at step n. Stop writing is part of only one strategy: write exactly n − 1 strokes. Write a stroke is part of many strategies that are strictly better: write exactly n strokes, write exactly n + 1 strokes, etc. In light of (IMPERM. BELONGS DOMINATED), Write a stroke is rationally mandated.

In the simplified version of Satan's apple, the reasoning is similar. To reject piece n is to embrace the strategy of taking exactly n − 1 pieces. To accept piece n is compatible with many strictly better strategies, namely, those that accept exactly m pieces, where m ≥ n. (IMPERM. BELONGS DOMINATED) thus prescribes that Eve should accept piece n.

In the original version of Satan's apple, Eve has to make a separate choice for each piece, independently of the others. This suggests an appealingly simple dominance argument for taking each piece: Take piece n dominates Reject piece n relative to any fixed set of choices for all pieces other than piece n. The fixed sets of past and future choices for other pieces constitute the possible states for Eve's present choice. A good way to appreciate this argument (suggested by Arntzenius et al. 2004) is to consider the following partition of possible states:

{Eve takes finitely many slices besides n; Eve takes a countable infinity of slices besides n}

In either element of the partition, Eve does better by taking slice n than by rejecting it.
This dominance argument is acceptable only if there is no violation of the familiar caveat for dominance reasoning: what Eve does with the other pieces must be independent of what Eve does with piece n. We have to accept that, in evaluating her current choice, Eve can legitimately view propositions whose truth or falsity she can guarantee by her later choices as independent states of the world. Of course, this assumption can be questioned.

Happily, for anyone with misgivings about the dominance argument, there is also a direct justification for Take piece n that uses only (IMPERM. BELONGS DOMINATED). The argument relies upon Eve's particular utility function (see Sect. 2), but it will be immediately clear that a parallel argument exists regardless of the mathematical details, so long as her utilities are bounded.

Suppose Eve contemplates the act Reject piece 1. Consideration of Eve's utility function shows that all strategies consistent with rejecting piece 1 have utility strictly less than 5. Hence, all are dominated by the strategy Accept exactly piece 1, which has utility 5. By (IMPERM. BELONGS DOMINATED), Eve is rationally required to accept piece 1. Similarly, suppose Eve contemplates the act Reject piece n, where n > 1.
From Eve's utility function once again, all strategies consistent with rejecting piece n have utility strictly less than 10 − [5/(n − 1) − 5/n]. But Eve can come as close to 10 as she pleases, and thus closer to 10 than this, by taking some sufficiently large finite set of pieces. So all of the strategies that reject piece n are dominated by some strategy that includes taking piece n.10 By (IMPERM. BELONGS DOMINATED), Eve is rationally required to accept piece n. Furthermore, this argument succeeds for any utility function that assigns positive utility u_i to piece i, so long as u_i is finite.

To summarize: each of our puzzles has a sequential version and a non-sequential version. We have offered no guidance on solving the non-sequential versions. For the sequential versions, however, we have a very plausible requirement of rationality, (IMPERM. BELONGS DOMINATED), that requires the agent to 'go on' at each stage.

Where does this leave us? If the above arguments are correct, decision theory puts us in a very strange place. The analysis of the sequential versions shows that there is only one rational course of action in each puzzle, and in each case that course leads to a sequence of choices that corresponds to the very worst strategy!11 We have avoided outright inconsistency so far (phew!), because we have given up (IMPERM. DOMINATED). Perhaps choosing any profile, even the worst one, is rationally permissible. We know that disaster is sometimes the price of rationality.12 But this is surely cold comfort. And things are about to get worse.
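As an aside, the threshold claimed in footnote 10 is easy to verify. Since u_n = 5/(n − 1) − 5/n = 5/(n(n − 1)), the best a piece-n-rejecting strategy can approach is 10 − 5/(n(n − 1)), and U(T_N) = 10 − 5/N exceeds that bound exactly when N > n(n − 1). A sketch of ours, in exact arithmetic:

```python
from fractions import Fraction

def u(n: int) -> Fraction:
    """Piece utilities from Sect. 2.3: u_1 = 5, u_n = 5/(n-1) - 5/n."""
    return Fraction(5) if n == 1 else Fraction(5, n - 1) - Fraction(5, n)

for n in range(2, 30):
    # No strategy that rejects piece n can reach this bound:
    bound = 10 - u(n)                 # = 10 - 5/(n(n-1))
    threshold = n * (n - 1)
    # T_N with N = n(n-1) + 1 already beats the bound ...
    assert 10 - Fraction(5, threshold + 1) > bound
    # ... while N = n(n-1) lands exactly on it, not above it.
    assert 10 - Fraction(5, threshold) == bound
```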
4 Discontinuity and (transfinite) intransitivity

Roughly speaking, here is why things get worse. In Satan's apple, at each choice point, Eve is (in principle) disposed to pay a small amount of money for the next slice of apple. Suppose that she does. When the supertask is completed and Eve finds herself expelled from Paradise, she is disposed to pay a further sum of money in order to be restored to the situation in which she started: back in Paradise, with no pieces of apple. Worse yet, Satan might start the whole process going again. In short, Eve's situation is disquietingly similar to that of an agent with intransitive preferences: she can be turned into a money pump. Just as in the case of intransitive preferences, the money pump argument suggests, in a vivid manner, that there is something defective about Eve's preferences. If the uneasy compromise reached at the end of Sect. 3 was disturbing, the situation is worse now because the conclusion appears to be that Eve is guilty of irrationality.

In this section, we define three notions that allow us to make the above argument precise: convergence of a sequence of profiles, discontinuity at infinity, and transfinite intransitivity. We then re-examine whether the money pump argument succeeds, i.e., whether Eve's preferences are indeed irrational.
10 In fact, all of them are dominated by any strategy Accept exactly the first N pieces, provided N > n(n − 1).
11 Slight qualification: in the original version of Satan's apple, taking all the pieces is a terrible outcome, but Eve can do slightly worse by omitting some pieces while still taking infinitely many.
12 This appears to be the position of Arntzenius et al. (2004) for agents incapable of binding (see Sect. 5.1); the predicament of such agents is analogous to that of two-boxers in the Newcomb problem.
For brevity's sake, we focus exclusively on the simplified version of Satan's apple. In this version, Eve has only the following available profiles:

Tn ≡ Take exactly the first n pieces.
ALL ≡ Take every piece.

Recall that the utility of eating the whole apple is 10, U(Tn) = 10 − 5/n, and U(ALL) = −990.

We begin by defining the relevant notion of convergence. Clearly, we should have Tn → ALL. In order to characterize what it means for a sequence An of profiles to converge to a limit profile A, we start with a convenient representation. A profile A is a sequence <a1, a2, . . .>, where each ai represents an action. In a simple case such as Satan's apple, we let ai = 1 if Eve takes piece i and ai = 0 otherwise. Each profile is then a function from N to {0, 1}. Tn becomes the function Tn(i) = 1 if i = 1, . . ., n, and 0 otherwise, and ALL becomes the function ALL(i) = 1 for all i. The appropriate notion of convergence of Tn to ALL (written Tn → ALL) is pointwise convergence of the sequence of functions {Tn} to the function ALL. The sequence T1, T2, . . . converges pointwise to ALL iff for each positive integer i, the sequence of numbers {Tn(i)} converges to ALL(i), i.e., Tn(i) → 1. This condition for pointwise convergence is trivially satisfied, since Tn(i) = 1 for all n ≥ i. More generally (and more formally):

Definition 4.1 Convergence of profiles. Let X1, X2, . . . be spaces of actions equipped with a topology, and suppose that An = <an1, an2, . . .> and A = <a1, a2, . . .> are profiles with ani, ai ∈ Xi. Then An converges to A, written An → A, iff for each i, the sequence {ani} consisting of the ith members of the profiles An converges to ai. Convergence of profiles is just pointwise convergence of functions. Formally, this is simply convergence in the space X of profiles, which is just the product space of the Xi equipped with the product topology.
In the case of Satan's apple, each Xi = {0, 1} and the relevant topology is the discrete topology. With the definition of convergence in hand, it is easy to define continuity.

Definition 4.2 Continuity (and discontinuity) of preferences. Suppose {An} is a sequence of profiles and An → A. Suppose that the agent's preferences over profiles are represented by a real-valued utility function, U. Then the agent's preferences are continuous at A if

lim_{n→∞} U(An) = U(A)
and discontinuous at A if13

lim_{n→∞} U(An) ≠ U(A) (or the limit does not exist).

Formally, this is just the definition of continuity applied to the function U : X → R; discontinuity at A is discontinuity at a point in the space X. In the case of Satan's apple, we have Tn → ALL, lim_{n→∞} U(Tn) = 10, and U(ALL) = −990. So Eve's preferences are discontinuous at ALL.

Our definition of discontinuity at a profile applies to the non-sequential decision problem in which Eve chooses between profiles. We use the phrase discontinuity at infinity for the companion situation in the sequential version of the problem, where a sequence of choices, whose finite subsets belong to ever-better profiles, leads to a limit profile that is a point of discontinuity for the utility function.

Recall that a preference ordering ≺ is transitive if for all outcomes A, B and C, if A ≺ B and B ≺ C, then A ≺ C. It is intransitive if for some outcomes A, B and C, A ≺ B and B ≺ C, but not A ≺ C. The analogous notion for infinite decision problems is stated for preferences among acts (or profiles) rather than outcomes.

Definition 4.3 Transfinite (in)transitivity. An agent's preference ordering < on acts is transfinitely transitive if whenever (1) An → A and (2) A1 < A2 < A3 < . . ., then An < A for all n. The ordering < is transfinitely intransitive if for some A1, A2, . . . and A, (1) and (2) hold but A < Ak for some k.

Theorem 4.4 If the ordering < generated by utility function U is transfinitely intransitive, then U is discontinuous. Specifically, if An → A and A1 < A2 < A3 < . . ., but A < Ak for some k, then U is discontinuous at A.

Proof Clearly, U(A) < U(Ak). If U were continuous at A, then U(Ak) ≤ lim_{n→∞} U(An) = U(A), a contradiction.
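A small numeric sketch (our own; the finite 'horizon' parameter is a simulation artifact, not part of the construction) illustrates the discontinuity: the profiles Tn converge pointwise to ALL, yet U(Tn) → 10 while U(ALL) = −990:

```python
def T(n):
    """The profile T_n as a function from piece index i (1-based) to {0, 1}."""
    return lambda i: 1 if i <= n else 0

ALL = lambda i: 1

def U(profile, horizon=10_000):
    """Utility: 10 - 5/n for T_n; -990 when every inspected piece is taken."""
    taken = sum(1 for i in range(1, horizon + 1) if profile(i) == 1)
    if taken == horizon:           # indistinguishable from ALL up to the horizon
        return -990
    return 10 - 5 / taken

# Pointwise convergence: for each fixed i, T_n(i) equals ALL(i) once n >= i.
for i in range(1, 200):
    assert all(T(n)(i) == ALL(i) for n in range(i, i + 50))

# Discontinuity: U(T_n) -> 10 as n grows, yet U(ALL) = -990.
assert abs(U(T(1000)) - 10) < 0.01
assert U(ALL) == -990
```

The two final assertions are exactly the two halves of the definition: the profiles converge, but the utilities along the sequence do not converge to the utility of the limit.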
The converse of Theorem 4.4 is false, since discontinuity at A requires that lim_{n→∞} U(An) ≠ U(A) (if the limit exists), but not that U(A) < U(Ak) for any k. Consider a version of Satan's apple with a bonus (instead of a penalty) for taking all the pieces. Eve's preferences in this version would be discontinuous but not transfinitely intransitive. Of course, in this case, Eve would not face any paradoxical decision; the rationality of every sequential choice Take piece n would align with the rationality of the limit profile, ALL. For a partial converse to Theorem 4.4, see Appendix 1.

13 Continuity of preferences can also be defined directly from the preference ordering, without invoking a utility function (see Appendix 1).
4.1 Application to the puzzles

In all three puzzles, the agents have transfinitely intransitive (and hence discontinuous) preferences on the relevant set of profiles. As we have seen in the case of Satan's apple, this condition leads to the paradoxical situation and the vulnerability of the agent to a money pump set-up. Suppose we agree that it is rational for Eve to pay some small price for each piece:

(∀RAT) For all n, it is rational to pay the price for piece n.

It is tempting to infer from this that it is rational for Eve to pay the prices for all of the pieces:

(RAT∀) It is rational to pay the price for all of the pieces.

This is a paradoxical conclusion, since (given Eve's preferences) it implies that it is rational for Eve to become a money pump.

We might try to avoid the paradoxical conclusion in the following way. Our reasoning assumes that rational decisions agglomerate. Perhaps the lesson of the puzzles is simply that rational decisions may fail to agglomerate: the choices a1, a2, . . . may be individually rational (as we saw in Sect. 3), but the profile A = <a1, a2, . . .> consisting of all of these actions is not rational (as indicated by the money pump argument). Thus, it may be that each choice an = Take piece n is rational for Eve, but taking all the pieces is not rational.14 In other words, (∀RAT) may be true while (RAT∀) is false.

While the diagnosis of a 'failure of agglomeration' in Satan's apple seems correct, there are difficulties with the view that denial of (RAT∀) blocks the inference to the paradoxical conclusion. First, the diagnosis itself is debatable. For instance, Arntzenius et al. (2004) maintain that if Eve is incapable of binding (see Sect. 5.1), then she is rationally required to take all the pieces, in which case (RAT∀) is true. Indeed, as we saw in Sect. 3, the presupposition that taking all the pieces is irrational appears ad hoc, unless we can support it with a plausible decision principle.
More significantly, in the sequential version of Satan's apple, where Eve never needs to make a once-and-for-all choice of strategy, the paradoxical conclusion does not depend upon (RAT∀)! The existence of a money pump for Eve follows directly from (∀RAT) together with the discontinuity at infinity. Hence, the truth of (∀RAT) is all we need to cast doubt on the money pump argument. That argument proceeds from the premise

An agent is vulnerable to a money pump in virtue of her set of preferences

to the conclusion

That set of preferences is irrational.

Eve is vulnerable to a money pump in virtue of her set of preferences. But if (∀RAT) is true, her set of preferences is not irrational. The money pump argument proves too much!

This observation in turn has an important implication for decision theory: if Satan's apple is a counterexample to the money pump argument, then the money pump argument is unsound. Or, more charitably, the money pump argument needs to be carefully circumscribed so that it does not apply to cases like Satan's apple.

Finally: whether or not the diagnosis of a failure of agglomeration is correct, the problem is selecting the right treatment. What should Eve do? What should decision theorists say about money pump arguments? One possible prescription is to play it safe and stick to finite decision theory. This seems too restrictive. Theoretically important arguments in decision theory, such as the Dutch Book argument for countable additivity, make use of infinite sets of bets with no obvious problem. A better approach is to identify conditions under which money pump and Dutch Book arguments are legitimate. We suggest:

Money pump arguments are legitimate in decision problems where the agent's preferences are continuous (in the sense of Definition 4.2).

Problematic cases invariably involve points of discontinuity. In answer to the question about Eve's situation, or any decision problem that involves discontinuity, we suggest:

Find a plausible strategy for eliminating the discontinuity at infinity.

In the case of Satan's apple, this means denying (∀RAT). This would eliminate both the money pump and the discontinuity at infinity. The key question is whether this can be done without distorting the original decision problem to the point where the solution looks like cheating. The next section explores how this might be done for Eve, in the case of Satan's apple. In the conclusion, we return to the question of what an agent should do when there is no way to eliminate discontinuity.

14 This point is discussed in Arntzenius et al. (2004) and in Hájek (2008).

5 Possible solutions

In this section, we consider a number of ways in which Eve might try to 'solve' the problem of Satan's apple and avoid the threat of being exploited as a money pump. All four approaches have in common the denial of (∀RAT). All imply that there is some n such that Eve may rationally prefer Reject piece n over Take piece n.
All must interpret the arguments for the rationality of Take piece n (see Sect. 3) as establishing only a prima facie case that is somehow overturned by taking a wider perspective on Eve's choices. Thus, another thing that the approaches have in common is that all of them eliminate the problem by bringing to bear considerations not obvious in the original formulation of the puzzle.

The first proposed solution (Sect. 5.1) involves binding. An agent binds herself if she commits herself 'irrevocably' to some strategy in advance. Perhaps Eve is rationally justified in stopping if she can bind herself to some high-level plan that specifies a stopping point. There is no money pump because Eve is not (even in principle) willing to pay anything to receive additional pieces beyond the stopping point. Section 5.2 briefly explores a 'mixed-strategy' approach for selecting the stopping point. Sections 5.3 and 5.4 introduce forms of deliberative dynamics. Suppose Eve can coherently assign credences to her future choices and make her decision at each stage by maximizing expected utility. If there is a stage at which utility is maximized
by rejecting a piece of apple, then the argument of Sect. 3 for taking each piece is defeated.

5.1 Binding

The binding approach starts with the conviction that, fundamentally, the difficulty in Satan's apple and the other puzzles is a conflict between decision theory applied globally and locally. The conflict emerges when we contrast the sequential and non-sequential versions of the puzzles. At the sequential (local) level, at each choice point, as we have seen, there is a clear decision-theoretic argument for going on. In the non-sequential version, despite the lack of any clear guidance from decision theory, you know that you should eventually stop writing strokes, that you should not pay infinitely many bribes, and that Eve shouldn't take countably many slices of apple. Although there is no optimal global profile, some profiles are clearly irrational, even though they are constituted solely by locally rational actions. This contrast suggests the following two-step approach to avoiding the bad outcomes:

(1) Solve the non-sequential version of the problem by selecting—never mind how—a profile that specifies a stopping point (i.e., that settles for finitely many days in Heaven/bribes/slices of apple).
(2) Solve the sequential version by binding yourself to the profile selected at step (1). This means that you commit yourself to the plan and stick to it when you come to the stopping point.

Not all agents are capable of binding, but the point at issue is whether agents capable of binding act rationally by choosing a stopping point in advance and sticking to it. And there is a plausible argument that they do. As regards step (1), we have already gone halfway (in Sect. 3) by dropping the principle (IMPERM. DOMINATED) that it is rationally impermissible to choose a dominated act. The problem remains: what should we substitute in its place?
Let us set this problem aside, however, and grant that any reasonable decision principle would rule out writing strokes forever, or paying infinitely many bribes, or taking all the slices. For step (2), which returns us to the sequential version of the problem, proponents of binding restrict their attention to agents capable of binding. For such agents, the dominance-based reasoning encapsulated in principle (IMPERM. BELONGS DOMINATED) (see Sect. 3) ceases to be a requirement of rationality. When the agent reaches her stopping point, she rationally chooses an act (stopping) that would be impermissible under (IMPERM. BELONGS DOMINATED). Presumably, Arntzenius et al. will reject their own dominance argument for taking each piece in the case of agents who can bind. Recall that their dominance argument presupposes that what Eve does with the other pieces is independent of what Eve does with piece n. They will reject this presupposition for agents who can bind, and this (it seems) allows them to reject (∀RAT). But they will uphold the presupposition and (∀RAT) for agents incapable of binding. Indeed, the story is different for agents incapable of binding. For these agents, dominance reasoning remains a norm of rationality. Even if they find a way to select a stopping point at step (1), they are unable rationally to stick to that stopping point at
step (2). Arntzenius et al. (2004) think that this is no problem for decision theory. In their view, the fact that such agents will be led to ruin through a series of individually rational choices just reinforces a lesson that causal decision theorists have learned from Newcomb cases: sometimes rationality is punished.15

Binding will seem an attractive solution to many people, but we offer three reasons to think that it fails.16 First, there is the problem identified at step (1): where to bind? What stopping point? This problem is acknowledged by some notable advocates of binding, but not solved. Here is what Arntzenius et al. say about Eve's predicament in Satan's apple:

Suppose that for each offer, "taking" dominates "rejecting": each offer is such that—no matter what other offers she takes or rejects—she does better to take it than to reject it. We saw that it does not follow that she should accept all of the offers. Instead she should select a combination of offers that produces an outcome that she desires. (269)

But which combination and which outcome? Eve desires each slice of the apple. Presumably, then, for each finite number n, taking exactly n slices produces "an outcome that she desires." Is there some principle that licenses as rational the selection of some profile with a stopping point, while barring as irrational the profile with no stopping point?

Second, binding solves too much. Suppose that you have finitely intransitive preferences: for some A, B, and C, you prefer A to B, B to C and C to A. A money pump threatens. But if you are able to bind yourself, you have nothing to fear! Simply commit in advance to turning down an opportunity to 'upgrade' from one option to another one that you prefer, tempting though it will be. For example, you may bind yourself to turn down an offer to trade A for C, however cheaply, should that offer ever arise. The would-be money pumper has no hold on you; the pump runs dry at exactly that point!
Or consider the objection to the Dutch Book argument that the probabilistically incoherent agent 'will see the Dutch Bookie coming', and will refuse to take all the bets that guarantee her loss. If binding is fair game, this objection would seem to have real force. And yet we believe it doesn't; nor does binding save the day for intransitive preferences. The real problem lies with the probabilistic incoherence itself, or the intransitive preferences themselves. The Dutch Book, or the money pump, merely dramatizes these underlying problems and is symptomatic of them. And so it is with Eve. The paradoxical nature of Satan's apple is all about Eve's preferences. Suppose that for some reason she can act against her preferences. Still, the problem with those preferences remains. Binding does not solve that problem. Eve really is, as it were, in a bind.

15 Whatever norm is used to find the profile Take all the pieces rationally impermissible must be restricted to agents capable of binding.
16 Meacham (2010) offers a sustained critique of the binding approach; here, we focus on objections distinct from his.

If, however, you insist that in virtue of her act of binding, Eve acts in accordance with her preferences, then this just exposes another problem with the binding solution, our third reason for thinking that it fails. So understood, it involves Eve's changing her preferences, and it is thus no solution to the original problem in which her preferences were stipulated to be a particular way. Suppose, for example, that Eve binds herself to take 17 slices and no more. Yet supposedly she genuinely prefers 18 slices—this follows from the statement of her decision problem. Why, then, doesn't she take the 18th slice when she is offered it? Why doesn't she do what she prefers? Assuming that she still has free will, and that she is not weak-willed, the only way this makes sense is for her preferences to have changed in virtue of her act of binding herself. But that is not playing fair—it is to change the original decision problem. Compare: it would be no solution to that problem to be told that Eve takes a pill so that she gets sick of apple slices after taking 17 of them, and so she should refrain from taking the 18th.

We may pose this as a dilemma: Either Eve's preferences change in virtue of the act of binding, or they do not. If they do change, then we have not solved the original problem, but changed it to an easier one in which she has an unproblematic preference structure. If they do not change, then in stopping at a binding point she acts against her preferences. We may ask why this is: does she suddenly lose her free will, or does she suddenly become akratic? In any case, she is surely no longer rational, and as such she is unfit to teach us lessons about what rationality demands or permits.

There is another way to view the dilemma for the binding solution. Let's return to our earlier observation that Arntzenius et al. rightly reject their own dominance argument for taking piece n in the case of agents who can bind. This seems to give them room to reject (∀RAT). But Sect. 3.3 presented a second and more direct argument that Eve is rationally required to take piece n, and that argument is not dismissed as easily as the dominance argument. It depends upon just two assumptions: Eve's stipulated preferences (specifically, her preference for taking exactly n + 1 pieces rather than taking exactly n pieces) and (IMPERM. BELONGS DOMINATED).
It follows that in order to reject the conclusion that Eve is rationally required to take piece n, proponents of binding must either alter Eve's preference ordering over strategies or reject (IMPERM. BELONGS DOMINATED). And this generates a dilemma similar to the one posed in the preceding paragraph. To alter Eve's preferences changes the decision problem, while to reject (IMPERM. BELONGS DOMINATED) abandons a principle of rationality without any plausible rationale.

5.2 Using a mixed strategy

Here is a fanciful proposal to help with the first of the objections directed at the binding approach: where to bind? Let X be a uniform, normalized distribution over the natural numbers, assigning equal infinitesimal probability to each positive integer. Then P(X > n) ≈ 1 for all n. Specifically, if η is a hyperreal infinitesimal, we define a non-standard probability distribution on finite and co-finite sets of natural numbers in the following way (where Ē denotes the complement of E):17

P(X ∈ E) = η|E| if E is finite, and P(X ∈ E) = 1 − η|Ē| if E is co-finite.

17 This is one simple way to achieve a distribution with the desired properties; note in particular that P(X ∈ N) = 1.
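The finite/co-finite rule can be checked symbolically. In this sketch (our own stand-in, not a genuine hyperreal construction), a pair (a, b) encodes the hyperreal a + bη; the rule then comes out finitely additive, with P(X ∈ N) = 1 exactly and P(X > n) = 1 − nη for every n:

```python
def P_finite(size):
    """P(X in E) for a finite E with |E| = size: the hyperreal 0 + size*eta."""
    return (0, size)

def P_cofinite(comp_size):
    """P(X in E) for a co-finite E whose complement has comp_size elements:
    the hyperreal 1 - comp_size*eta."""
    return (1, -comp_size)

def add(p, q):
    """Addition of hyperreals represented as (standard part, eta-coefficient)."""
    return (p[0] + q[0], p[1] + q[1])

# N itself is co-finite with empty complement, so P(X in N) = 1 exactly.
assert P_cofinite(0) == (1, 0)

# Additivity across a finite set and its co-finite complement.
E_size = 3   # e.g. E = {1, 2, 3}
assert add(P_finite(E_size), P_cofinite(E_size)) == (1, 0)

# P(X > n) = 1 - n*eta: infinitesimally close to 1 for every n.
for n in range(1, 100):
    assert P_cofinite(n) == (1, -n)
```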
The idea is that Eve should use the uniform distribution X as a mixed strategy to select her stopping point. She avoids the disastrous outcome of taking all the pieces. She avoids the charge of arbitrariness that dogs the agent who wants to bind herself but has no way to select a stopping point 'at random'. Furthermore, for any N, the probability of choosing an integer larger than N is infinitesimally close to 1, so Eve can be confident of obtaining a very large number of pieces if she proceeds in this way.18

There is an obvious objection: we have no randomizing device that implements such a distribution. But even if we set aside this difficulty (after all, we are talking about rather fanciful scenarios), there are two philosophical obstacles that make this proposal as problematic as binding. The first is that to introduce a mixed strategy changes the problem. Indeed, it is arguable that if the mixed strategy is used to select a stopping point, Eve needs to bind herself to this far-off stopping point. Hence, all of the objections that apply to binding apply to this proposal. The second obstacle is equally devastating: which non-standard distribution X should we employ? Why not pick a uniform distribution over the natural numbers minus {1, …, N} for large N? This is the very problem of arbitrariness that this proposal was meant to address, showing up at a higher level.

5.3 Evidential decision theory

In contrast to the binding approach, there is a very different way for Eve to apply the brakes at some step in the infinite series of choices. On the evidential decision theory approach, Eve views her past choices as evidence about her future choices. She should stop going on when the 'news value' of choosing the next piece is sufficiently bad. Her credence for ALL (Eve takes all the pieces) rises with each slice she takes. Her computation of the expected utility of taking the next slice should take this evidence into account.
If the conditional credence in ALL converges to 1, then Eve will come to a piece n such that the evidential expected utility of Take piece n is less than that of Reject piece n. This leads to re-examination, and ultimately rejection, of the argument presented at the end of Sect. 3 for taking the next piece. That argument fails because it does not take her credences into account, and thus does not make expected utility maximization the criterion of rationality.

Let us make this argument precise for the simplified version of Satan's apple, in which Eve has to stop as soon as she refuses any piece. Eve has only the following profiles, which also represent the possible outcomes of the game:

ALL ≡ Eve takes all the pieces.
Tn ≡ Eve takes exactly n pieces.

It is convenient to introduce notation for the sequential decisions:

Sn ≡ Eve takes piece n.

18 Defining Eve's expected utility raises technical difficulties, so we put the point in terms of the number of slices expected.
We assume that Eve assigns a non-trivial prior probability that she takes all the pieces:

P(ALL) = p, where 0 < p < 1.

Here, P represents Eve's subjective probability assignment to a suitable family of propositions. Let p1, p2, . . . represent Eve's distribution over the possibilities Ti if she does not take all the pieces:

P(Ti /∼ALL) = pi, where Σ_{i=1}^∞ pi = 1.
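To preview where the conditionalization leads, here is a numeric sketch. The prior p and the geometric tail pi = 2^(−i) are toy values of our own choosing, not the paper's; with them, P(ALL/Sn) climbs toward 1, and rejecting the next piece eventually has higher news value than taking it:

```python
def stopping_stage(p, max_n=200, tail=400):
    """First stage n >= 2 at which EU(reject piece n) exceeds EU(take piece n),
    assuming p_i = 2**-i (so the tail mass sum_{i>=n} p_i is 2**(1-n))."""
    for n in range(2, max_n):
        tail_mass = 2.0 ** (1 - n)            # sum_{i >= n} p_i
        d = p + (1 - p) * tail_mass           # P(S_n)
        p_all = p / d                         # P(ALL | S_n), rising toward 1
        eu_take = sum((1 - p) * 2.0 ** (-k) / d * (10 - 5 / k)   # terms P(T_k|S_n)U(T_k)
                      for k in range(n, n + tail)) + p_all * (-990)
        eu_stop = 10 - 5 / (n - 1)            # utility of stopping with n - 1 pieces
        if eu_stop > eu_take:
            return n
    return None

# A high prior P(ALL) makes Eve stop at the first opportunity...
assert stopping_stage(0.5) == 2
# ...while a tiny prior only postpones the stopping stage; it still arrives.
assert stopping_stage(0.5) < stopping_stage(1e-6) < 200
```

The point the sketch illustrates is only that some stopping stage always exists when p > 0; its exact location depends on the assumed prior.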
Then Eve's updated probability for ALL if she takes piece n (having already taken pieces 1 through n − 1) is obtained via conditionalization:19

P(ALL/Sn) = p/[p + (1 − p) Σ_{i≥n} pi].

Since Σ_{i≥n} pi converges to 0 as n → ∞, P(ALL/Sn) converges to 1. Note that nothing more than countable additivity is needed here. If Eve is going to stop, then (by countable additivity) it is vastly more likely to happen 'early' than 'late' in her sequence of choices.

Now suppose that Eve uses the news value of her contemplated actions to compute expected utilities. She has already taken pieces 1 through n − 1 and contemplates taking piece n. Considering all of the ultimate possibilities, she computes:

EU(Sn) = Σ_{k≥n} P(Tk /Sn)U(Tk) + P(ALL/Sn)U(ALL)
EU(∼Sn) = 10 − 5/(n − 1).

Since U(Tk) ≤ 10 and U(ALL) = −990, there will clearly be a stage n for which EU(∼Sn) > EU(Sn). At this stage, if Eve is guided by evidential expected utilities, then she will not take piece n.

Does this argument for (eventually) stopping succeed? We think not, for several reasons. We might wonder whether Eve can treat her patterns of future choices as states to which she can assign probabilities.20 A second concern is that Eve should make her decisions based on their anticipated causal consequences, not their news value. But even if we set aside these concerns, there is a more fundamental problem: Eve's probability assignment is self-undermining. Specifically, any probability assignment with P(ALL) > 0 justifies an inference about what Eve will do that is inconsistent with that very probability assignment!

19 This follows from P(ALL/Sn) = P(Sn /ALL)P(ALL)/[P(Sn /ALL)P(ALL) + P(Sn /∼ALL)P(∼ALL)], since P(Sn /ALL) = 1 and P(Sn /∼ALL) = Σ_{i≥n} P(Ti /∼ALL) = Σ_{i≥n} pi.
20 See Spohn (1977), Gilboa (1994) and Levi (1997) for related discussion. In fact, while these commentators prohibit the assignment of a credence to acts about which one is presently deliberating, it is not clear whether the ban would apply to future acts.

In more detail: if Eve uses evidential decision theory to make her decisions, the evidence that she uses to assign probabilities p and pi to her actions is also, via the
calculation just given, evidence that she will stop and, indeed, evidence that tells her which stage will be the stopping point. If she takes this evidence into account, then Eve should revise her probabilities. The updated values are p′n = P′(Tn) = 1 if she will take exactly n pieces (stopping at stage n + 1), and p′ = 0 and p′i = 0 for the other stages. But this is not the probability assignment that led to Eve's decision to stop with n pieces.21

Eve's probability assignment P is self-supporting if it does not change under the evidential decision theory calculation: P′ = P. We've just seen that if P(ALL) > 0, P is not self-supporting. But if P(ALL) = 0, we face a similar problem of instability. Recall that in deciding whether to take piece n, Eve compares two expected utilities:

EU(Sn) = Σ_{k≥n} P(Tk /Sn)U(Tk) + P(ALL/Sn)U(ALL)

and

EU(∼Sn) = U(T_{n−1}) = 10 − 5/(n − 1).

Suppose that the conditional probabilities in the first equation are all defined. Since P(ALL/Sn) = 0, EU(Sn) is an average of utilities U(Tk) all at least U(Tn), and hence all greater than U(T_{n−1}); so EU(Sn) > EU(∼Sn). For any n, Eve should take piece n. But this provides evidence that Eve will never stop, undermining P(ALL) = 0.22

In some decision contexts, arguably, it makes sense for an agent to assign probabilities to her future choices. We have seen, however, that such an assignment is unacceptable if used as input for a decision that can alter those very probabilities, unless it is self-supporting under expected utility reasoning. A straightforward evidential decision-theoretic approach appears doomed. But there is an alternative approach in which Eve evaluates her present and future choices using deliberative dynamics.

5.4 Deliberative dynamics

Arntzenius et al. consider the hypothesis that Eve's present choice could causally influence her future choices:

It may be that Eve thinks that her present choice can influence her subsequent choices. If so, Eve should not hold fixed her subsequent choices in deciding what to do. Instead she should take into account how her present choice will influence them (2004, 266).

They contrast this with the 'no influence' case:
21 There is also a separate problem of determining the values of the conditional probabilities P(Tk /Sn+1) and P(ALL/Sn+1) when P(Sn+1) = 0.
22 This argument fails if P(Sn) = 0: we could then have P(ALL) = 0 but P(ALL/Sn) ≠ 0. Rather than explore this possibility here, we consider a similar idea in the next section.
…if she always believes that her present choice has no influence over her future choices—call this the 'no influence' case—then she is rationally required to take every piece (p. 267).

Arntzenius et al. consider only one form of influence: binding. In this section, we consider a different approach that takes into account both the influence of Eve's present choice over her future choices and the influence of her expectations about her future choices on her present choices. In the context of deliberation, influence flows in both directions. The basic idea, due to Skyrms (1990), is that the process of deliberation itself may yield evidence about what the agent is likely to do, evidence that is relevant to the calculation of expected utilities. Agents move towards maximizing expected utility in a way that utilizes this evidence.

Applying this idea to Satan's apple, suppose that Eve has concerns that she will never stop taking pieces of apple. As in Sect. 5.3, she assigns credences to her future acts. In contrast to Sect. 5.3, however, those credences 'evolve' dynamically in the course of her deliberations. She uses calculations of her future expected utilities as evidence about what her future choices will be (or would be if she were to get that far)—that is, as evidence for whether she will (or would) take or reject pieces of apple offered in the future. This may lead to revised credences about those acts, and to revised expected utility calculations. A necessary condition for making any firm decision is that, prior to making it, these credences constitute an equilibrium.

This approach differs significantly from binding. Eve does not change her preference ordering on outcomes: she still prefers more pieces to fewer pieces. Nor does she change her preference ordering on profiles. What changes for Eve is only her momentary preference for Take piece n over Reject piece n.
Even though it takes future acts into account, the perspective of Eve's decision problem remains local. She bases her choice on equilibrium expected utilities for her current options. On this analysis, the argument of Sect. 3 for taking each piece is fallacious because it ignores expected utilities. The deliberative dynamics thus rejects (IMPERM. BELONGS DOMINATED) in a principled way.

The deliberative dynamics also differs from the simple evidential decision theory approach of Sect. 5.3. Both approaches involve the assignment of credences to future choices and appeal to expected utility as the criterion for choice. In the case of deliberative dynamics, however, Eve is guided by expectations of her future choices rather than by the evidence of her past choices.

We begin with a brief review of deliberative dynamics (Skyrms 1990). In simple cases, beliefs are modified according to the following iterative process (assuming negligible processing costs):23

1. The agent selects a probability distribution P over presently available acts, reflecting an initial state of indecision.
2. The agent computes the utility of the status quo, U(SQ) = Σi P(Ai) · U(Ai), a weighted average of the utilities attached to the available acts. This is just the expected utility of the mixed act corresponding to the state of indecision.

23 More generally, the agent revises her credences for states as well as acts (Skyrms 1990).
3. The agent raises the probability of acts whose expected utility is greater than that of the status quo, and lowers the probability of acts whose expected utility is lower. The process is made explicit by a dynamical updating rule φ that maps P to another probability distribution P′.
4. The agent halts deliberation when the probabilities reach stable (equilibrium) values, i.e., when P′ = P.

Skyrms' dynamical updating rule is given by24:

P′(Ai) = P(Ai) · [U(Ai)/U(SQ)].

Skyrms (1990) offers a Bayesian defense of this particular updating rule, the "replicator dynamics." Here we take it for granted. We note, however, that most of our arguments depend only upon two features of this rule. First, there is no change for extreme probabilities (0 and 1). Second, for any act with probability intermediate between 0 and 1, the agent raises/lowers the probability if the expected utility is greater/smaller than the utility of the status quo.

How should the deliberative dynamics be applied to Satan's apple? To keep things manageable, we restrict our attention to the simplified version in which, once Eve refuses a single piece of apple, the process is over. The principal modification is that Skyrms' ideas must be applied to the whole sequence of choices, since that sequence enters into Eve's deliberations about what to do. Adapting the steps just listed, we have the following process:

1′. Eve assigns credences (or conditional credences) to all of her possible acts. As before, let Sn ≡ Eve takes piece n. We represent Eve's assignment as25

q1 = P(S1) and qn = P(Sn/S1 · S2 · . . . · Sn−1) if n > 1.

Since the only options at stage n are Sn and ∼Sn, this is a complete assignment. Notice also that Eve assigns prior probability q = ∏_{n=1}^∞ qn to taking all the pieces.

2′. For each stage, Eve computes expected utilities for her two possible choices.
Letting Tk ≡ Eve takes exactly k pieces, we have the following:26

EU(Sn) = Σ_{k≥n} P(Tk/Sn) · U(Tk) + P(ALL/Sn) · U(ALL)
EU(∼Sn) = 10 − 5/(n − 1) if n > 1 (or 0, if n = 1)

Notice that we can compute the relevant probabilities from the values qn:

24 Note: this requires a utility function that is positive and bounded away from 0.
25 The conditional probabilities are primitive; they are defined even if the proposition conditioned upon has probability 0. Indeed, qn may be interpreted as Eve's credence that she would take piece n if she were to take pieces 1, …, n − 1. These probabilities are not updated based on the evidence of Eve's past choices; instead, as we shall see, they are updated on the basis of her deliberations.
26 Note that the replicator dynamics requires a utility function that is positive and bounded away from 0. This is easily obtained by adding +1001 to Eve's utility function as it was defined in Sect. 2.
P(Tn/Sn) = (1 − qn+1)
P(Tn+1/Sn) = qn+1 (1 − qn+2), etc.
P(ALL/Sn) = ∏_{i=n+1}^∞ qi
Notice also that only credences about later choices play a role in Eve's calculation; her credences for acts at stage n have no relevance to the deliberations at stage n. The value of the status quo follows:

EU(SQn) = qn EU(Sn) + (1 − qn) EU(∼Sn).

3′. Eve revises her credences by applying the replicator dynamics at each stage. Let the revision rule be φ. We have:

φ(qn) = q′n = qn EU(Sn)/EU(SQn).

4′. Eve halts deliberation when the probabilities reach stable equilibrium values: φ(qn) = q′n = qn for all n.

As an important special case, and to illustrate how the dynamics works, suppose that Eve starts with a fixed probability r of taking each piece of apple, given that she has accepted all prior pieces. That is: qn = r for all n. If 0 < r < 1, then, at each stage, the probability of taking all subsequent pieces is 0. This implies (via 2′) that the expected utility of Take piece n is greater than that of Reject piece n, for every n. This in turn implies that Eve should raise all of her credences in Take piece n, which, according to the deliberative dynamics, implies that the value of each qn should be revised upwards. It follows that no such assignment constitutes an equilibrium.

The cases r = 0 and r = 1 are special. Suppose first that qn = 1 for all n > 1 (the value of q1 plays no part in Eve's deliberations). Eve foresees that she will take each piece if she takes piece 1, and also (from 2′) that the expected utility of Reject piece n is greater than that of Take piece n, for every n. The revision rule (3′), however, leaves her extreme credences unchanged: q′n = qn = 1 for all n > 1. We have an equilibrium—our first example—in which Eve opts to Reject piece 1. If instead qn = 0 for all n > 1, then Eve foresees that she will reject each piece if she takes piece 1, and also (from 2′) that the expected utility of Take piece n is greater than that of Reject piece n, for every n.
Once again, the revision rule (3′) leaves extreme credences unchanged: q′n = qn = 0 for all n > 1. We have another equilibrium, but this time expected utility computations lead Eve to Take piece 1. Eve foresees that she will freely choose to take each piece, all the while maintaining credence zero that she will do so. We shall have more to say about this very strange equilibrium shortly.

Clearly, every extreme assignment—where for every n, qn = 0 or qn = 1—constitutes an equilibrium, because such assignments never change under the replicator dynamics. It turns out that the converse is also true.

Theorem 5.1 Every equilibrium is an extreme assignment. (For all conditional probabilities qn = P(Sn/S1 · . . . · Sn−1), either qn = 0 or qn = 1.)

Proof See Appendix 2.
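The process 1′–4′ and these equilibrium claims can be checked numerically. The sketch below is our own illustration: it truncates the problem at N stages (which distorts late stages, where the truncated 'ALL' event looms in a way it never does in the infinite case), and it assumes U(ALL) = −1000 together with the +1001 shift of footnote 26; the value of U(ALL) is not given in this section and is an assumption of the example.

```python
# Numerical sketch of steps 1'-4' for Satan's apple, truncated at N stages.
# Assumed for illustration: U(ALL) = -1000, plus the +1001 shift of
# footnote 26 so that all utilities are positive.

N = 20

def U_T(k):
    """Shifted utility of taking exactly k pieces: 10 - 5/k (0 if k = 0), plus 1001."""
    return (0.0 if k == 0 else 10.0 - 5.0 / k) + 1001.0

U_ALL = -1000.0 + 1001.0  # assumed utility of taking every piece, shifted

def expected_utilities(q, n):
    """EU(S_n) and EU(~S_n), where q[i] stores q_{i+1} = P(S_{i+1}/S_1...S_i)."""
    eu_take, cont = 0.0, 1.0
    for k in range(n, N):                 # P(T_k/S_n) = q_{n+1}...q_k (1 - q_{k+1})
        eu_take += cont * (1.0 - q[k]) * U_T(k)
        cont *= q[k]
    eu_take += cont * U_ALL               # cont is now P(ALL/S_n), truncated at N
    return eu_take, U_T(n - 1)            # rejecting piece n leaves exactly T_{n-1}

def update(q):
    """One pass of 3': q_n' = q_n EU(S_n) / EU(SQ_n)."""
    out = []
    for n in range(1, N + 1):
        eu_s, eu_not = expected_utilities(q, n)
        eu_sq = q[n - 1] * eu_s + (1.0 - q[n - 1]) * eu_not
        out.append(q[n - 1] * eu_s / eu_sq)
    return out

print(update([1.0] * N) == [1.0] * N)  # True: an extreme assignment is a fixed point
print(update([0.0] * N) == [0.0] * N)  # True: so is the all-zeros assignment
q1 = update([0.5] * N)
print(q1[0] > 0.5)                     # True: upward pressure at stage 1 when each q_n = 0.5
```

Note one artifact of the truncation: at late stages P(ALL/Sn) is no longer negligible, so, unlike the infinite constant-r case described above, the constant assignment is not pushed upward at every stage of the truncated model.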
Our first observation is that there are no stable equilibria. Every one of these extreme assignments is unstable: the equilibrium is not always restored when the assignment is altered in a small way. For instance, suppose all qn = 1. If Eve starts instead with a 'nearby' assignment that assigns a value slightly less than 1 to a single qm (with qn = 1 for all n ≠ m), the revision process leads to a different equilibrium in which qm = 0. Similarly, suppose all qn = 0. If Eve starts with a 'nearby' assignment that assigns a value slightly above 0 to a single qm, the revision process leads to a different equilibrium in which qm = 1. More formally, we have the following definition.

Definition 5.2 Stable equilibrium. An assignment {qn} is a stable equilibrium if (a) φ(qn) = qn for all n, and (b) for some ε > 0, if {qn*} is any 'nearby' assignment (qn* = qn for all but finitely many n and |qn* − qn| < ε), then {qn*}, {φ(qn*)}, {φ(φ(qn*))}, … converges pointwise to {qn}.27

The arguments of the preceding paragraph extend to a proof that no extreme assignment can be a stable equilibrium. Suppose that {qn} is an extreme assignment. Either limn→∞ qn = 1 or not. In the former case, there is an N such that qn = 1 for n > N. If one of these values (beyond N) is modified to a value qm slightly less than 1, the revision process leads to an equilibrium in which qm = 0. In the latter case, there are infinitely many qn = 0. If one of these is modified to a value qm > 0, the revision process leads to an equilibrium in which qm = 1.

The basic requirement of the deliberative dynamics is that a rationally permissible decision should maximize expected utility, using an equilibrium assignment of probabilities to all of Eve's subsequent choices. The further requirement of stability has some appeal. After all, it seems unrealistic for Eve to hold fixed and unchangeable beliefs about her future choices. But given that all equilibria fail the stability test, we have to reject the restriction to stable equilibria.

It does seem, however, that some extreme assignments are less objectionable than others. For instance, the assignment qn = 1 for all n > 1 is less objectionable than the assignment qn = 0 for all n > 1. With the former, expected utility calculations lead Eve to reject the first piece, but she can consistently maintain that had she accepted that piece, she would have gone on to accept all of the others. With the latter, expected utility calculations foreseeably lead Eve to take every piece, yet she somehow holds on to her zero credences! This is surely a form of inconsistency.
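The instability of the all-ones assignment can be checked with a tiny computation. When qn = 1 for every n > m, stage m's update reduces to a one-dimensional map, since EU(Sm) = U(ALL) and EU(∼Sm) = U(Tm−1). The particular utility values below are illustrative stand-ins (chosen only so that U(ALL) is far below the utility of stopping), not figures from the text.

```python
# One-dimensional reduction of the dynamics at a perturbed stage m, assuming
# q_n = 1 for all n > m. Illustrative utilities with U(ALL) << U(T_{m-1}):
U_ALL, U_REJECT = 1.0, 1010.0

def step(q):
    """q' = q * EU(S_m) / EU(SQ_m), with EU(S_m) = U_ALL and EU(~S_m) = U_REJECT."""
    return q * U_ALL / (q * U_ALL + (1.0 - q) * U_REJECT)

q = 0.999           # a 'nearby' assignment: q_m nudged slightly below 1
for _ in range(10):
    q = step(q)
print(q < 1e-6)     # True: the perturbed q_m collapses toward 0
```

By contrast, step(1.0) returns 1.0 and step(0.0) returns 0.0 exactly, matching the observation that extreme credences never move under the updating rule.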
A similar difficulty exists for any assignment in which qi = qj = 0 for some i < j: expected utility calculations foreseeably lead Eve to take piece i, all the while maintaining qi = 0.28

Let's say that an extreme assignment {qn} for Eve is practically consistent if it does not imply the existence of a choice that is both foreseeable and directly contrary to the credences {qn}. All practically consistent equilibria have either the form

27 This definition is meant to correspond to the familiar notion of an evolutionarily stable strategy; see Skyrms (1996).
28 She takes piece i because the probability of taking the later piece j is zero, hence Pr(ALL/Si) = 0, and therefore (from 2′) the expected utility of taking piece i is higher than that of not taking it. She maintains qi = 0 because, as we have seen, extreme assignments never change under the updating rule.
{1, 1, . . . , 1, 1, 1, 1, . . .} or {1, 1, . . . , 1, 0, 1, 1, . . .},

in which qn = 1 for all but at most one n. If Eve has such an assignment, her expected utility calculations will foreseeably lead her to take m pieces and then stop, where m is the position of the single 0 (and m = 0 in the case where qn = 1 for all n). It seems reasonable to adopt an amended version of the deliberative dynamics, requiring that Eve's assignment at equilibrium should be one of these practically consistent assignments.

Strikingly (and ironically), it appears that we have come full circle, back to the same problem that we started with and the same problem that confronts advocates of binding. There is one assignment corresponding to each finite stopping point. Which one should Eve select?

Actually, it is not quite the same problem. For one thing, in contrast to the binding approach, the amended deliberative dynamics rules out as irrational any assignment that leads Eve to take all the pieces, whether or not she is capable of binding. (Recall that if she is not capable, then the binding approach deems it rationally required for Eve to take all the pieces.) Eve has a clear 'local' justification for stopping at piece m: according to her equilibrium probabilities, stopping has a higher expected utility than taking piece m. Even better: if Eve reaches one of these equilibrium assignments as the limit of updating from her initial set of credences, the deliberative dynamics offers a response to the problem of which equilibrium assignment she should select. Eve does not face the problem of selecting one of these assignments on neutral rational grounds. Instead, her probabilities naturally evolve to such an assignment from a given starting point.

There are significant obstacles to this line of reasoning. We need to spell out the revision rule for the amended deliberative dynamics. More significantly, our discussion has been largely confined to the characterization of equilibria.
A complete analysis should examine convergence behavior, i.e., whether successive updating on any initial set of credences leads to an equilibrium. Finally, additional challenges may exist for the original version of Satan's apple and for puzzles that involve unbounded utilities. This is as far as the deliberative dynamics takes us at present.

6 Conclusion

Within infinite decision theory, we find a hierarchy of increasingly difficult sequential decision problems. First, there are infinite decision problems where the agent's utility function is continuous (in the sense of Definition 4.2) over the domain of available strategies. We conjecture that finite decision theory extends straightforwardly to such problems. (We say 'extends' because the utility function is necessarily continuous for a finite decision problem: if there are only finitely many profiles X = {A1, …, AN}, and {An} converges pointwise to A, then An = A for large enough n, so U(An) converges to
U(A).) There is no discontinuity at infinity, and there are no failures of agglomeration such as those which occur in Satan's apple. 'Continuous' infinite decision problems thus represent a safe first stage in the hierarchy.

Then there are infinite decision problems where continuity fails, but in a benign way, as illustrated by the variant of Satan's apple in which Satan offers Eve a bonus for taking infinitely many pieces of apple. Decision problems with benign discontinuities pose no additional difficulty.

The next step is the one taken here, to decision problems involving discontinuities in which the agent's preferences are transfinitely intransitive. That is the situation in all three of our puzzles. We have offered no help in dealing with the synchronic (non-sequential) versions of the puzzles. For the diachronic (sequential) versions, however, the analysis of Sects. 4 and 5 suggests two possibilities.

The first possibility is that the agent can legitimately transform and solve the decision problem using one of the techniques explored in Sect. 5. We believe that the Deliberative Dynamics approach introduced in Sect. 5.4 is the most promising one, for reasons that stand out clearly when we compare it to Binding. While both approaches solve Satan's apple by adding something beyond the initial formulation of the problem, only Binding alters that formulation. Since Binding requires Eve to elevate one strategy above the rest, it either changes her initial preference ordering on strategies (thus eliminating the discontinuity at the heart of the puzzle) or compels her not to take her own preferences seriously. Thus, Binding does little to advance our understanding of Eve's predicament or, more generally, of infinite decision theory. By contrast, the Deliberative Dynamics requires no initial choice of any strategy.
It leaves Eve’s preference ordering on strategies (and the discontinuity) in place, but rejects the inference from this ordering to the conclusion that Eve is rationally required to take piece n, for all n. The Deliberative Dynamics provides a rationale (the requirement of equilibrium credences) for a stopping point, and it offers a clear framework for thinking about Satan’s apple and related puzzles. At present, it remains unclear how far the Deliberative Dynamics approach can be extended to other puzzles, particularly those involving unbounded utility. Indeed, this suggests another level of difficulty in the hierarchy, for problems involving a discontinuity with transfinitely intransitive preferences and unbounded utilities. One further comparison between Binding and the Deliberative Dynamics is in order. As noted earlier, Eve’s vulnerability to a money pump leads us to ask: what defect of rationality does she exhibit? The Binding approach provides no means of answering this question for agents who cannot bind, and a non-illuminating answer for agents who can bind, namely, that Eve is irrational unless she arbitrarily chooses a strategy and binds herself to it—that is, unless she arbitrarily changes her preferences from the ones that were given at the outset, or arbitrarily fails to act on those preferences. The Deliberative Dynamics suggests a more subtle answer: the money pump indicates a lack of harmony among Eve’s credences (since the money pump disappears at equilibrium). We turn, finally, to the second possibility: that neither Binding nor the Deliberative Dynamics provides a legitimate solution to the infinite decision problem. If we encounter any such problem, then the conclusion of Sect. 4 appears unavoidable: the money pump argument in general is unsound.
Our final conclusion, then, may be stated as a disjunction. Either the Deliberative Dynamics offers a powerful technique for solving decision problems of this difficult type, or else such problems provide a compelling objection to unrestricted money pump arguments, and it then becomes an important task for decision theory to determine the legitimate scope of those arguments. Either way, the distinctive headaches caused by discontinuities at infinity have something to teach us about decision theory.

Acknowledgments We thank especially Sharon Berry, Rachael Briggs, Kenny Easwaran, Christopher Hitchcock, Yoaav Isaacs, Ralph Miles, Daniel Nolan, Paolo Santorio, Wolfgang Schwarz, Julia Staffel, and Orri Stefánsson for very helpful discussion. One of us (Paul Bartha) also acknowledges support from Australian National University in the form of a Visiting Research Fellowship.
Appendix 1: Continuity and transitivity

In this section, our main objective is to state and prove an interesting partial converse to Theorem 4.1. To state this converse, we need to define two notions of limit.

First, let U be a real-valued function on a topological space X, not necessarily continuous. The limit of U(x′) as x′ → x (or for short, the limit of U at x) is defined to be the unique real number r with the following property: for any open interval (a, b) around r, there is an open neighborhood V of x such that U maps V − {x} into (a, b). If no such r exists, the limit of U at x is undefined. The idea here is that we can make U(x′) as close as we like to r by making x′ sufficiently close to x, provided x′ ≠ x. Continuity is definable in terms of limits: U is continuous at x iff the limit of U at x is U(x).

Transfinite intransitivity implies that the limit of U at A either does not exist or exceeds U(A). When the limit of U at A exists, this necessary condition for transfinite intransitivity is almost sufficient. To formulate a sufficient condition, we need one more definition. A topological space X is said to be first countable if for each point x, there is a countable family {Vi : i = 1, 2, …} of open neighborhoods of x such that for any open neighborhood V of x, Vi ⊆ V for some i. Such a family is called a countable base at x. All of the spaces we are concerned with in this paper are first countable. We then have the following result.

Theorem 7.1 Let U be a real-valued function on a first countable topological space X, and for all x ∈ X, let U*(x) be the limit of U at x. If U* is well-defined throughout X (i.e., limits exist everywhere), then the ordering ≤ generated by U is transfinitely intransitive iff there is an x ∈ X such that (a) U(x) < U*(x), and (b) every neighborhood of x contains a point x′ such that U(x′) < U*(x).

Proof Left-to-right: Let x < x1 < x2 < · · · → x. Then U(x) < U(x1) < U(x2) < · · ·. The sequence U(x1), U(x2), . . . converges to U*(x), and this value is clearly greater than any of the U(xi), and thus greater than U(x). Thus, (a) holds. For (b), note that any open neighborhood V of x contains some of the points xi, and since U(xi) < U*(x), (b) holds.

Right-to-left: Suppose x satisfies (a) and (b), and let {Vi : i = 1, 2, …} be a countable base at x. By (b), for each i we can pick yi ∈ Vi such that U*(x) > U(yi). Any open neighborhood of x contains some Vi, and hence contains all but finitely many of the yi, so the sequence y1, y2, . . . converges to x. Also, for every i, there is a j > i such that U(yi) < U(yj). To see this, simply pick an interval (a, b) around U*(x) that excludes U(yi). By the definition of the limit, there is an open neighborhood V of x such that U maps V − {x} into (a, b), and there is some Vj ⊆ V; this gives us the required j > i. Thus, we can pick a subsequence x1, x2, . . . of y1, y2, . . . such that U(x1) < U(x2) < · · ·. This subsequence clearly converges to x, and the sequence U(x1), U(x2), . . . converges to U*(x). By (a), U(x) < U*(x), and thus U(x) < U(xi) for some i. Hence, the ordering generated by U is transfinitely intransitive.

If we drop the condition that U has a well-defined limit everywhere, we are not aware of any simple necessary and sufficient condition for the transfinite intransitivity of the generated ordering ≤.

There is a second connection between transfinite intransitivity and continuity. Instead of assuming that the preference ordering ≤ comes from a utility function, let us simply assume that ≤ is complete, meaning that for all A, B ∈ X, either A ≤ B or B ≤ A. There is then a standard definition of continuity for preference orderings. Specifically, for any given point A, let UA = {B : A ≤ B} and LA = {C : C ≤ A}. Then ≤ is said to be continuous if for every A, the sets UA and LA are topologically closed. (A set S is closed iff whenever a sequence of points in S converges, it converges to a point in S.) Now suppose that ≤ is transfinitely intransitive, and let A < A1 < A2 < · · · → A. Then the set {B : A1 ≤ B} is not closed, because there is a sequence of points from that set whose limit, A, is not in that set. Thus, transfinite intransitivity of ≤ implies discontinuity of ≤ in the standard sense, further justifying our talk of "discontinuity at infinity."

Appendix 2: Extreme equilibria

The purpose of this section is to prove Theorem 5.1. We first prove a result based upon an elementary fact about infinite products.
Lemma limn→∞ P(ALL/Sn) = 1 iff Σ_{n=1}^∞ (1 − qn) < ∞, and P(ALL/Sn) = 0 for all n iff Σ_{n=1}^∞ (1 − qn) = ∞.

Proof If 0 ≤ un < 1, then ∏_{n=1}^∞ (1 − un) > 0 iff Σ_{n=1}^∞ un < ∞ (Rudin 1974, p. 322). Recall that P(ALL/Sn) = ∏_{i=n+1}^∞ qi. If qi = 0 for infinitely many i, then P(ALL/Sn) = 0 for all n and Σ_{n=1}^∞ (1 − qn) = ∞. If qi = 0 for only finitely many i, it follows from (Rudin 1974) that either P(ALL/Sn) = 0 and Σ_{i=n+1}^∞ (1 − qi) = ∞, or P(ALL/Sn) > 0 and Σ_{i=n+1}^∞ (1 − qi) < ∞. In the latter case, P(ALL/Sn) must converge to 1 as n → ∞.
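The elementary fact behind the Lemma can be illustrated numerically. The two sequences below are our own illustrative choices: with qn = 1 − 1/(n+1)², the sum Σ(1 − qn) converges and the partial products stay bounded away from 0; with qn = n/(n+1), the sum diverges and the partial products tend to 0.

```python
# Numeric illustration of the fact cited from Rudin (1974, p. 322):
# for 0 <= u_n < 1, prod(1 - u_n) > 0 iff sum(u_n) < infinity.

def partial_product(u, N):
    """Compute prod_{n=1}^{N} (1 - u(n))."""
    prod = 1.0
    for n in range(1, N + 1):
        prod *= 1.0 - u(n)
    return prod

# Summable tail: u_n = 1/(n+1)^2, i.e. q_n = 1 - 1/(n+1)^2.
# The product prod_{k=2}^{M} (1 - 1/k^2) = (M+1)/(2M) tends to 1/2.
p_pos = partial_product(lambda n: 1.0 / (n + 1) ** 2, 100_000)
print(round(p_pos, 3))   # 0.5: bounded away from 0

# Divergent tail: u_n = 1/(n+1), i.e. q_n = n/(n+1).
# The product telescopes to 1/(N+1) and tends to 0.
p_zero = partial_product(lambda n: 1.0 / (n + 1), 100_000)
print(p_zero < 1e-4)     # True
```

In the Lemma's terms: the first choice of qn makes P(ALL/Sn) positive (and tending to 1 as n grows), while the second makes P(ALL/Sn) = 0 for every n, even though no individual qn is 0.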
Theorem 7.2 Every equilibrium is an extreme assignment. (For all conditional probabilities qn = P(Sn /S1 · . . . · Sn−1 ), either qn = 0 or qn = 1.)
Proof Suppose that {qn} is an equilibrium probability assignment. Either Σ_{n=1}^∞ (1 − qn) < ∞ or Σ_{n=1}^∞ (1 − qn) = ∞. If Σ_{n=1}^∞ (1 − qn) = ∞, then P(ALL/Sn) = 0 for all n. It follows that for all n, the expected utility calculations in 2′ favour Take
piece n. The dynamics implies upward pressure (q′n > qn) unless qn = 0 or qn = 1. Since we have an equilibrium, it follows that qn = 0 or qn = 1 for all n. Hence, we have an extreme assignment.

If instead Σ_{n=1}^∞ (1 − qn) < ∞, then limn→∞ P(ALL/Sn) = 1; also, limn→∞ qn = 1. Combining these facts with 2′, Eve reaches a point N beyond which each qn is close to 1 and P(ALL/Sn) is so large that her expected utility computations favour Reject piece n. For n > N, the dynamics implies downward pressure (q′n < qn) unless qn = 1. Since we have an equilibrium, it follows that qn = 1 for all n > N. So the 'tail end' of the probability assignment, from N + 1 onwards, consists entirely of 1s.

To finish up, we show that, for this case, each of q1, . . ., qN must also be 0 or 1. If not, let k be the largest index such that 0 < qk < 1. If qn = 0 for some n > k, then the dynamics implies upward pressure on qk: since P(ALL/Sk) = 0, EU(Sk) > EU(∼Sk) and q′k > qk, contradicting the fact that we have an equilibrium. If qn = 1 for all n > k, then the dynamics implies downward pressure on qk: since P(ALL/Sk) = 1, EU(Sk) < EU(∼Sk) and q′k < qk, again contradicting the fact that we have an equilibrium. Hence no such k exists, and each of q1, . . ., qN must be 0 or 1.

Hence, the equilibrium assignments for Eve are exactly the extreme assignments in which all of her conditional probabilities qn = P(Sn/S1 · . . . · Sn−1) are either 0 or 1.
References

Arntzenius, F., Elga, A., & Hawthorne, J. (2004). Bayesianism, infinite decisions, and binding. Mind, 113, 251–283.
Atkinson, D., & Johnson, P. (2010). Nonconservation of energy and loss of determinism. II. Colliding with an open set. Foundations of Physics, 40, 179–189.
Barrett, J., & Arntzenius, F. (1999). An infinite decision puzzle. Theory and Decision, 46(1), 101–103.
Batterman, R. (2002). The devil in the details: Asymptotic reasoning in explanation, reduction and emergence. New York: Oxford University Press.
Batterman, R. (2005). Critical phenomena and breaking drops: Infinite idealizations in physics. Studies in History and Philosophy of Modern Physics, 36(2), 225–244.
Gilboa, I. (1994). Can free choice be known? In C. Bicchieri, R. Jeffrey, & B. Skyrms (Eds.), The logic of strategy. Oxford: Oxford University Press.
Hájek, A. (2008). Dutch book arguments. In A. Paul, P. Prasanta, & P. Clemens (Eds.), The Oxford handbook of rational and social choice (pp. 173–195). Oxford: Oxford University Press.
Jeffrey, R. (1983). The logic of decision (2nd ed.). Chicago: University of Chicago Press.
Levi, I. (1997). The covenant of reason: Rationality and the commitments of thought. Cambridge: Cambridge University Press.
Luce, D., & Raiffa, H. (1957). Games and decisions: Introduction and critical survey. New York: Wiley.
Meacham, C. (2010). Binding and its consequences. Philosophical Studies, 149(1), 49–71.
Norton, J. D. (2012). Approximation and idealization: Why the difference matters. Philosophy of Science, 79(2), 207–232.
Nover, H., & Hájek, A. (2004). Vexing expectations. Mind, 113, 305–317.
Pollock, J. (1983). How do you maximize expectation value? Noûs, 17, 409–421.
Resnik, M. (1987). Choices: An introduction to decision theory. Minneapolis, MN: University of Minnesota Press.
Rudin, W. (1974). Real and complex analysis (2nd ed.). New York: McGraw-Hill.
Skyrms, B. (1990). The dynamics of rational deliberation. Cambridge, MA: Harvard University Press.
Skyrms, B. (1996). Evolution of the social contract. Cambridge: Cambridge University Press.
Slote, M. (1985). Common-sense morality and consequentialism. London: Routledge and Kegan Paul.
Slote, M. (1989). Beyond optimizing: A study of rational choice. Cambridge, MA: Harvard University Press.
Spohn, W. (1977). Where Luce and Krantz do really generalize Savage's decision model. Erkenntnis, 11, 113–134.