JOSEPH ALMOG
WOULD YOU BELIEVE THAT?*

"Why sometimes I've believed as many as six impossible things before breakfast." (Lewis Carroll, Alice in Wonderland)
Synthese 58 (1984) 1-37. 0039-7857/84/0581-0001 $03.70. © 1984 by D. Reidel Publishing Company.

1. GENERAL PRELIMINARIES

In recent years certain prominent logicians and linguists, working within the framework of the so-called "model-theoretic semantics for natural language", have come to think that the semantics of the verb "believes" (and other attitudinal verbs) casts a grim shadow on the whole project of framing a model-theoretic semantics for natural language fragments. They have noted that belief contexts resist all the successive improvements of formalizations of the concept of "meaning", and that even in the most sophisticated theories, standard rules of inference still fail when applied in the province of belief contexts. In the present paper, I shall try to consider their objections and advance a new solution to the semantics of belief couched in a general departure from the present, allegedly unsuccessful, theories.

However, before we can proceed to the discussion of belief contexts, some general remarks on the formal background relative to which these discussions take place are in order. Common to all the variants of this tradition is the following methodology: in order to formalize the semantics of a given language, the semanticist proceeds in the way one proceeds when giving a formal semantics to a formal language. That is, one constructs models for the given language and defines, relative to them, the central semantic notions: truth, validity, consequence, etc. In the simplest case, a first order (extensional) predicate language, models take the form of an ordered pair (U, [ ]), where U is the set of individuals, the domain of the model, and [ ] is the semantic value function: it assigns semantic values (individuals in U for individual constants, sets of individuals in U for predicates) to expressions of the language. In such a language, there is only one hidden variable in the truth definition: the choice of U influences the truth values which sentences
will be assigned relative to the model. However, we can enrich our base language by various locutions whose semantics calls for the enrichment of our models. A standard extension takes place by adding intensional operators, like modal locutions ("necessarily", "possibly"), temporal locutions ("it was the case", "it will be the case"), locational operators ("everywhere it is the case", "somewhere it is the case"), etc. The most common practice is to consider modal extensions. In order to assign semantic values to modal locutions we enrich (U, [ ]) by a modal structure: a triple (w, W, R), where W is an index set of worlds, w is a distinguished member of W and R is a relation on W. The semantic value function is now indexed by a world. In more complicated cases temporal or locational locutions are added as well, and the model is enriched in the proper way, so that [ ] is indexed by an n-tuple of a world, time, location, etc. (Kaplan's "circumstance of evaluation"). We should note that the enrichment of our extensional language by intensional locutions generated an ascent in what I shall call levels of meaning. Originally, we had a one-level theory of meaning: meaning was identified with extension, the meanings of expressions were the extensions assigned to them by [ ]. Now we find ourselves with two levels of meaning: we have at the first level our old extensions, but the second level is filled up with functions, which, when applied to a world, yield an extension, viz., for an expression "E", [E] applied to w yields [E]w. These functions on possible worlds (which yield syntactically appropriate extensions) are usually called intensions. A two level theory of meaning is capable of assigning two types of semantic values to its expressions: extensions and intensions.
Some intensional locutions, like modal operators (and, according to some, belief operators), call for the intensions of expressions, not only the extensions, in computing the semantic value of the complex locution. D. Kaplan (1977) has argued that the addition of indexical locutions to the language calls for a further ascent in the hierarchy of meanings. So far, we had the domain U and the possible circumstance of evaluation w as semantic determinants, hidden variables, of the truth definition. To define truth for indexicals, we need a new semantic determinant: the context of use. Therefore, we enrich our models by a contextual structure: An n-tuple, whose first member is the index set of contexts C, and the rest of the members are functions on C, which retrieve the speaker of C (S(c)), its addressee (A(c)), the set of objects demonstrated (D(c)), the time of utterance, the place, etc. Now,
the semantic value function is indexed both by w ∈ W and by c ∈ C. We have reached the third level of meaning. We now have functions, characters, from contexts to the second level entities: intensions. Setting the character of "E", [E], in a context c, we get an intension [E]c. Setting the latter in a circumstance w, we get an extension, [E]c,w. (The recent approach of Perry and Barwise, the so-called "Situation Semantics", is also a three level theory in this sense.)¹

It is against this formal background that current discussions take place. Some have suggested that intensions are the proper formalizations of meanings and hence should serve as objects of the belief-attitude. Some have claimed that characters are the meanings of expressions, and hence belief operators should attach to characters. All this will be discussed below in detail. Here it suffices to note that: (a) The present discussion assumes the hierarchy, and does not argue for it. That is, the model theoretic apparatus is assumed and, as such, the discussion is from within, viz. from a standpoint which is committed to a certain formalization strategy. (b) Inasmuch as a fourth level of meaning will be mentioned, the numbering of the levels refers to the hierarchy discussed above. This will be a theory which calls for an ascent over and above characters.
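The three levels just described can be pictured as nested functions: a character takes a context to an intension, and an intension takes a world to an extension. The Python sketch below is only an illustration under stated assumptions. The toy `Context` record, the world labels, and the names `character_I`, `character_of_name` and `extension` are hypothetical and are not part of the formal apparatus discussed in the text.

```python
from dataclasses import dataclass
from typing import Callable

World = str        # toy stand-in for a circumstance of evaluation (assumption)
Individual = str   # toy stand-in for a member of the domain U (assumption)


@dataclass(frozen=True)
class Context:
    """A toy context of use, reduced to its speaker and its world."""
    speaker: Individual
    world: World


# Level 1: extensions. Level 2: intensions. Level 3: characters.
Intension = Callable[[World], Individual]
Character = Callable[[Context], Intension]


def character_I(c: Context) -> Intension:
    """Character of "I": in context c, the constant intension of c's speaker."""
    return lambda w: c.speaker


def character_of_name(bearer: Individual) -> Character:
    """A stable character: the same rigid intension in every context."""
    return lambda c: (lambda w: bearer)


def extension(ch: Character, c: Context, w: World) -> Individual:
    """Compute [E]c,w: set the character in c, then the intension in w."""
    return ch(c)(w)


c1 = Context(speaker="Joseph", world="w0")
c2 = Context(speaker="David", world="w0")
joseph = character_of_name("Joseph")
```

In this sketch `extension(character_I, c1, "w1")` and `extension(character_I, c2, "w1")` differ because the context differs, while `joseph` yields the same individual whatever context and world are supplied. That is the sense in which an indexical has a non-stable character and a non-indexical term a stable one.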
2. BELIEF SENTENCES: SOME PRELIMINARIES
Semantic theories of natural language have usually sought to connect belief and meaning in the following way: given two sentences S, S' of the language in question, if M is the meaning of S and M' the meaning of S' and M = M', then if j is a competent speaker, if he believes that S, he believes that S'. This is a very rough statement the nature of which can be clarified only when we try to make the notion of "meaning" precise, viz. say what M, M' are. The history of semantics is full of such attempts. At the simplest level, we find Russell's attempt to tie meanings to extensions and see whether the prediction about the substitutivity (from S to S') is justified. In the Principia, Russell didn't fail to notice that even though ext(S) may be ext(S'), ext(j believes that S) may not be ext(j believes that S'). This is especially clear when one identifies, as in a Frege-Carnap semantics, the extension of a sentence with its truth value. The same phenomenon reappears in the second level of meaning
where even though int(S) = int(S'), we still have the possibility that ext(j believes that S) ≠ ext(j believes that S'). There are two types of instances which can serve as examples here: (i) logico-mathematical cases, (ii) Kripke-type metaphysical cases. In the first category we find logically equivalent sentences which share their intensions, and in the second category we find terms which share extensions necessarily. Thus even though int(Mark Twain) = int(Samuel Clemens), John may be a competent speaker who believes that Twain is a famous writer but who fails to believe that Clemens is a famous writer. Similarly, int(Gold) = int(The substance with atomic number 79), but John may believe that Gold is precious without having such an opinion of the substance with atomic number 79. Carnap's subsequent improvement of his original proposal was aimed at counterinstances of type (i). His proposal was to use a stricter notion of intensional identity. Basically, that notion incorporated the syntactic structures of the sentences in question, so that a notion of intensional isomorphism emerged as an alleged guarantee for substitutivity: two sentences are intensionally isomorphic if they share intensions and are built (syntactically) in the same way from their constituent phrases. But then came Mates' puzzle: John may believe that all Greeks are Greeks but he may fail to believe that all Greeks are Hellenes. Mates' puzzle was originally phrased with iterated belief contexts, but at present I ignore this further complication, which we shall discuss in detail below. At this stage it suffices to note the problem as it occurs already at the level of first order belief reports. In that case, John believes that S and fails to believe that S', where S, S' have no occurrences of attitudes and S and S' are intensionally isomorphic. The problem can be replicated even at Kaplan's third level.
There are two cases which are of interest here: (i) Cases with substituends which aren't indexical (as in the original case of Mates). Since the character of such expressions is a constant function on contexts, it reduces to their intension and we are back at the original case. (ii) Embedded sentences which involve indexicals. The two substituends, even though they are indexicals and have the same character, may not be intersubstitutable in belief contexts. (I will discuss this case in detail below since it involves an independently problematic issue of finding two indexicals which share characters.)
This sketchy presentation of the substitution problem is of course not immune from criticism. Some may question the very data by claiming that in some cases we wish attitudinal contexts to be transparent in some sense, and surely intensional (or character) equivalence is sufficient for substitution. Such objections reflect the fact that various senses of the verb "believes" may be distinguished and that much care needs to be taken to make it clear with respect to which sense substitution fails. We must make clear whether what matters is the preservation of the truth value of the belief, the content of the belief, or even the very belief state. Also, much care needs to be taken to make it clear whether our attributee doesn't suffer from linguistic incompetence which would easily explain the failures of interchangeability in a case like Mates'. I completely agree that suggestions that this or that type of equivalence (first level, second level, or even third level) do not suffice can be justified only after one has stated one's position on the various background issues mentioned in the last paragraph. I shall try to relate to these issues before I advance any thesis as to the sufficiency of certain levels for the depiction of the semantics of belief sentences. But at this preliminary stage I am more concerned with recording the pre-theoretically intuitive data, viz. the fact that for any level of meaning in the hierarchy, one can imagine failures of substitutivity for a linguistically competent attributee.

The strategy of this paper will be the following: I start by reviewing "Frege's problem", viz. how can sentences of the form "a = a" be uninformative, while their "a = b" counterparts are informative? Then I will discuss attempts to show that neither intensional identity nor character identity suffices for substitutivity. The latter case will bring two important puzzles to our attention: Mates' puzzle and Frege's version of the paradox of analysis.
I will suggest that the failure of current semantic theories to treat these puzzles reflects their failure to depict correctly the notion of "linguistic meaning". This will suggest two major points: (i) any formal semantics for natural language should solve these two puzzles as a condition for its adequacy qua a theory of linguistic meaning; (ii) it can be seen why attitudinal semantics has been taken to cast doubt on the very feasibility of a formal semantics for natural language. That is, the semantics of the attitudes presents, as if in a microcosm, the basic conditions which natural language semantics should satisfy as a whole, and so the
feasibility of an attitudinal semantics hints at the feasibility of the more general enterprise. Having made these points, I will try to solve the puzzles while adhering to certain adequacy constraints which, I believe, any solution must satisfy. If the solution is sound, I hope that the apparatus it introduces will offer a hint on what may be a proper model theoretic treatment of "meaning" in general.
3. FREGE'S PROBLEM
Most of the issues in contemporary formal semantics can be traced back to Frege. Our problem is not different in that respect. What I shall call "Frege's problem" was the following riddle that he posed:
How could identity sentences of the form "a = a" be uninformative, while identity sentences of the form "a = b" are informative?

Some aspects of the way in which Frege originally posed the problem invite confusion. Chief among these is his use of the term "informative". This seems to me true both at the level of an informal explanation of informativeness and at the level where one tries to formalize the notion of "information". Starting at the informal level, it seems to me that there are at least two ways to understand informativeness:
1. The Modal Reading
This is not an epistemic notion of informativeness. Relative to a space of possible worlds (metaphysically possible worlds) we can define a measure of informativeness as a function of the cardinality of the subset of the space in which the relevant sentence is false. Hence a sentence is uninformative if it is true in all worlds and informative if there are worlds in which it fails.
2. The Epistemic Reading
In this case, we understand informativeness as an epistemic modality. A sentence is uninformative if it is a priori. This does not confer on it the
status of necessity: in a given context, the sentence may express contingent propositions, viz. propositions which are not true in all possible worlds. But the proposition which the sentence expresses at a context c will always be true in the world of c (the actual world of c). Such sentences can be known a priori: pure reflection guides the speaker to recognize their truth.

There is little doubt that Frege was interested in the epistemic reading of "informativeness". First, he did not really believe that alethic modalities are irreducible: he thought that they are epistemic modalities in disguise. Second, he did not think that the a priori and the necessary fail to be coextensive. As far as one can make out, there was nothing to necessity beyond aprioricity in Frege's writings. Third, Frege was interested in finding substituends which can be substituted in attitudinal contexts and as such was interested in truths which can be known on pure reflection. Thus to the modern semanticist trained in possible worlds semantics with senses reconstructed (à la Carnap) as intensions, "the morning star" case may be a bit baffling. The explanation of the difference in the cognitive significance of "the morning star = the evening star" and "the evening star = the evening star" turns on a difference in intensions, viz. the existence of a world where the extensions of "the morning star" and "the evening star" differ. But this seems to me to miss the point, especially for a true Fregean. For him, two terms "a", "b" which share intensions may make the point as well, if their intension-sharing is not recognizable a priori. The anti-Fregean (the direct reference theorist) should not be happy with the example either. For him, epistemic informativeness does not depend on knowledge of metaphysical data. To know the latter would accord the competent speaker an access to essences.
But it is exactly this access that Kripke, Putnam, Kaplan and others deny to the ordinary speaker. Hence the existence of a world where the identity fails isn't an explanation of the informativeness. Indeed one may come up with cases where there is no such world and nevertheless there seems to be a difference in cognitive significance. The case may involve rigid designators like proper names, natural kind terms, and, in the simplest case, indexicals. Thus John may believe that Tully is bald but fail to believe that Cicero is bald. John may believe that Joseph is Joseph but may not believe that I am Joseph, etc.
The explanation of informativeness in formal theories (like the mathematical theory of communication) is also open to the ambiguity noted above. Information is in this case reduced to probability and the ambiguity shows up as to how to interpret the probability measure. For our purposes the "syntactic" theory of Shannon and Weaver is irrelevant, but the semantic theory of Carnap and Bar-Hillel is not. Very roughly put, we operate with the conception that:
informativeness(P) = 1 - probability(P)

Having said this, we face the need to interpret "probability measures". We face a very similar bifurcation:

1*. The Modal Reading
Under the modal reading we define probability by reference to a space of possible worlds but assignments of values reflect the possibility that an event could occur in the metaphysical sense of "could". That is, we reflect dispositions of the "world itself" to behave in such and such a way.

2*. The Epistemic Reading
In this case, assignments reflect what is conceivable or what an ideal conceptual system would bet under given circumstances. The system is surely not irrational if it does not assign probability 1 to identities like Cicero = Tully, though it is irrational if it fails to do so for Cicero = Cicero. In other words, probability judgements go with what is knowable on pure reflection and not with what is metaphysically necessary.

Thus on the formal perspective we get the same picture: informativeness does not reflect a probability measure which depicts metaphysical data, if "informativeness" aims to explain cognitive significance.

4. KAPLAN'S PROBLEM
D. Kaplan has advanced his three level theory precisely because of such problems. The idea was that characters would explain cognitive significance. Thus even though I may believe I am not a fool, reading some earlier philosophical material I find on my shelf I am amazed at
the author's stupidity and I believe he is a fool, even though... I am him! The explanation would be that the characters which determined in this context a necessary proposition, determine in other contexts metaphysically impossible propositions. Let me make four remarks on Kaplan's approach:

(i) It is clear that, as it stands, the solution is not interesting enough. The existence of other arbitrary contexts where a false proposition is determined is not explanatory. To capture the epistemic situation we must require that such a falsifying context is somehow relevant to the original context. That is, we must require that the agent of the falsifying context is in the same evidential situation as the one in the original context, otherwise it would not be relevant for him that "I am he" can determine a false proposition. In other words, the falsifying context must be a member of an equivalence class of contexts in which the agent is, in some loose sense, a doppelganger of the original agent.

(ii) Kaplan's original motivation was to solve the puzzle of informativeness of identities involving demonstratives. Thus, I point twice to Venus (say, from different angles) and say "This is F" and "That is not F". Kaplan's original move was to treat demonstratives in a different way than his treatment of pure indexicals. To treat them as pure indexicals (associated with a simple character rule in the metalanguage like "the speaker of the context", etc.) wouldn't have been helpful to the explanation of the identity puzzle. Therefore he thought to build into the very terms of the object language (viz. the terms we evaluate) a codification of the epistemological situation. Hence, each demonstrative is followed and completed by a demonstration term which encodes the epistemic circumstance. For instance, "This (the planet seen in the morning from here)" vs. "That (the planet seen in the evening from here)".
More generally, Kaplan introduced his famous functor "Dthat" which is a demonstrative whose reference is fixed by the demonstration term which follows it. Then sentences like "Dthat(a) = Dthat(b)" will be necessarily true or false. But their cognitive significance will be explained by their characters, viz. the difference in character that the two demonstration terms ("a", "b") will generate. But the question is: will they generate such a difference? To my mind, Kaplan applied his three level apparatus to what may be called "the easy prey", viz. cases where "a", "b" differ already at the level of intension. (In the above example: there are worlds where the
planet seen in the morning from here isn't the planet seen in the evening from here.) If the two terms differ already in intensions then, of course, they will differ in character. The true test for Kaplan's case is when "a" and "b" share intensions, but, nevertheless, "Dthat(a) = Dthat(b)" is informative. One should concentrate on terms which share intensions: "The F as seen from here now" and "The G as seen from here now", where "F" and "G" share extensions in all worlds in virtue of some metaphysical truth (and not in virtue of language). Sadly, in that case, the two characters are not going to be different. The indexicals ("here" and "now") do not help: even though each of them varies in value across contexts, they covary in values across contexts. Thus in all contexts the two terms will come up with the same value and no explanation for the cognitive significance of the identity will be given on the character level.²

(iii) Proper names are good candidates for the role of completing "a", "b" (after the Dthat-occurrence) which share intensions. For instance, "Dthat(Tully)", "Dthat(Cicero)" share intensions but differ in cognitive significance. The problem can be seen immediately: in Kaplan's system proper names have a stable character (they are represented by a constant function on contexts). Thus if names share extension, they share intensions (Kripke's rigidity) and if they share intensions, they share characters. But John may believe that Tully is bald and Cicero isn't. One escape route is to claim that names are indexicals. This may help to make their characters play the role of cognitive significance. Some features of names are suggestive in this direction and Kaplan's three level system can accommodate them as indexicals (for details see my 1981).
Stalnaker (1981) has recently suggested using this feature to explain the informativeness of "Cicero = Tully": even though, in our context, the sentence determines a necessary proposition, there are other contexts in which it determines an impossible proposition. I don't think that this explanation is powerful as it stands because of reasons pointed out in (i). But even if the suggested improvements were to be added to Stalnaker's solution, I don't think that they will save the cognitive role of characters in the case of names. The reason is this: against my earlier judgement, I am now convinced by Kaplan that proper names shouldn't be treated as indexicals. (Kaplan has claimed this all along and accepted that characters do not explain the cognitive significance of names.) It would be a long story to enumerate here the
reasons for not letting names be treated as indexicals. Basically, the reasons boil down to two main points: (a) The character rules of indexicals are given in the metalanguage of English where the metalinguistic side ("the agent of the context", "the time of the context") is used, not mentioned. But character rules for names would have to mention (in the metalanguage) the name of which they are character rules. (This may involve charges of circularity and infinite regress.) To disambiguate between various homonymous names we would have to let the character rule involve reference to a token-use of the name, and so rules of the language would involve reference to particular uses. How could language be mastered if this were to be the case? (b) I believe that the difference between "Aristotle" (the philosopher) and "Aristotle" (the ship magnate) is not analogous to the difference between "I", said by myself, and "I", said by you. Rather it resembles the difference between "bank" (side of a river) and "bank" (financial institution). The dimension over which these differ is not the context, in the sense of indexicals, but rather a totally different parameter. Hence, "Aristotle" is not really an indexical which varies across contexts, but rather its variation is of a different kind. Thus names do not have a non-stable character: consequently their character cannot explain their cognitive significance (a result Kaplan himself admits).

(iv) My final point against characters as cognitive value identifiers is this. One can imagine languages in which indexicals have synonyms. For all we know, maybe even English is such a language, as the pair "you" and "thou" suggests. One can imagine that I believe that you are a fool but that I fail to believe that thou art a fool. Similarly, "the actual and present speaker" and "I" share characters. But you may believe that I am a fool without thereby believing that the actual and present speaker is a fool.
Of course, this last category brings us very close to Mates' cases. This can be seen by considering the supporters of the "reality" of Mates' case and deniers of its relevance. The supporters agree that the two substituends above share linguistic meaning and nevertheless the inference seems to fail (viz. character identity is not enough). The deniers will wish to deny the data just as they do in Mates' case: if I believe that you are hungry but disbelieve that thou art hungry, then I am probably not a linguistically competent speaker.
5. MATES' PUZZLE AND THE PARADOX OF ANALYSIS
What about the original cases of Mates in Kaplan's three level system? These cases did not involve any indexical terms. The common nouns used in these examples have a stable character and so, if they share intensions (which they do) they share characters. Thus Mates' puzzle (henceforth: MPZ) is another case where characters fail to explain cognitive significance: "vixen" and "female fox" share their character, but John may fail to believe that all vixens are female foxes even though he surely believes that all vixens are vixens. Some authors suggest that there is a special significance to the fact that MPZ was originally cast with iterated belief contexts (see Kripke 1978 fn. 45). I believe that we should distinguish two issues here. First, there is the question whether iterated contexts introduce special logical problems to a given formalism (such as Frege's notion of sense, or possible worlds semantics). As far as this is concerned, it seems to me that there are such special problems, as the literature on the subject testifies.³ The second issue is whether we need more than first order reports to make the point about the failure of interchanging linguistic synonyms. On this, I think the answer is different: to study MPZ and the issues it raises, it suffices to relate to first order belief reports like the obvious truth of "John believes that vixens are vixens" and the potential falsehood of "John believes that all vixens are female foxes". These short remarks may dispense with some potential confusion concerning the relation between the object language and metalanguage of belief reports. But in fact I believe that a much more detailed comment on these relations between object-and-metalanguage is in order.
Intuitively, one would have liked to posit the following relation between the verb "believes" and the predicate "informative" as a first step in the explanation of MPZ: if "φ" is a sentence which is uninformative, then for any linguistically competent speaker j, "j believes that φ" is true. On the other hand, if "ψ" is informative, there might be a competent speaker j such that "j believes that ψ" is false. Such a relation would have explained the fact that if (2) is true, so is (3), but the truth of (4) allows us to envisage the falsehood of (5):
(2) "Every vixen is a vixen" is uninformative;
(3) j believes that every vixen is a vixen;
(4) "Every vixen is a female fox" is informative;
(5) j believes that every vixen is a female fox.
This postulated relation seems to fly in the face of a Fregean principle which was held dear by most modern semanticists, a principle I will dub "(F)":
(F) If two expressions of the language share linguistic meaning the competent speaker knows (believes) it.
In particular, if "A" and "B" share linguistic meaning, then if John believes that all A's are A's, he believes that all A's are B's. The real strain between (F) and (2)-(5) emerges in the form of an alleged puzzle:
(6) The meaning of "vixen" is the meaning of "female fox".
(7) "Every vixen is a vixen" is uninformative.
(8) "Every vixen is a female fox" is uninformative.
But (8) seems false. At this point, some readers will be unimpressed by this result since they regard the frame "___ is informative" as quotational. Hence (6)-(8) is a counterpart of:
(9) "Tully" is a five-letter word.
(10) Tully is Cicero.
(11) "Cicero" is a five-letter word.
No one is surprised to find that (11) is false, and so no one should be worried to find that (8) is false. This dismissal of a form of the paradox of analysis should not surprise the reader who knows Church's approach to MPZ. Church dismisses the whole puzzle by noting that John's failure to believe that all vixens are female foxes is really a metalinguistic failure involving the words "vixen" and "female fox". Thus, the relation between MPZ and the paradox of analysis (which, I shall claim below, is very intimate) would be vindicated in a negative way. Both would be dismissed as metalinguistic deliberations. However, just as I think we should resist trivializing MPZ into a metalinguistic problem, I will resist the temptation to discard the paradox of analysis as being a metalinguistic problem. I hope that the dividends of this approach will be clear enough once the present analysis is advanced.
Thus, in my analysis, informativeness will not be codified by a metalinguistic predicate, but rather by an indirect discourse propositional operator. Read in this way, Frege's problem of analysis regains the status of a true puzzle:
(12) It is uninformative that every vixen is a vixen.
(13) Every vixen is a female fox.
(14) It is uninformative that every vixen is a female fox.
where (12)-(13) are true, but (14), the conclusion, is false.⁴ The appearance of "analysis" in the name of this puzzle testifies to its non-Fregean origin. Analytic philosophers, like G. E. Moore, were interested in the following problem: how could conceptual analysis be informative? They imposed the following two intuitive adequacy constraints: (i) the analysans and analysandum should be synonymous, in some sense; (ii) the analysis should be informative. But the lurking of (F) in the background made the outcome puzzling: How could analysis involve synonyms and yet be informative? Even though Moore's problem is interconnected with Frege's problem in (12)-(14), I will not discuss it here for two reasons. First, it is not obvious to what extent Moore's problem does not involve reference to the very words used in the analysis, viz. whether one condition for analysis, in his sense, is that different verbal means will be used in the analysans. This leads one to think that Moore's problem may involve, at least in part, quotation. (This point is due to Ackermann, 1981.) Second, it seems to me that any solution of Moore's problem will have to relate to the problem of "what are the desiderata of philosophical analyses?", a question which is much too complicated to be taken up here in passing. Hence I will deal only with Frege's version of the paradox of analysis (henceforth FPA).

Let us rephrase our original intuitions on the relation between FPA and MPZ. First of all, let us be clear on what we mean by MPZ and FPA. The MPZ cases will take the following form:
(MPZ) j believes that all A's are A's but may fail to believe that all A's are B's, even though "All A's are B's" is a linguistic truth, i.e. a truth in virtue of language.
The FPA cases will take the form of:
(FPA) It is uninformative that all A's are A's but it is informative that all A's are B's, where "All A's are B's" is a linguistic truth.
This correlation will have practical effects on the rest of the paper. That is, an adequacy test on any solution to MPZ will be that it must indicate how FPA is solved, and indeed we shall see that the proposed solution for MPZ, if sound at all, immediately solves FPA. It is interesting to note that there is an historical edge to the relation between FPA and MPZ. Indeed Fregeans, like Church, have regarded both problems as being solvable by the very same mechanism: a hierarchy of senses over and above first level senses. What is perhaps really interesting is not the actual solution which Church advanced in the two cases. Rather, it is the method which was followed in both cases. The method is crucial to the particular solution because it encodes the very form of argument which led Frege to postulate (first order) senses in the first place.⁵ This method may be called "Frege's method", and it codifies the basic argument which Frege followed in postulating first order senses, but, as we shall see, it gives one a "master key" which can be applied in full generality for every candidate for the representation of "meaning". The argument goes like this:
(FM)
(FM1) Assume that some notion X is the meaning of an expression.
(FM2) The meaning of a complex phrase is a function of the meanings of its constituent phrases.
(FM3) If two expressions have the same linguistic meaning, the competent speaker knows it.
(FM4) There are two sentences S, S' differing to the extent that S has "A" everywhere that S' has "B", and S, S' have the same X.
(FM5) Some competent speaker may believe S and not believe S'.
(FM6) Hence, X is not the meaning of expressions.
To get Frege's original case substitute "extension" for "X", and to get high level senses, substitute "first level sense" for "X".
As I said, some Fregeans used (FM) to argue for a hierarchy of senses as a solution for MPZ and FPA. I will not try to pursue this route here for three reasons:
(i) Other Fregeans deny the need to postulate high order senses (Dummett 1973).
(ii) Inasmuch as we are interested in natural language, a hierarchy of senses is said to be inadequate because it seems that it would make the language unlearnable.
(iii) The notion of first order sense is not clear at all. Why build high level floors in a skyscraper before securing solid underground foundations?
I do not wish to suggest that Church's hierarchy cannot be used to solve MPZ and FPA. From the present perspective, however, it seems that to concentrate on "senses" is not helpful at all. Recent arguments by Kaplan and Perry make it clear that senses bifurcate at least into (a) intensions and, (b) characters. Hence it seems much more proper to study (FM) directly in the context of intensions and characters and see whether these notions offer a way to solve the puzzles. As we shall see the study of the three level hierarchy will direct us to a solution, in the model theoretic framework, which uses neither an infinite set of semantic hierarchies nor the troubling features of Church's systems referred to in (i). Hence, if sound, our solution can avoid the objections listed in (i)-(iii).

6. THE FOURTH LEVEL OF MEANING
Returning to Kaplan's theory, we can see that (FM) can be used to show that intensions cannot be the linguistic meanings of expressions. All we have to do is substitute "intension" for "X" in (FM) and the result follows. Thus I can believe that I am here without believing that I am in California, even though that is where I am. Further cases involving names and kind terms are not hard to come by.⁶ Kaplan is therefore led to postulate characters on top of intensions to explain "cognitive significance". But, now, if I am right, MPZ and FPA lead one to doubt whether three levels (in other words, characters) are enough for the representation of cognitive significance. If the intuitions behind MPZ and FPA are sound (if there are such competent speakers for whom substitutivity fails), then when we put "characters" for "X" in (FM) we can, very clearly, start to doubt the sufficiency of characters. This may look to some a very big "if". Kripke, for instance, has stressed that MPZ does not get off the ground without linguistically incompetent speakers in the role of the attributee. Let me emphasize that I do not wish to deny that there may be reasons to
legislate and suggest that "competent speaker" means, ipso facto, "one who does not fail in Mates' cases." This was Carnap's own reaction to Mates' case against his theory. He suggested that what we deal with is a rational reconstruction in which "believes" is like a theoretical term. The fact that one can legislate does not entail in any way that language actually works this way. It seems that most semanticists are interested in depicting the senses of the verb "believes" which occur in actual natural language use. Of course, this may still disregard actual performance, but there is a long way from actual performance to ruling out MPZ in the name of "competence". In fact, if MPZ is ignored why not go the whole way Hintikka originally took and concentrate on ideal knowers who are logically perfect? The legislative and descriptive projects are not incompatible. They can be pursued in parallel. But it is one thing to discern two projects and another thing to exorcise cases like Mates', as Kripke suggests (1978, fns. 15, 23, 28, 46). The latter approach seems to me radically off the point if our analysis is to codify the semantics of "believes" as used in natural language. Such escape routes remind one of another distinction worth making. When we talk about failures of substitutivity, we should be clear about which senses of the verb "believes" we are talking about. Indeed, failures of substitutivity are relative to the sense in question. There is a sense of belief reports where we are interested just in the preservation of the truth value of the belief. For this sense coextensionality suffices. This may not be a very frequent use, but still one can encounter some cases where all that matters is truth-value preservation. Secondly, one can be interested in the preservation of the content of the belief, regardless of the attributee's state. In such a case, cointensionality (sameness of proposition) is all that is needed.
However, we are interested in the true attributions of belief states, and so we are interested in the conditions in which we can attribute to an agent a belief state that $\varphi$ given that he is in a belief state that $\psi$. Thus, inasmuch as we investigate here conditions for substitutivity, we are interested in substitutivity for belief states. When we thus use (FM) to investigate whether a level on top of character should be postulated, we motivate the discussion by reference to belief states. Now, MPZ and FPA seem indeed to point in such a direction. Before we discuss whether this move is justified, let me look at the question of postulating a higher level of meaning from a more general perspective.
It would seem that there is pressure from other quarters to add a level of meaning on top of characters. There is a whole set of locutions which seem to suggest that we can talk about characters in the very object language (English). Such locutions involve reference to meanings as in:
(15) If "table" meant what "car" standardly means, "A table has four wheels" would be true.
(16) "Table" could mean what "car" means.
(17) "Oculist" means what "eye doctor" means.
Of course, such discourse is usually said to occur in the metalanguage and the fact that the metalanguage of English uses English should not confuse us. But before we relegate such locutions into the wilderness of the metalanguage, it seems to me that we should try to depict one uncontroversial fact: we talk in English about meanings, just as we talk about mountains and electrons. The cognoscenti among us may say that there is no use in depicting this fact because it will lead to paradoxes. But we should not be so quick as to push the whole issue under the rug of metalanguage. Prima facie, it is not obvious how one would use the usual diagonalization procedure to derive paradoxical sentences about meaning, and even if this were shown to be feasible, it would still be far from obvious that the very idea of discourse about meaning should be abandoned. Paradoxes in set theory have not made us "afraid" of talking about sets. They made us realize that precautions should be taken before we engage in such an enterprise. This is not the place to assess the options which are open to one who wishes to avoid the possibility of generating self reference. First a formal structure as to what meanings are should be erected. Then, one can try to see whether paradoxes can be generated and only then, having seen the paradoxes, one can envisage how to block them. With proper type theoretical restrictions on characters (the objects of discourse about meaning) or a cautious build up of the set of all characters (if there is such a set) such "shadows" can be blocked. But, at present, all this seems like putting the cart before the horse. All we do have is the set of locutions which demand the ability to refer to characters as values of certain expressions. In other words, we look for a mechanism which satisfies the following desiderata: (i) It contains a set of new determinants, on top of worlds and contexts, which are such that at a given value of this parameter an
expression is assigned a character. We need a set of such determinants because we want to enable expressions to have different characters than those they in fact have. (ii) It contains a determinant at which expressions get their standard characters. (iii) It codifies the assignment of characters at these determinants by functions from the determinants into the characters. Let me tentatively call the set of these determinants the set of possible dictionaries of a language, and let us call the functions from dictionaries to characters interpretations. Also, let us call the dictionary at which expressions are assigned their standard characters, the standard dictionary. Now, two crucial questions face us immediately: (i) How do we discern the set of admissible dictionaries for a given language, as opposed to dictionaries which we bar from (even possibly) interpreting the language? (ii) Having specified the set of admissible dictionaries, how do we discern the standard dictionary of the language, within that set? I believe that these two questions force us to consider general issues in the semantics of natural language whose role is crucial for the semantics of belief sentences. In particular, I will try to argue below that the natural way to proceed with respect to (i)-(ii) is to look for "axioms" about dictionaries which will delimit the range of admissibility and, hopefully, single out the standard dictionary. The latter task is much more complex and would surely involve one in constraints beyond the axioms designed just to delimit the set of admissible dictionaries. The main point that we seem to face when we come to deliberate about these constraints is to single out our intuitive perception of the components that the notion of meaning has. It is my belief that two such components stand above anything else, and their proper depiction by the theory is a crucial adequacy constraint for the theory. 
Interestingly enough, we shall see the following emerge: the two components which come up when we analyze, in full generality, the notion of meaning in natural language are precisely the two components which will come up later, in our solutions to MPZ and FPA, in the restricted sphere of the semantics of belief. Indeed we shall see that this is far from being accidental. The fact that the semantics of belief should satisfy two adequacy constraints which are of the same nature as those constraints the semantics of
natural language, as a whole, should satisfy, may be explanatory to the crucial role of the semantics of belief for the general enterprise. I believe that the key to these questions lies in the realization that a character, or that level which codifies "what the speaker knows when he knows the meaning of a word", has two aspects: a public aspect and a private aspect. Let me try and give a rough sketch for the motivation for examining such a view. The reader should bear in mind that this is just an argument for the potential interest that such a view may have and not an argument in order to establish this view. My intention is to justify putting forward this suggestion, and then, in the rest of the paper, try to defend it and see what it amounts to. Put crudely, model theoretic analyses of meaning, or indeed any other formal codification of "meaning", seems to be torn between two opposite requirements: (i) Condition 1 - emerging from the semantics of ordinary words, viz. non-attitudinal contexts, where we ought to guarantee that various users of the language, at different times and places, use the words of the language with enough common meaning to allow for communication. Hence, it seems that any formal analysis of "meaning" should endow the notion of meaning with a component which guarantees that words have the "same meaning" across users in different times and places. (ii) Condition 2 - emerging from the semantics of attitudinal verbs: it seems that reports of the beliefs, sayings, etc. of others raise the possibility that various users (or the same user on different occasions) use the same word of the language with "slightly" different meanings. Such differences do not lead us to classify them as "linguistically incompetent". Rather than regarding such cases as deviant exceptions, we recognise that they are part and parcel of the notion of "meaning". Let us accept both conditions as conditions of adequacy on the representation of "meaning". 
I do not claim to have argued for either of them, but rather I suggest that we accept them as tentative, pre-theoretical intuitions that we seem to have on "meaning" and whose investigation should figure in any semantic analysis of natural language. The level at which these conditions should be codified is character, at least if one works within the model theoretic tradition discussed above. This suggests the following way to proceed: each character (for a given well formed expression) is an ordered pair, consisting of (i) a core meaning and (ii) the completion of the core.
(i) is the candidate for the satisfaction of condition 1. All uses of an expression share their core meaning. This would be an obligatory constraint on being a competent user of the language. The core is what is speaker-invariant and it is this component which encodes the "public" aspect of the meaning of an expression. The completion of the core should satisfy condition 2. The completion is that private connotation, or association, which the speaker may associate with an expression of the language on a given occasion (so that completions differ not only across users but also across different uses of the same speaker). If successful, completions may account for puzzles of attitudinal contexts where both the reporter and the attributee use words with the same core meaning, but differ as to completions. This, of course, requires that attitudinal verbs attach to the full character (core and completion) of an embedded sentence. At the moment, however, my point is not to offer hints of potential dividends. I wish to offer a very crude picture of a proposal which will serve as no more than a "bold working hypothesis". Therefore, we should not try to rush such crude pictures to help us solve very delicate puzzles, before working out the details of the suggested proposal. Thus, the working hypothesis advanced here aims to let characters consist of two components, the core meaning and its private completion. Now, there are various pre-theoretical intuitions built vaguely into these two components, which require a much clearer statement. In particular, how should we codify the core-meaning invariance of expressions across language uses? How precisely does this crude picture relate to the notion of a dictionary?
If characters are assigned in dictionaries, what is the difference between two dictionaries r, r' which assign the same core meaning to an expression "E" (but may differ as to the completions), and a pair of dictionaries r, r'', which differ even as to core-meaning assignment? Moreover, if we are in search of the standard dictionary, what do we really mean by this? Do we mean that there is a single standard full meaning to be assigned to an expression and we are in search of the dictionary at which this full meaning is assigned? But how could this be right, if "full meaning" includes a private component? So perhaps "standard" refers only to core-meaning assignments, and what we are after is a set of dictionaries at which the expression gets a "standard" core meaning, but such that different members of the set differ as to completions they assign to the expression?⁶ᵃ
All this calls for precise analysis. In the following, I will try to embark on certain aspects of such a project. So far we have considered structures with the form $(U, \mathcal{C}, \mathcal{W}, [\ ])$, where $U$ is the domain of the structure, $\mathcal{C}$ is a contextual structure, $\mathcal{W}$ is an intensional structure and $[\ ]$ is the semantic value function defined relative to contexts $c$ and possible worlds $w$, $[\ ]_{c,w}$. We now add to these structures dictionary structures (DS). A dictionary structure is a triple $(R, N, r_0)$ where:
(i) $R$ is a set (possible dictionaries);
(ii) $N : R \rightarrow 2^R$ (the neighborhood function);
(iii) $r_0 \in R$ (the ideal atomic standard dictionary).⁷
For each $r \in R$, $N_r$ is the neighborhood of feasible variants of $r$.⁸ We require that expressions of the language be evaluated at triples $r, c, w$. Take an expression "E". We call $[E]_{r,c,w}$ the extension of "E" at $r, c, w$. We call $[E]_{r,c}$ the intension of "E" at $r, c$, and we call $[E]_r$ the character of "E" at $r$. Finally we call $[E]$, the function from $R$ into characters, the interpretation of "E". Regarding the character, each character is an ordered pair $(S, CO)$ where for an expression "E", $S(E)$ is the core meaning of "E" and $CO(E)$ is the completion of $S(E)$ (of course, when the dictionary of evaluation isn't definite, we index both $S(E)$ and $CO(E)$ by $r$, "the core meaning of "E" at $r$" and "the completion of "E" at $r$", respectively). We can also use the following notation: Since $[E]_r$ is the character of "E" at $r$, we can refer to core meanings and their completions by reference to their "coordinates" in the ordered pair $(S(E)_r, CO(E)_r) = [E]_r$, viz. by letting $[E]^1_r$ stand for $S(E)_r$ and $[E]^2_r$ for $CO(E)_r$. We define $N_r$ (for every $r \in R$) for every expression "E" in the following way:

$$\forall r'\,\bigl(r' \in N_r \rightarrow S(E)_r = S(E)_{r'}\bigr)$$
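The definitions above can be sketched in code, purely as an illustration. The following Python fragment is a minimal sketch, not part of the proposal itself: the class names `Character` and `DictionaryStructure`, the method `respects_core_invariance`, and the use of plain strings for cores and completions are all assumptions made for the example, since the text deliberately leaves open what cores and completions are. It checks only the one direction stated in the condition: every variant in $N_r$ assigns the same core as $r$.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet


@dataclass(frozen=True)
class Character:
    """A character as an ordered pair (core, completion). Strings are placeholders."""
    core: str        # S(E): the public, speaker-invariant component
    completion: str  # CO(E): the private component


@dataclass
class DictionaryStructure:
    """Hypothetical encoding of a dictionary structure (R, N, r0)."""
    dictionaries: FrozenSet[str]                      # R
    neighborhood: Dict[str, FrozenSet[str]]           # N : R -> 2^R
    standard: str                                     # r0
    interpretation: Dict[str, Dict[str, Character]]   # expression -> (r -> [E]_r)

    def character(self, expr: str, r: str) -> Character:
        return self.interpretation[expr][r]

    def respects_core_invariance(self) -> bool:
        """True iff for all E, r and r' in N_r: S(E)_r == S(E)_r'."""
        for by_dict in self.interpretation.values():
            for r in self.dictionaries:
                for r_prime in self.neighborhood[r]:
                    if by_dict[r_prime].core != by_dict[r].core:
                        return False
        return True


# Toy data (invented for illustration): r1 is a variant of r0, r2 is not.
ds = DictionaryStructure(
    dictionaries=frozenset({"r0", "r1", "r2"}),
    neighborhood={
        "r0": frozenset({"r0", "r1"}),
        "r1": frozenset({"r0", "r1"}),
        "r2": frozenset({"r2"}),
    },
    standard="r0",
    interpretation={
        "vixen": {
            "r0": Character("FEMALE-FOX", "sly"),
            "r1": Character("FEMALE-FOX", "red-furred"),
            "r2": Character("PLUMBER", "fixes pipes"),
        },
    },
)
```

In this toy structure the characters of "vixen" at `r0` and `r1` share a core and differ only in completion, so the constraint holds; placing `r2` inside the neighborhood of `r0` would violate it.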
(Thus $\forall r' \in N_r$, $[E]^1_r = [E]^1_{r'}$, and consequently, for $\forall r' \in N_r$, $[E]_r$ and $[E]_{r'}$ can differ at most as to $[E]^2_r$, $[E]^2_{r'}$.) The question of the nature of the constraints can be posed now in a more coherent form. We ask: what are the conditions which a structure $(R, N, r_0)$ must satisfy in order to count as an admissible structure of the language? Whatever one's answer to this query may be, it seems to me that we have already made a step in the right direction (compared to two level
theories). Now the question is well posed: we ask what should constrain dictionary structures, and in so doing, we do not confuse constraints on possible worlds with constraints on dictionaries. The task of formulating these two types of constraints seems to me to bifurcate into very different enterprises. When we constrain possible worlds, we delimit what can count as a possible state of affairs, what can be a fact. In other words, we investigate the ways the world itself could turn out to be. This is a metaphysical inquiry (in Kripke's sense): we inquire into the metaphysical structure of the world and its potentialities. A very different enterprise is to delimit the range of "possible meanings". Thus one point is worth making here: no matter what one's convictions regarding the proper constraints on dictionaries are, a definite step forward was achieved here by the very fact that the constraints are going to be on dictionaries and not on possible worlds, which do a double job they should not do. That is, the constraints will be on the proper determinant. It is in this sense that "meaning postulates" have been misused, as far as I can see, ever since their introduction by Carnap and Kemeny. They did two jobs at a time: they constrained what can be a possible state of affairs and they constrained meaning relations. Possible worlds played a double role: as determinants of facts and as determinants of meanings. With a single indexing of the semantic value function one cannot even get such a system to do the proper formal work needed when one wishes to fix the meaning and vary the facts vs. when one wishes to fix the facts and vary the meaning. To do this, one needs at least double indexing by worlds, and if the language has indexicals, even triple indexing is called for. But even then, the error would be at the conceptual level: just as Kaplan has shown that contexts should not be represented by worlds, we should not let dictionaries be represented by worlds, either.⁹
Several questions arise: Could any dictionary count as a possible dictionary? What about logical constants? Can we admit as possible (for natural language) a dictionary in which there are more than finitely many primitive basic meanings? Should we single out a subclass of the set of all possible dictionaries (say, the admissible dictionaries) in which analytic truths such as "a vixen is a female fox", "a doctor is a physician", etc. are always true? (Even though the cores assigned to them among these dictionaries might vary, they would vary simultaneously, or covary.) Should we allow a word of syntactic category k to
get in a possible dictionary, different from the standard one, a core meaning of a word of a different syntactic category k'? (In other words, should assignment of core preserve grammaticality?)¹⁰ The task of facing these questions will not be undertaken here. I shall confine myself to much more elementary features of the dictionary structure. Before they are sorted out there is no point in imposing extra constraints (axioms, meaning postulates or what not) on the set as a whole. Needless to say, the particular way in which these questions will be treated determines much of the content of the proposal as a whole, since particular solutions to the questions raised above may increase or decrease the plausibility of the whole structure. I do not wish to deny this, and I believe that such issues merit a very cautious treatment. Yet what I wish to stress is this: (i) until such answers are furnished, none of the present suggestions can, or should, be taken as more than a working hypothesis; (ii) in view of (i), it seems to me still true to say that there is an independent interest in looking at the incomplete picture outlined so far and in focussing on its details. It is worth noticing, however, that I do not postpone only the discussion on the constraints or postulates on admissibility of neighborhoods (the question, what is a "core meaning"?). Two other issues are not discussed here, either: First, it seems that recent interest, in cognitive science, as to what serves in the role of "that which is the minimal amount of shared meaning (across users) which guarantees communication" may offer us an interesting candidate as to what these core meanings are. What I have in mind, very vaguely put, is the idea that core meanings are stereotypical descriptions associated with words of the language. On such a proposal it would be clear how the core and the completion work in tandem. The core is that public element which secures communication.
It is that element whose mastery is a necessary condition for being classified as a competent speaker. It is the minimal amount of common meaning with which words are endowed across uses, and so make communication possible. The completion would be that element which is added to the core on an individual occasion of use as the private connotation, or association, which the user has on that occasion. Be that as it may, this cognitive relevance of stereotypes raises the question whether cores can be identified with stereotypes. As I said above, I shall not face this issue here. (There are very problematic aspects in such a proposal. In particular, stereotypical information,
unlike linguistic meaning, underdetermines reference.) A second issue concerns the completions. I have noted that core meanings are to be restricted by some constraints which classify neighborhoods of dictionaries as inadmissible, if any constraint is violated. But what about the completion? As things stand, one might think that any completion is acceptable. But this surely can't be right if we are to block absurdities. Private as they are, completions still obey some constraints. In fact, it would appear that for a given core (i.e., a given neighborhood) all the variant completions shouldn't differ to more than a certain degree. Such a condition would bring out the truly topological nature of the neighborhoods: so far they were just equivalence classes and so no use was made of the neighborhood idea. But the fact that all the completions must be similar to some degree eliminates the equivalence relation in favor of this more elastic constraint. Again, I fear that I cannot discuss here the precise way in which completions are to be restricted. My only excuse is that such a discussion would have to take us into the very intricate issue of how to "measure" variability between private entities which is beyond the scope of this paper.
7. SOLUTIONS FOR MPZ AND FPA
It seems that a key observation on the way to a solution for MPZ and FPA is the following point: we should differentiate between the types of operators in the scope of which the substitution is supposed to take place. In particular, one major differentiation is absolutely crucial: the distinction between metaphysical operators (M-operators) and epistemological operators (E-operators). This formal distinction is, of course, a reflection of a deep philosophical distinction drawn by Kripke in his attempt to differentiate metaphysical and epistemological modalities. Here I will not try to justify the distinction, but, rather, I will assume it. Hence, inasmuch as an argument against Kripke's insights can be mounted, much of what follows may seem much less attractive.¹¹ It is interesting to note that the distinction between M- and E-operators is not peculiar to issues at the fourth level of meaning. Indeed, it is a distinction which cuts across all levels. Let me give two examples: first, at the second level (intension) one can see that if $\varphi \leftrightarrow \psi$ is true (where "$\leftrightarrow$" stands for logical equivalence) and $\Box\varphi$, substitution into $\Box\varphi$ is valid. Yet such a substitution is invalid for belief contexts. A second case comes from one level higher. In an insightful paper, dedicated to the extension of his indexical semantics to epistemological contexts, Kaplan (unpublished) has argued for very different strategies for the treatment of metaphysical operators with indexicals in their scope as compared to epistemological operators with indexicals in their scope. Thus to explain the validity of:
(18) Necessarily I am Joseph.
(19) I am he.
(20) Necessarily he is Joseph.
all one needs is the semantics of "I" and "he" together with what Kaplan calls "haecceitism", viz. the ability to trace individuals across worlds. Hence, Kaplan says, haecceitism suffices for indexical modal (alethic) logic. But this explanation will not do for the following case:
(21) John believes that I am Joseph.
(22) I am he.
(23) John believes that he is Joseph.
The reason is that the semantics of attitudinal verbs cannot be given in terms of possible worlds, and, consequently, haecceitism is of no help here.¹² My thesis in the present paper is that the fourth level is not different on this issue. Thus, very different strategies are called for when we analyze (24) and (25):
(24) "Vixen" could have failed to mean what "female fox" standardly means.
(25) John does not believe that all vixens are female foxes.
The metaphysical possibility referred to in (24) can quite easily be analyzed in terms of quantifiers (in the metalanguage) over (neighborhoods of) dictionaries. For instance, (24) is true iff there is a neighborhood such that in any variant dictionary in it, the core meaning of "vixen" is not the core meaning of "female fox" in the standard neighborhood, $N_{r_0}$. This is surely possible: there is a dictionary (and a neighborhood thereof) where "vixen" means what "plumber" means in $r_0$ and where "female fox" still means what it means in $r_0$.
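The truth condition just given for (24) can be illustrated with a small computation. The sketch below is only a toy rendering under stated assumptions: the function name `could_fail_to_mean`, the dictionary labels `r0`, `r1`, `r2`, and the string-valued cores are invented for the example and are not part of the formal apparatus. It implements the existential quantifier over neighborhoods and the universal quantifier over the variants inside each neighborhood.

```python
from typing import Dict, FrozenSet

# cores[expression][dictionary] is the core meaning S(E)_r (placeholder strings).
Cores = Dict[str, Dict[str, str]]
Neighborhoods = Dict[str, FrozenSet[str]]


def could_fail_to_mean(expr_a: str, expr_b: str, cores: Cores,
                       neighborhood: Neighborhoods, standard: str) -> bool:
    """True iff some (non-empty) neighborhood is such that, in every variant
    dictionary in it, the core of expr_a differs from the core that expr_b
    has at the standard dictionary."""
    target = cores[expr_b][standard]
    return any(
        len(variants) > 0 and all(cores[expr_a][r] != target for r in variants)
        for variants in neighborhood.values()
    )


# Toy data (invented): in r2, "vixen" has the core that "plumber" has in r0,
# while "female fox" keeps its standard core everywhere.
cores: Cores = {
    "vixen": {"r0": "FEMALE-FOX", "r1": "FEMALE-FOX", "r2": "PLUMBER"},
    "female fox": {"r0": "FEMALE-FOX", "r1": "FEMALE-FOX", "r2": "FEMALE-FOX"},
}
neighborhood: Neighborhoods = {
    "r0": frozenset({"r0", "r1"}),
    "r1": frozenset({"r0", "r1"}),
    "r2": frozenset({"r2"}),
}

sentence_24 = could_fail_to_mean("vixen", "female fox", cores, neighborhood, "r0")
trivial_case = could_fail_to_mean("female fox", "female fox", cores, neighborhood, "r0")
```

On this data `sentence_24` comes out true, witnessed by the neighborhood of `r2`, whereas `trivial_case` comes out false because no neighborhood changes the core of "female fox".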
But such metaphysical possibilities are irrelevant to the analysis of E-operators. As we shall see immediately below, E-operators call only for those dictionaries which assign meanings so that the holder of the attitude, using those dictionaries to interpret the relevant expressions, employs the linguistically proper core meaning (and is therefore a competent speaker of the language), but still fails to recognize the truth of the substitution. Other neighborhoods, where expressions have different core meanings, have nothing to do with John's potential failure to believe that all vixens are female foxes. Put otherwise, while M-operators relate to the space of all dictionaries (neighborhoods), we are going to see that E-operators relate only to the very neighborhood of the dictionary where the evaluation starts. Let me now proceed to an exposition of the solution of MPZ and FPA on the present approach. First, it seems to me that any solution of MPZ and FPA must satisfy the following adequacy constraints:
(A) The embedded sentences (like "Every oculist is an eye-doctor", "A vixen is a female fox", etc.) should come out as truths in virtue of the rules of the language, i.e. linguistic truths.
(B) A competent speaker can fail to believe that all A's are B's, even though he believes that all A's are A's, and it is a linguistic truth that all A's are B's.
As the reader can easily see, (A), (B) are special cases, within the sphere of attitudinal semantics, of the general conditions I laid down above on any theory of meaning of natural language, viz. the simultaneous encoding of the public and private aspects of meaning. The fact that (A)-(B) should be satisfied by any attitudinal semantics may explain in part why the semantics of attitudinal verbs is seen by many semanticists as a case of "to-be-or-not-to-be" for formal semantics in general.
Indeed, the possibility of satisfying both (A) and (B) in framing a semantics for these verbs may hint at the very feasibility of a semantics of natural language as a whole, where both the public and the private aspects of meaning are encoded. In other words, attitudinal semantics is really, in some sense, a microcosm of semantics as a whole, and its success may give us an estimate of the feasibility of the whole enterprise. I believe that (A)-(B) should be jointly satisfied. This is to block two possible escape routes which explain MPZ and FPA by giving up one of (A)-(B). The first escape route I will call "The denial of the premise", the second "The denial of the conclusion". The first denies (A), the
second denies (B). Typically, those who deny the premise suggest that it is not obvious that the two substituends have the same linguistic meaning. Hence, they try to argue that the identity premise (or its variant in MPZ) is simply false. If I understand correctly T. Parsons' recent discussion of FPA, his view codifies the most sophisticated defence of this escape route (see Parsons 1981). The other escape route is to classify believers like John in: "John doesn't believe that all vixens are female foxes" as linguistically incompetent. This is Kripke's way out (see Kripke 1978, fns. 15, 23, 28, 46), which I criticized above. The first way out, Parsons', is much more attractive to me, basically, because if properly taken, it is not really an escape route. Rather, it amounts to a recognition of the fact that the idea of sameness of linguistic meaning is much more subtle than it is often assumed to be. Yet, I believe this escape should be resisted precisely because the unique nature of MPZ (FPA) is that they involve linguistic truths (i.e. truths in virtue of the rules of the language). This differentiates them from Kripke-type puzzles (where the believer fails to recognize metaphysical truths), or from purely logical cases, where a perfect logical acumen would suffice.
With (A)-(B) accepted as our "rules of the game", let us advance the details of the present solution. The first phase is to introduce a new type of operator in addition to the M- and E-operators: the class of indirect discourse locutions typified by "It is true in virtue of language ..." or "It is a linguistic truth that ...". According to the present proposal, such operators (call them "L-operators") codify the sense of "sameness of meaning" which is presupposed by our assumption (A). The particular feature of such an operator is that it attaches to the core meaning of the operand.
Hence, if we evaluate at the standard dictionary, the sentence "it is a linguistic truth that all vixens are female foxes" will be true, because "vixens" and "female foxes" share their core meaning. Formally, we can give the semantics of such an operator by using our neighborhoods. Suppose we codify the operator by N; then "N(A = B)" (where "A", "B" are common nouns) is true at r iff in all variants of r, the core meaning of "A" is the core meaning of "B". In other words, $[N(A = B)]_{r} = 1$ iff $\forall r' \in N_r\,([A]_{r'} = [B]_{r'})$. But now, "John believes that ..." is analyzed differently. It attaches to the full meaning of the operand (the whole character, both core and completion) in the dictionary-variant used by the attributee. In other words, "John believes that ..." is a dictionary shifter: it shifts the
evaluation to the variant used by John. Now, even though this variant, call it $r_J$, assigns the same core meaning to "vixen" and "female fox" (and hence John is competent and we satisfy condition (A)), the completions of "vixen" and "female fox" at $r_J$ may differ, and, consequently, the full meaning of "vixen" and "female fox" will differ at $r_J$. Thus John may fail to believe that all vixens are female foxes, because at his variant the two full meanings (to which belief attaches) are different. Hence, two expressions can be substituted for each other in the scope of "N" if they share core meaning across the neighborhood. But to guarantee substitution in the context of "John believes that ..." the two substituends should share completions at the attributee's (viz. John's) variant, viz. where "$\psi$" is just like "$\phi$", except for having "b" everywhere "$\phi$" has "a", "$\phi$" can be substituted by "$\psi$" in "John believes that ..." iff $[\phi]_{r_J} = [\psi]_{r_J}$ (full meaning identity at $r_J$).
We can look at this from another angle. I have just said above when two expressions are intersubstitutable in the context of "John believes ...". The other side of this coin is to ask, for two arbitrary expressions, what are the conditions under which they are interchangeable in any context. The answer is: this is guaranteed only if they share completions in all the variants of the neighborhood of evaluation, i.e. if for all $r' \in N_r$, $[\phi]_{r'} = [\psi]_{r'}$ (identity of the full, not just core meaning).
Finally, let us see how this applies to FPA. Not surprisingly, everything depends on how we look upon the type of the operator "it is uninformative that". If it is read epistemically and information is analyzed in "subjective" terms relating to the cognitive state of some information processing system, then the puzzle is explained away in the way MPZ was. This is as it should be. If informativeness is epistemic, what matters is the full meaning of the substituends for the cognitive system in question.
Since the full meaning is not the same, the inference fails. Of course, if there are two substituends which have the same full meaning all across the neighborhood, then indeed even though "it is uninformative" is read epistemically, the inference is valid for any attributee who is linguistically competent. The other possibility is to read the operator "objectively" as being an agent-independent information measure relating to what symbols carry when transmitted, regardless of our cognitive situation. The operator operates then just on the public part, relating just to the core meaning.
In that case, the inference follows. As far as public meaning goes, "vixen" and "female fox" are on a par. The inference in the scope of "it is uninformative" is valid ... as it should be!
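The three substitutivity conditions stated above (for "N", for "John believes that ...", and for arbitrary contexts) can be illustrated by a toy model. What follows is only a minimal sketch under strong simplifying assumptions: a full meaning is flattened into a pair of strings, a neighborhood is a plain list of dictionaries, and all names (`Meaning`, `l_true`, `substitutable_for`, `substitutable_everywhere`, `r0`, `r_john`) as well as the sample completions are hypothetical and not part of the formal apparatus.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Meaning:
    """Hypothetical encoding of a full meaning as a (core, completion) pair."""
    core: str        # public, linguistically agreed-upon part
    completion: str  # private (idiosyncratic) part


# A dictionary (variant) maps expressions to full meanings.
Dictionary = dict[str, Meaning]


def l_true(neighborhood: list[Dictionary], a: str, b: str) -> bool:
    """N(A = B): A and B share their core meaning in every variant."""
    return all(d[a].core == d[b].core for d in neighborhood)


def substitutable_for(attributee: Dictionary, a: str, b: str) -> bool:
    """Substitution under 'x believes that ...': identity of the full
    meaning (core and completion) at the attributee's own variant."""
    return attributee[a] == attributee[b]


def substitutable_everywhere(neighborhood: list[Dictionary], a: str, b: str) -> bool:
    """Interchangeability in any context: identity of the full meaning
    in all variants of the neighborhood."""
    return all(d[a] == d[b] for d in neighborhood)


# Toy data (assumed for illustration only).
r0: Dictionary = {
    "vixen": Meaning("FEMALE-FOX", ""),
    "female fox": Meaning("FEMALE-FOX", ""),
}
r_john: Dictionary = {
    "vixen": Meaning("FEMALE-FOX", "sly animal of the fables"),
    "female fox": Meaning("FEMALE-FOX", "zoological classification"),
}
neighborhood = [r0, r_john]

print(l_true(neighborhood, "vixen", "female fox"))                    # True
print(substitutable_for(r_john, "vixen", "female fox"))               # False
print(substitutable_everywhere(neighborhood, "vixen", "female fox"))  # False
```

On this toy model John counts as competent (his variant keeps the shared core), the linguistic truth holds across the neighborhood, and substitution still fails at his variant. The "objective" reading of "it is uninformative that" would correspond to `l_true`, the epistemic reading to `substitutable_for`.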
8. CONCLUDING REMARKS AND COMPARISONS
I would like to close with some comments on two other works in which similar issues have been discussed. The comparison is worth making not just for the sake of "comparative semantics". Rather, it would seem that if similar problems arise within very different theoretical frameworks, one can be pretty certain that one faces a "real" problem in one's own backyard, not just a puzzle arising as an internal issue in one's own modelling. Basically, as far as I know, the present approach is original in the model theoretic tradition. Yet, in two different frameworks two scholars seem to have confronted, recently, similar issues. I will discuss here the work of B. Partee and D. Ackerman. 13
Partee's work on belief consists of a series of papers from the early seventies. I will not review it here though its order in time is significant. Initially, Partee (1977) tried to solve the Kripke-type and Mates-type cases within possible worlds semantics, arguing for a more refined notion of intension: instead of using one place functions on possible worlds, she suggested two place functions on worlds and individuals. This relativization was supposed to account for failures in belief contexts of substitutions of terms with the same (old notion of) intension. In subsequent papers, Partee realized that this is too easy an escape route which does not really face the full spectrum of conceptual issues involved in the failures of substitutivity. Thus she suggested that attitudinal contexts testify for the following problem: we cannot distinguish a normal belief expressed with non standard meanings from a weird belief expressed with standard meanings. At that phase Partee was still concerned with speakers who are potentially linguistically incompetent, for instance with children who say "Clouds are alive" (see Partee 1980). Should we regard them as linguistically incompetent or as having weird beliefs? This was the point where the Quinean shadow made its appearance in
the clear sky of possible worlds semantics. But stormier weather was still to come. In her recent (1982) Partee discusses sentences like:
(26) Thomason believes that semantics is mathematics and Loar believes semantics is psychology.
Partee uses this sentence to argue for the following point. Thomason and Loar are surely competent speakers. They can also communicate with the word "semantics". Yet, they somehow attach to it different associations so that for each of them the lexical meaning of the word is different. I do not know how much Partee would like to stretch the point. Officially she refers to words which are "theoretical enough". That is, she lets herself into the range of the indeterminacy of translation but not as far as the inscrutability of reference. But, as far as I can see, natural kind words are not immune from this type of argument. Unlike the word "semantics", we may be able to discern for them clear essences. But this is relevant only at the level of intension, not at the level of lexical meaning. At the level of lexical meaning, kind terms are associated with linguistic meanings. Suppose their linguistic meaning (the core) is really something like a stereotype and the extension. Why shouldn't two competent speakers share the stereotype of the word "tiger" and yet differ enough as to their private completions, so that counterparts of (26) can be arrived at? Communication will succeed because it relies on the core, but beliefs, which attach to the full meaning, will diverge. What Partee tries to show is that such cases suggest that we should give up the idea that we can finitely represent the lexical basis of the language, viz. have only finitely many basic meanings. Perhaps so, though the question is very intricate. In fact, one might think that her conclusion is not that obvious. She uses a single entity for "meaning" without inner demarcations and so she is forced to give up, where she does not have to. Assuming the core-completion distinction, we see that the core meanings of words are finitely representable, though the completions might not be. But this should not be worrying: what matters for language learning is the linguistically agreed upon, public meaning.
What matters for communication is also the core. It is finitely representable. On the other hand, there is nothing surprising in the fact that the private completions might not be so representable: in fact their very nature seems anyway to require that. 14
However, besides the finite representability issue, Partee's Quinean motto raises a very basic question for the very enterprise described above: are we able to discern one dictionary as the standard dictionary? Of course, this issue opens a Pandora's box: one must first state "the rules of the game" (which evidence is acceptable when trying to discern the intended dictionary), since as any reader of Quine knows, much of the results depend on what is allowed in the game. In addition to these methodological issues, we must also ask: can we discern such a single intended dictionary? I believe that the present framework offers good prospects for answering this question, at least by discerning two different questions: (i) Can we discern the standard neighborhood? (ii) Can we discern within this neighborhood the standard dictionary? This differentiation seems to me crucial. Indeed the answer to (i) may (and, intuitively, should) be affirmative, while the answer to (ii) may (and, to my mind, should) be negative. This is not the place to go into these issues. What should be noted here is that Partee's remarks make it clear that model theoretic semantics for natural language must face a question which any model theoretic treatment of an axiomatic theory has faced long ago, viz. how to discern intended from non intended models. I hope to return to these issues in future work. 15
The second work I wish to refer to is Ackerman's recent work on the concept of analysis. I cannot review it here in detail and I will confine myself to pinpointing the interconnections between Ackerman's work (see her 1979, 1981) and the present results. Basically, Ackerman has in mind a very similar strategy to the one employed here, though for independent reasons. The strategy is to separate the a priori from that which guarantees substitutivity in attitudinal contexts.
For the latter she defines the concept of "connotation" ("$\phi$" and "$\psi$" share connotations if they are intersubstitutable salva veritate in attitudinal contexts). On the other hand, conceptual analysis may go with different conditions. The analysans and analysandum should be "close enough" for the relation to count as "being an analysis of", yet "far enough" for sameness of connotation to fail. Ackerman suggests that for "A" and "B" to stand in this relation "A is B" (or "every A is a B") should be (i) alethically necessary and (ii) a priori. "A is B" can meet these requirements even if they fail to share connotations. Ackerman's cases include classical philosophical analyses:
(i) The analysis of names: "Cicero" and "The object standing at the other end of the causal chain leading to my use of 'Cicero'".
(ii) The analysis of "knowledge" as "justified true belief".
I cannot discuss these matters here as they obviously involve very problematic questions by themselves. However, I should like to emphasize that I regard the possibility of extending the present apparatus to cases which she treated so adequately, as an adequacy test for the present suggestion.
NOTES
* My debt to D. Kaplan is literally ineffable, so I will stick to the advice to pass in silence over what one shouldn't try to say, though this paper wouldn't have been possible without him. He probably objects to the very strategy of this work. I also wish to express deep gratitude to D. Scott and H. Kamp for being patient with my sloppiness and to R. Marcus (at UCLA) and H. Wettstein (at Stanford) for spotting some of the mistakes I tried to propagate. Finally, I wish to thank P. Suppes and the Institute for Mathematical Studies in the Social Sciences for their kind hospitality. This version has benefited from the very thorough criticism of an anonymous referee and the Editor, E. Saarinen.
1 In fact, the first to have suggested a three level theory is Montague. It is significant because his best known intensional logic (see chap. 8, 1974) is often taken as the paradigm of a two level theory. The reader of 'Universal Grammar' (chap. 7, 1974) finds the full three level hierarchy. Note also that as early as 1967 Montague was aware of the indexical validity of sentences like "I exist" for whose treatment one needs a three level theory. Such a sense of validity due to Montague and Kamp (who was interested in representing the indexical validity of "NOW $\phi \leftrightarrow \phi$") is reported in Montague's 1968 paper 'Pragmatics' (chap. 3, 1974).
2 From conversation with Kaplan, I understand that he is no longer that sympathetic to the use of demonstrations in the semantics of demonstratives and that he would prefer to treat terms like "this" on the model of pure indexicals. On this new approach the informativeness of identities will be explained in a different way, which, I believe, will not be open to the problem mentioned above. For lack of space, I cannot go into this new approach here.
3 Such problems have been known to arise in different logical quarters.
For instance, if one follows the Frege-Church trail, it seems that in order to treat iterated belief contexts, a hierarchy of senses should be postulated. This is known to generate some deep problems for a semantic treatment of such a language, some of which were dealt with in Kaplan's dissertation (Kaplan 1964). Other logical problems arise in possible worlds theories of iterated belief sentences. For instance: (i) Linsky has shown how Hintikka's rules of model sets constructions cannot accommodate iterated belief contexts where the innermost
clause is a metaphysical falsehood (e.g., sentences like "John believes that Tom believes that Tully isn't Cicero"). (ii) In possible worlds semantics the meaning of the verb "believes" is, roughly, a two place function taking an individual and a proposition into a proposition. But now if we consider iterated contexts, the function associated with "believes" will have to apply to itself. Yet, the metalanguage of such object languages is usually given in set theories like ZF where such self application is known to raise problems. I should note that I do not discuss these technical issues in the present work.
4 The reader has noticed that now we do not talk about meanings in the object language by using locutions like "the meaning of 'A'" whose extensions are meanings (senses). Rather we use an ordinary object language and leave the philosophical comment "the words used in this case, viz. 'vixen' and 'female fox' have the same meaning" to the metalanguage of (12)-(14).
5 Indeed, Church (1946) notes that high order senses should have been postulated by Frege anyway.
6 Kripke has given such interesting cases in his (1978). A much simpler and yet insightful case is due to D. Kaplan: Suppose that Esa gets up in the morning in Los Angeles and tells his wife Lisa that he is going to San Diego for a conference. Lisa is a trustful wife so we conclude (after the appropriate time for the drive has lapsed) "Lisa believes that Esa is in San Diego". Yet, unknown to her, Esa decides that morning to join D. Kaplan who drives to Santa Barbara. Esa plans to put a beard on his face, change his trousers for an old pair of jeans, and then go to the Santa Barbara nightclub and play the guitar and sing as a guest singer. Incidentally, Lisa decides to go to Santa Barbara that day and she ends up watching the entertaining show in that nightclub. D. Kaplan, sitting at a table behind her tells a friend: (pointing to the platform) "Lisa believes he is in San Diego".
Recognizing a familiar voice, Lisa turns and says to Kaplan: "Do you think I am out of my mind? Of course I don't believe he is in San Diego!".
6a I would like to stress, in view of the obvious philosophical delicacy of the notion, that by 'private' I do not mean 'logically private', but rather 'idiosyncratic' (and hence, graspable in principle by others).
7 A word on terminology: every neighborhood of dictionaries will have a "base", the dictionary of which it is a neighborhood. Hence, this base is the "atomic" dictionary. I use "ideal" because, as we shall see, it isn't obvious that the postulation of such an atom isn't idealizatory. $r_0$ isn't only an ideal atomic dictionary, it is also standard, the ideal atomic dictionary of the standard neighborhood. See below for discussion.
8 For what a neighborhood is, see the discussion below.
9 It is interesting to note that Kaplan says that at the end of his life Carnap became aware of the need to draw a distinction between models (worlds) and interpretations (what I call here "dictionaries"). See Carnap (1963, pp. 902-3). Also, Carnap might have rejected all the discussion above by saying that different dictionaries are really different languages. I think that two major points should be said on this escape route. Even if all that is involved is a set of languages, we could still do everything I do with the dictionaries - by saying that we study a set of languages which share their vocabulary and syntax and generate the same wffs. So the terminology would be different, but the essence of the discussion would be the same (this happens when philosophers use the term "idiolect"). But to move to the real issue, I think the objection is misconceived. As Kripke says elsewhere (1978), Frege's suggestion that two persons who associate different senses with
a name are talking different languages is deeply implausible. I think that the fact that individuation criteria for languages may be fuzzy in some cases doesn't mean that we have to throw the baby out with the bath water and suggest that difference in interpretation is difference in language. We really study here objects with quite a rigid syntax, morphology and phonetics and we are interested in how this system of signs could assume different meanings. There is a very neat model for this conception: the logician's notion of an uninterpreted calculus which may be interpreted in various ways. If one is working out a semantics for natural language within the model theoretic tradition it is almost a necessity to admit a variety of interpretations for a single calculus. Of course, the spectrum of variability of such interpretations may be, in the formal case, much larger. Various cognitive, empirical and evolutionary factors may narrow down the range of admissible dictionaries for a natural language, in a much more radical way. If this is true, so much the better for the constraints on admissibility which will not leave us with a vast range of possible dictionaries. This is, however, already an internal question on what the constraints should be. Here we deal with sceptics who question the very plausibility of the whole enterprise, a scepticism which seems to me ill-founded, from a conceptual point of view. As far as the formally inclined semanticist is concerned, he can save himself the debate and agree that he deals with a set of languages similar in features, in the way described above.
10 The question, which "axioms" should constrain dictionaries is considered in detail in my (forthcoming), where I also compare the semantic case to axiomatizations of informal notions in mathematics.
11 To assume the distinction doesn't mean endorsing the priorities of Kripke, Kaplan et al. in "doing semantics" (viz.
the wish to get the metaphysical operators right, and only then, to try and work out the epistemic operators). One can accept the distinction but reverse the priorities. See for instance Saarinen (1982).
12 This, of course, is just Kaplan's view. There is a tradition originating in Hintikka that attitudinal semantics can be given by possible worlds semantics, whether the original model theory (where one had maximal worlds) or Hintikka's later adoption of urn models to allow for failures of logical omniscience. Still, the main point would be whether Hintikka can treat attitudes involving indexical embedded clauses, a question I cannot go into here. I no longer agree that "haecceitism" does not suffice for the treatment of (21)-(23). It does, but only when understood properly, i.e., as involving the generation of singular propositions. It does not, when understood in terms of identity across worlds. The point is that haecceitism may be stated independently of possible worlds or times.
13 It is interesting to note here that a third work, Dummett's recent book, offers an approach to meaning which is very similar to the present one. In his (1981) Dummett considers the issue whether Frege was right in claiming that two users, who associate different senses with the same name, use different languages. Dummett finds this far from plausible. To him it is the tip of an iceberg: the fact that Frege missed a whole component in his notion of "sense" viz. the fact that the public element of language depends on use and communication. One finds here traces of Dummett's anti-realism. He accepts that senses should be public, but why should they be abstract objects? In particular, if they are arbitrary functions (non constructive functions) how will they be grasped? Frege could have guaranteed communication by making senses that which is exchanged in communication and that which is used by competent speakers. This would have made senses
have two components: the public part (manifested in use of the language) and the private part, needed to account for different associations with the same name. Dummett's proposal is very attractive to a nonrealist but before we really think of it as parallel to the present line, we must be cautious on some delicate issues. First, it is cast within Frege's system and so it isn't obvious at all that Frege would have settled for two nonrealistic ingredients of the proposal: (i) The determination of the referent by the sense is algorithmic, while Frege might have reserved himself the possibility of its being a nonconstructive process. (ii) The fact that senses are abstract objects is essential to Frege's view of them and so he wouldn't have been too happy with denying them this platonistic existence. There are other problems in Dummett's proposal which I cannot go into here, especially those concerned with his extension of the view from the case of proper names to the case of common nouns or verbs. I hope to elaborate on this elsewhere. Also I fear that I do not have sufficient Frege-scholarship to evaluate how far Dummett departs from Frege. (See Dummett 1981, pp. 108-9.)
14 This, by the way, is also a response to an argument by Saarinen in (1983) where he makes a point similar to Partee's. I thank him for bringing Partee (1982) to my attention.
15 In fact, H. Putnam has tried recently to treat this question both in the philosophy of mathematics and in the philosophy of language with a uniform approach. I face these issues in a detailed response to him, see my (forthcoming).
REFERENCES
Ackerman, D.: 1979, 'Proper Names, Propositional Attitudes and Non-descriptive Connotations', Philosophical Studies 37, 55-69.
Ackerman, D.: 1981, 'The Informativeness of Philosophical Analysis', in Midwest Studies in Philosophy, Vol. 6, Minnesota U.P., Minneapolis.
Almog, J.: 1981, 'Dthis and Dthat', Philosophical Studies 39, 347-81.
Almog, J.: (forthcoming) 'Semantical Anthropology', to appear in Midwest Studies in Philosophy, Minnesota U.P., Minneapolis.
Carnap, R.: 1947, Meaning and Necessity, Chicago U.P., Chicago.
Carnap, R.: 1963, The Philosophy of Rudolf Carnap, edited by P. A. Schilpp, La Salle, Illinois.
Church, A.: 1946, Reviews, Journal of Symbolic Logic 11, 131-2.
Dummett, M.: 1973, Frege: The Philosophy of Language, Duckworth, London.
Dummett, M.: 1981, The Interpretation of Frege, Duckworth, London.
Frege, G.: 1892, 'On Sense and Reference', in Translations from the Philosophical Writings of Gottlob Frege, translated by P. T. Geach and M. Black, Basil Blackwell, Oxford, 1956.
Hintikka, J.: 1962, Knowledge and Belief, Cornell U.P., Ithaca.
Kaplan, D.: 1964, Foundations of Intensional Logic, unpublished Ph.D. dissertation, UCLA.
Kaplan, D.: 1975, 'How to Russell a Frege Church?' Journal of Philosophy 72, 716-29.
Kaplan, D.: 1977, Demonstratives, Draft 2, unpublished UCLA mimeo.
Kaplan, D.: (unpublished) 'Epistemological Remarks on Indexicals', November 1981.
Kripke, S.: 1978, 'A Puzzle about Belief', in A. Margalit (ed.), Meaning and Use, D. Reidel, Dordrecht.
Montague, R.: 1974, Formal Philosophy, edited by R. Thomason, Yale U.P., New Haven.
Parsons, T.: 1981, 'Frege's Hierarchies and the Paradox of Analysis', Midwest Studies in Philosophy 6.
Partee, B.: 1977, 'Possible Worlds and Linguistic Theory', The Monist 60.
Partee, B.: 1980, 'Montague Grammar, Mental Representations and Reality', in S. Kanger and S. Ohman (eds.), Philosophy and Grammar, D. Reidel, Dordrecht.
Partee, B.: 1982, 'Belief Sentences and the Limits of Semantics', in S. Peters and E. Saarinen (eds.), Processes, Beliefs and Questions, D. Reidel, Dordrecht.
Putnam, H.: 1980, 'Models and Reality', Journal of Symbolic Logic 45, 464-82.
Saarinen, E.: 1982, 'How to Frege a Russell-Kaplan?' Noûs 60.
Saarinen, E.: 1983, 'Propositional Attitudes Are Not Attitudes Towards Propositions', Intensional Logic: Theory and Applications (Acta Philosophica Fennica 35), Helsinki.
Stalnaker, R.: 1981, 'Indexical Belief', Synthese 49, 129-51.
Center for the Study of Language and Information Stanford University Ventura Hall Stanford, CA94305 U.S.A.