Journal of Automated Reasoning 29: 59–66, 2002. © 2002 Kluwer Academic Publishers. Printed in the Netherlands.
59
More Proofs of an Axiom of Łukasiewicz JOHN SLANEY Australian National University, Canberra, ACT 0200, Australia (Received: 15 November 2001) Abstract. This paper reports results and some new problems in one of the domains to which automatic first-order theorem provers have been most successfully applied: axiomatics of nonclassical propositional logics. It is well known that one of the standard axioms of the denumerable-valued pure implication logic of Łukasiewicz becomes derivable from the remainder in the presence of negation. Here it is shown that the same axiom is similarly derivable using conjunction and disjunction instead of negation. This closes a problem left open by Harris and Fitelson (Journal of Automated Reasoning 27, 2001). Related problems are discussed, and five such open problems are proposed as challenges to the automated reasoning community. Key words: many-valued logic, substructural logic, axiomatics.
1. The Problem A convenient axiomatization of the infinite-valued implicational logic Ł→ ℵ0 of Łu kasiewicz is as follows: Axioms: AxK AxB AxŁ AxTO
A→(B→A), (B→C)→((A→B)→(A→C)), ((A→B)→B)→((B→A)→A), ((A→B)→(B→A))→(B→A).
Rule:
A→B, A ⇒ B.
Det
The result of replacing AxŁ and AxTO by AxC
(A→(B→C))→(B→(A→C))
is the properly weaker “affine logic” BCK→ , and the result of replacing them instead by the two axioms AxI AxB
A→A, (A→B)→((B→C)→(A→C))
is the still weaker system TWK→ . There appears to be no sustained investigation of TWK in the literature, though it is a reasonably stable substructural logic noted in passing by several authors (Restall, 2000; Slaney, 1995). See, for example, (Rose and Rosser, 1958). The axioms have been given mnemonic names: B,
K, and so on for the associated combinators, Ł for Łukasiewicz, and TO for Total Order.
60
JOHN SLANEY
The result of dropping AxK from TWK→ is an even weaker system called M in (Harris and Fitelson, 2001) and known elsewhere in the literature as TW→ or BB I or sometimes P−W. It will be convenient to adopt some notation for the logics resulting by the addition of axioms to these base systems: we write L[Ax1 . . . Axn ] for the result of adding axioms Ax1 . . . Axn to logic L. For instance, TW→ [AxC, AxK] is BCK→ . The interest of TW→ is partly that it contains just what is needed to have as derivable rules the inferences of strengthening antecedent parts (SAP) and weakening consequent parts (WCP). Antecedent and consequent parts are defined recursively: A variable p has no antecedent parts, and its only consequent part is p itself; the antecedent parts of A→B are the consequent parts of A and the antecedent parts of B; the consequent parts of A→B are A→B itself, the antecedent parts of A and the consequent parts of B; the antecedent parts of a conjunction (or a disjunction) are those of its conjuncts (resp. disjuncts); its consequent parts are the conjunction (disjunction) itself and those of its conjuncts (disjuncts). If formulae C and D differ only in that either C has B as an antecedent part where D has A, or C has A as a consequent part where D has B, TW+ contains the theorem (A→B)→(C→D). Where A→B is provable, SAP licenses the replacement of B by A in an antecedent position, and WCP that of A by B in consequent position. It was shown almost fifty years ago, by Meredith (1958) and independently by Chang (1958), that if Ł→ ℵ0 is extended (conservatively) by the addition of negation with the axioms of double negation and transposition →
AxDN ¬¬A→A, AxTsp (A→¬B)→(B→¬A), then AxTO becomes derivable from the rest of the basis. The proof of this fact is not trivial: a tough exercise for advanced logic students and a challenging problem for automated reasoning systems (TPTP problems LCL109-1 through LCL109-6 are all versions of it). Notably missing from the logics as formulated above are the lattice connectives ∧ and ∨. In Łℵ0 this lack is more apparent than real, for they are definable in terms of implication and negation: Defn.
D∨ D∧
A ∨ B =df (A→B)→B, A ∧ B =df ¬(¬A ∨ ¬B).
The name “TW” was originally “T−W” and records the fact that the system results from the
“ticket entailment” system T (Anderson and Belnap, 1975) by dropping the contraction axiom W. “BB I” of course is a list of the combinators corresponding via the Curry–Howard isomorphism to axioms of the system. The name “P−W” is due to Belnap, who may have thought that T was called “P”; it may therefore be regarded as a spelling mistake! The single axiom (¬A→¬B)→(B→A) is equivalent to AxDN and AxTsp in the context of Ł→ ℵ0 and has been used for this purpose by several authors including Łukasiewicz himself. The equivalence does not hold in weaker systems such as TW→ , however, so the two axioms are used here.
MORE PROOFS OF AN AXIOM OF ŁUKASIEWICZ
61
We remark that negation is essential to the definition of conjunction, for without it ∧ is not definable, as may be observed from this model: → 0 1 2
0 2 0 0
1 1 2 1
2 2 2 2
The implication relation, shown in the Hasse diagram, is a join semilattice but not a meet semilattice. Thus it admits no operation of conjunction. The matrix for the implication connective, however, is such that all theorems of Ł→ ℵ0 take the designated value 2 on all valuations. It is, of course, possible to add the lattice connectives as new primitives subject to some postulates rather than to seek definitions of them. The results of so adding them to the implicational logics specified above are the positive (negation-free) logics TW+ , TWK+ , BCK+ , and Ł+ ℵ0 . The requisite postulates, the same in all cases, are as follows: Axioms:
Ax∧E1 Ax∧E2 Ax∧I Ax∨I1 Ax∨I2 Ax∨E
(A ∧ B)→A, (A ∧ B)→B, ((A→B) ∧ (A→C))→(A→(B ∧ C)), A→(A ∨ B), B→(A ∨ B), ((A→C) ∧ (B→C))→((A ∨ B)→C).
Rule:
Adj
A, B ⇒ A ∧ B.
There are no restrictions on the type of lattice that can model TW+ : any lattice whatsoever can be made to form the basis of a TW+ algebra, for instance by choosing two elements t and f such that f < t, taking the positive cone (the “designated values”) to be the principal filter determined by t, and defining an implication operation by setting x→y = t if x ≤ y and x→y = f otherwise. In particular, the lattices underlying TW+ models do not have to be distributive. That is, they need not validate the formula Dist
(A ∧ (B ∨ C))→((A ∧ B) ∨ (A ∧ C)).
By contrast, since Ł+ ℵ0 is known to be complete for the models intended by Łukasiewicz, which are based on the rational unit interval with its usual numerical The models appearing in this paper were generated by MaGIC (Slaney, 1995), which is a finite-
domain constraint solver highly optimized for modeling just such Hilbert systems of substructural propositional logics.
62
JOHN SLANEY
order, the only lattices on which Ł+ ℵ0 algebras are definable are subdirect products of chains. These are, of course, distributive. Harris and Fitelson (2001) explore the issue of distributivity in some depth, displaying pure condensed detachment proofs of the distribution of conjunction and disjunction over each other in Ł+ ℵ0 . In the + same paper, they note that BCK is well known not to be distributive and exhibit a matrix based on a six-element non-distributive lattice. Since this is the smallest counterexample to distributivity in BCK+ and since there are nondistributive lattices with five elements, we may note that BCK+ does not share with TW+ the property of having models in an arbitrary lattice. In fact, it is the presence of AxK that prevents the lattices of order 5 from being suitable, so TWK+ is like BCK+ in this respect, rather than like TW+ . It is noted in (Harris and Fitelson, 2001) that each of the three axioms AxK, AxŁ , and AxTO can be added individually to TW+ , producing three different nondistributive systems. Interestingly, TW+ [AxK, AxTO] is another nondistributive system. Harris and Fitelson leave open the problem of whether the system TW+ [AxK, AxŁ] is distributive. That problem can now be closed. 2. TWK+ [AxŁ] is Ł+ ℵ0 Note first that AxC is derivable from TWK→ [AxŁ]. This is not a new observation – it was already familiar to Łukasiewicz and to Rose, Chang, Meredith, and others long ago—but the proof is worth rehearsing to make the present paper more selfcontained. We proceed via the lemma that permutation holds in the rule form: if A→(B→C) is a theorem, so is its permutation B→(A→C). Suppose, then, that A→(B→C) is a theorem. The following are also theorems: 1. 2. 3.
B→((C→B)→B)
AxK
B→((B→C)→C)
1, AxŁ
B→(A→C)
2, SAP
Now the rule form of permutation leads easily to the theorem form. For all formulae A, B, and C 5. 6.
B→((A→(B→C))→(A→C))
2, AxB, transitivity
(A→(B→C))→(B→(A→C))
5, permutation rule
Hence the logic TWK+ [AxŁ] is in fact the same as BCK+ [AxŁ], meaning that we may help ourselves to any BCK-valid inferences. In the following proof sketch, the expression “lattice logic” will cover appeals to the basic consequences of the conjunction and disjunction axioms, such as that A→(B ∧ C) is equivalent to (A→B) ∧ (A→C). “Det.” is detachment. Equivalents can always be replaced in context.
63
MORE PROOFS OF AN AXIOM OF ŁUKASIEWICZ
1. 2. 3. 4. 5. 6. 7. 8. 9. 10. 11. 12.
(A ∧ B)→B
Ax∧E
(A→(A ∧ B))→(A→B)
1, AxB
((A→B)→(A ∧ B))→((A→(A ∧ B))→(A ∧ B))
2, AxB
((A→B)→(A ∧ B))→(((A ∧ B)→A)→A)
3, AxŁ
(((A ∧ B)→A)→A)→A
BCK logic
((A→B)→(A ∧ B))→A
4, 5, transitivity
(((A→B)→A) ∧ ((A→B)→B))→A
6, lattice logic
(((A→B)→A) ∧ ((B→A)→A))→A
7, AxŁ
(((A→B) ∨ (B→A))→A)→A
8, lattice logic
(B→(((A→B) ∨ (B→A))→A))→(B→A)
9, AxB
(((A→B) ∨ (B→A))→(B→A))→(B→A)
10, AxC
((A→B)→(B→A))→(B→A)
11, lattice logic
An interesting variant of this proof is the same down to line 9. Then 10 . 11 . 12 . 13. 14. 15. 16.
(A→((A→B) ∨ (B→A)))→((A→B) ∨ (B→A))
9, AxŁ
A→(B→A)
AxK
A→((A→B) ∨ (B→A))
11 , Ax∨I2
(A→B) ∨ (B→A)
10 , 12 , det.
(A→B)→(((A→B)→(B→A))→(B→A))
BCK
(B→A)→(((A→B)→(B→A))→(B→A))
AxK
((A→B)→(B→A))→(B→A)
13, 14, 15, Ax∨E
Hence it is not necessary to take a detour through the logic of negation in order to establish AxTO on the basis of the other standard axioms of Ł→ ℵ0 : a detour through conjunction will also suffice. It has indeed always been a little mysterious why AxTO comes from the presence of negation in Łℵ0 : one sees the proof, but not quite how negation works the trick. Now there is a new view of the matter that may at last offer some insight: perhaps it is best to see negation as producing total order by making conjunction definable, thus forcing the underlying propositional structure to be a lattice. At any rate, since conjunction and disjunction remain interdefinable if AxTsp is weakened to a rule (if A→¬B is a theorem, then so is B→¬A), AxTO remains derivable with that weakening of the postulates of BCK→ [AxŁ] with negation. 3. Related Results and Problems As noted above, and by Harris and Fitelson, TWK+ [AxTO] is a nondistributive system. Adding negation with AxDN and AxTsp still does not make the system distributive. Here is its smallest nondistributive model:
64
JOHN SLANEY
→ 0 1 2 3 4 5 6 7
0 7 6 5 4 3 2 1 0
1 7 7 5 4 5 4 1 1
2 7 7 7 4 6 4 4 2
3 7 7 5 7 5 6 5 3
4 7 7 7 4 7 4 4 4
5 7 7 5 7 5 7 5 5
6 7 7 7 7 7 7 7 6
7 7 7 7 7 7 7 7 7
The matrices for ∧ and ∨ can be read off the Hasse diagram as usual. ¬x is simply 7 − x. The only designated (true) value is 7. Since the matrix also satisfies AxC, it is a model of BCK, showing that even in BCK with negation, AxTO is not sufficient to force distributivity. So AxTO is rather weak in the absence of AxŁ. By contrast, AxŁ is a powerful postulate, especially in the presence of conjunction, as we have seen. To strengthen the positive result a little more, note that TW→ [AxC, AxŁ] contains BCK→ and consequently that AxK and AxC are equivalent in the context of TW→ [AxŁ]. The derivation of AxK from AxC and AxŁ is very easy: 1. 2. 3. 4. 5.
(A→(B→A))→(A→(B→A))
AxI
A→((A→(B→A))→(B→A))
1, AxC
A→(((B→A)→A)→A)
2, AxŁ
B→((B→A)→A)
AxI and AxC again
A→(B→A)
3, 4, SAP
The remaining question is how strong the underlying logic must be for AxŁ and AxTO together to force strong conditions such as distributivity. Evidently, in the context of any system between TW→ and Łℵ0 , if they suffice for either AxK or AxC, then they suffice for both and therefore for the whole of Łℵ0 . In the context of TW+ itself, they do not suffice, for the following model of TW+ [AxŁ, AxTO] does not validate AxK: → 0 1 2
0 1 0 0
1 1 1 0
2 1 1 1
However, the lattice of this model is distributive. Indeed, it is a chain. All models of TW+ [AxŁ, AxTO] with up to eight elements are based on chains or are subdirect products of such totally ordered models. This prompts two open questions:
MORE PROOFS OF AN AXIOM OF ŁUKASIEWICZ
65
1. Is distributivity for ∧ and ∨ provable in TW+ [AxŁ, AxTO]? 2. Is the “real” total order theorem (A→B) ∨ (B→A) provable in the same system? Obviously, positive answers to both questions are forthcoming if it is possible to prove ((A→B)→B)→(A ∨ B) in the logic, but the known proofs of this, in Ł+ ℵ0 , for instance, use AxK, which is not available in the weaker system. Again, MaGIC assures us that there is no countermodel with fewer than nine elements, but of course some numbers are greater than eight, so this is not a proof! Another feature of the above model invalidating AxK is that it is not residuated. One of the most important connectives in substructural logics is the intensional conjunction ◦, known in the relevant logic literature as “fusion” and to linear logicians as “tensor product.” This satisfies the two-way rule: Resid (A ◦ B)→C ⇐⇒ A→(B→C). It is easy to see that no such fusion operation is definable on the three-element structure above, since it would have to satisfy 2 ◦ 2 ≤ 2 (because x ≤ 2) and therefore 2 ≤ 2→2. In fact, all known models of TW+ that satisfy AxŁ and AxTO but that do not satisfy AxC and AxK are similarly unresiduated. This leads to more open questions: 3. Is the addition of fusion a conservative extension of the positive logic TW+ [AxŁ, AxTO]? 4. If not, does the addition of fusion to TW+ [AxŁ, AxTO] suffice to axiomatize Łℵ0 ? That is, does it imply AxK and AxC? 5. Is the addition of fusion, with its two-way rule, enough to generate yet another negation-free proof of AxTO from BCK→ [AxŁ]? The last two of these questions are perhaps those to which a positive answer would be most interesting. They suggest yet another view of the mechanism whereby negation leads to total order in Łℵ0 –it makes fusion definable: Defn. D◦ A ◦ B =df ¬(A→¬B). Note that this definition is not available in logics without AxC. The three-element matrix above, which shows AxK to be independent of TW+ [AxŁ, AxTO], does allow the definition of certain other particles, notably the positive Church constant T. The defining axiom of T is AxT
A→T.
In an algebraic model, the value of T is the top element. It need not in general be the only designated value. One further result indicating the power of AxŁ and suggesting a negative answer to question 3 and a positive one to question 4 above is that if TW→ [AxŁ] is enriched with both ◦ and T, then AxK becomes provable. Originally, I established only the weaker theorem A→(B→B). The observation that the proof
could be extended to yield AxK is due to an anonymous reviewer of this paper, to whom thanks are due.
66
JOHN SLANEY
To see this, note first the effect of fusion: since everything entails T, in particular (T ◦ A)→T is a theorem, and hence so is T→(A→T). Now let A be an arbitrary formula, and let U abbreviate T→A. The following are all theorems: 1. 2. 3. 4. 5. 6. 7. 8. 9. 10. 11.
((A→T)→T)→((T→A)→A)
AxŁ
T→(U→A)
1, definition, (X→T) ↔ T
((U→A)→A)→(T→A)
2, suffixing
((A→U)→U)→U
3, AxŁ, definition
(U→(A→U))→(A→U)
4, AxŁ
(T→A)→((T→T)→(T→A))
AxB
U→(T→U)
6, definition, (T→T) ↔ T
U→(A→U)
7, SAP
A→U
5, 8, detachment
A→(T→A)
9, definition
A→(B→A)
10, SAP
It is not known whether fusion alone, without the explicit presence of sentential constants, suffices for AxK. The intelligent use of automated reasoning tools, both to generate proofs and to generate counterexamples, is now the technique of choice for approaching these problems and others like them. The five listed open questions are hereby recommended to the automated reasoning community. It is better, for us and for our programs, to work on fresh problems than on stale ones, and, to recall Bob Boyer’s beautiful remark, the great virtue of working on open problems is that it is impossible to cheat. References Anderson, A. and Belnap, N.: Entailment: The Logic of Relevance and Necessity, Vol. 1, Princeton University Press, 1975. Chang, C.: Proof of an axiom of Łukasiewicz, Trans. Amer. Math. Soc. 87 (1958), 55–56. Harris, K. and Fitelson, B.: Distributivity in Łℵ0 and other sentential logics, J. Automated Reasoning 27 (2001), 141–156. Meredith, C.: The dependence of an axiom of Łukasiewicz, Trans. Amer. Math. Soc. 87 (1958), 54. Restall, G.: An Introduction to Substructural Logics, Routledge, 2000. Rose, A. and Rosser, J.: Fragments of many-valued statement calculi, Trans. Amer. Math. Soc. 87 (1958), 1–53. Slaney, J.: MaGIC, matrix generator for implication connectives: Notes and guide, Technical Report TR-ARP-11-95, Automated Reasoning Project, Australian National University, 1995. Sutcliffe, G. and Suttner, C.: The TPTP problem library: The CNF release v.1.2.1, J. Automated Reasoning 21 (1998), 177–203.