Journal of Automated Reasoning 15:267-275, 1995. (~) 1995 Kluwer Academic Publishers. Printed in the Netherlands.
267
A Note on Assumptions about Skolem Functions H A N S J U R G E N O H L B A C H and C H R I S T O P H W E I D E N B A C H Max-Planck-lnstitutfiir Informatik Im Stadtwald 66123 Saarbriicken, Germany (Received: 19 November 1994; in final form: 12 February 1995) Abstract. Skolemization is not an equivalence preserving transformation. For the purposes of refutational theorem proving it is sufficient that skolemization preserves satisfiability and unsatisfiability. Therefore there is sometimes some freedom in interpreting Skolem functions in a particular way. We show that in certain cases it is possible to exploit this freedom for simplifying formulae considerably. Examples for cases where this occurs systematically are the relational translation from modal logics to predicate logic and the relativization of first-order logics with sorts. Key words: skolemization, refutational theorem proving.
1. Introduction Refutational theorem proving has a certain degree of freedom which so far is not very often exploited. All kinds of transformations preserving satisfiability and unsatisfiability of the formulae to be refuted are allowed. Skolemization is a typical transformation which is not equivalence preserving but satisfiability and unsatisfiability preserving. But this is usually the only routinely applied transformation with this property. For an existential quantification, skolemization introduces a new function symbol whose interpretation is in general not completely determined. Sometimes it is possible to make additional assumptions about the new Skolem function without changing satisfiability and unsatisfiability of the formulae. As an example consider the formula
Vx( (x)
y(B(x,y) A C(x, y))).
Skolemization and clausification yields as an intermediate result
ff2(x) D B ( x , f ( x ) ) ,
(1)
• (x) D C(z,f(x)).
(2)
If ff is a big formula this duplication may be disastrous. If B is serial, i.e. Y x 3 y B ( x , y) holds, we claim that the following optimized transformation is possible:
B(x, f(x)), • (x) 3 C(x, f ( x ) ) ,
(3) (4)
268
HANS JORGEN OHLBACH AND CHRISTOPH WEIDENBACH
where one occurrence of q~ is dropped or, equivalently, the Skolem form B(x, f(x)) of B(x, y) is moved to the top-level of the formula. That means the axiomatization of f is made stronger and the clauses become shorter. The nontrivial part of the correctness proof, which also shows the idea behind the transformation, amounts to transforming a model for (1) and (2) into a model for (3) and (4). Since f is a Skolem function, we use the freedom to define a suitable interpretation for f. Suppose ~,(x) is true for some assignment x/a. Then by the clause (1), B(x, f(x)) is true as well, i.e. (3) is true, where f(a) is the same value in (1) and (3). This is the unproblematic case. Now suppose if(x) is false for the assignment x/a. Then (4) is true independently of the meaning of f. But what about (3)? Here we use the seriality assumption for B. We assume Vx3y B(x, y). This tells us that there is some e such that B(x, y) is true for the assignment Ix~a, y/c]. That means we can define f(a) to be just this e and then (3) becomes true as well. This informal description is made precise in the next section. In Section 3 we show that certain well known transformations in sorted logics and modal logics are in fact instances of this optimized skolemization. We give several examples which show that optimized skolemization can improve the proof search considerably.
2. Optimized Skolemization For a precise definition of the optimized skolemization we need to manipulate subformulae of a formula at a certain position inside the subformula and with a certain polarity. To this end we introduce the standard definitions of formula occurrences and polarities. An occurrence is a word over N. Let e denote the empty word. Then we define the set of occurrences occ(eo) of a formula • as follows: (i) the empty word is in occ(~), (ii) i.Tr is in occ(~) iff q? = ~IJ1 A . ' ' A ff~T/no r • = ~I/1 V . . " V ~ / n , 1 ~< i ~< n and 7r E occ(~i), (iii) 1.Tr (2.70 is in occ(~) iff e2 = V x ~ or = 3 x ~ or • = --,~ or ~b = e2 D O or q) = ,:it = O and 7r C occ(~) (Tr E oce(O)). Now if 7r E occ(~) then ~I'[,= k9 and ~[i.~r= kgil~r where ~ i is the ith subformula of • (see above). Intuitively, the polarity of some formula /b ]~r= k9 in • is 1 if ~ occurs below an even number of (explicit or implicit) negation symbols, it is - 1 if occurs below an odd number of negation symbols and it is 0 if • occurs below an equivalence symbol. We define the polarity pol(a2, re) of a formula ~1,~ in a formula • by: (i) pol(~,e) = 1, (ii) pol(~o,i.rr) = pol(i~li,Tr ) if ~ is a conjunction, disjunction, quantifier formula or • is an implication and i = 2, (iii) pol(~,i.Tr) = -pol(~[i,Tr) if • is a negation or q? is an implication and i = 1, (iv) pol(,~,i.Tr) = 0 if g~ is an equivalence.
A NOTE ON ASSUMPTIONS ABOUT SKOLEM FUNCTIONS
269
We also use occurrences to define the replacement of a formula inside another formula. (I,[Tr/'IJ] with 7r ff occ(~5) is the formula obtained from q5 by replacing (b[~ in (b with '.9. Usually, skolemization is defined with respect to a formula in negation normal form. We generalize this definition to the case of arbitrary formulae. The polarity function tells us whether a formula remains unchanged by producing the negation normal form. A formula ff[~r, 7r E oec(q~), is in the scope of a universal quantifier, if there exists a prefix # of 7r such that • lu= Vx(9 and pol(q~, #) = 1 or [u= 3x(9 and pol(q~,#) = - 1 or pol(q~,#) = 0 and • ]~, is an arbitrary quantificational formula. It is in the scope of an existential quantifier, if there exists a prefix u of 7r such that fflu= Vx(9 and pol(qhu) = - 1 or ~b[.= 3x(9 and pol(~), u) = 1 or pol(~,p) = 0 and ,I, [u is an arbitrary quantificational formula. LEMMA 1. Let q~ ]~r= ~ with 7r E occ(~) and pol(~, 7r) = 1 and .L4 be an
interpretation. Then
if3,4 ~'~ A (~ D (9) then 34 ~ ff2['n-/(9]. Proof The proof is due to Loveland [2, Lemma 1.5.1, p. 40]. THEOREM 2 (Optimized skolemization). Let q~ ]Tr= 3 y ( ~ A (9), 7r E occ(q?),
pol(¢), 7r) = 1, 3 y ( ~ A (9) is not in the scope of an existential quantifier and let Xl,.. . , Xn be the universally quantified variables which occur freely in 3y( ~ A (9). In addition, we assume the seriality condition ~ ff D VXl,.. •, Xn~y (9. Then is satisfiable
iff • [ T r / ~ { y / f ( x l , . . . , Xn)}] A V X l , . . . , Xn ( 9 { y / f ( x l , . . . , Xn)} is satisfiable, where f is a new n-place Skolem function. Proof " 3 " Assume .L4 ~ ~. Since f is new to ff it is sufficient to construct an interpretation A/V which is like .A4, but in addition specifies an interpretation for f such that M ' ~ ~ [ T r / ~ { y / f ( x l , . . . , x n ) } ] A V x l , . . . , X n ( 9 { y / f ( x l , . . . , Xn)}. Consider domain elements a l , . . . , a n as assignments for the universally quantified variables x ; , . . . , Xn. If.hal [ X l / a l , . . . , Xn/an] ~ 3y(~AO), then there exists some b as assignment for y such that 3.4[x1/a1,..., Xn/an, y/b] ~ •/x (9. We choose b as value for f, i.e. f f i ' ( a l , . . . , an) de=fb. I f J M [ x l / a l , . . . , Xn/an] fiat
def
3 y ( ~ A O ) we choose f ( a l , . . . , an) -= e, where J M [ x l / a l , . . . , Xn/an, y/c] O. Such a c always exists by the seriality assumption AA ~ V x l , . . . , Xn3y 0 which implies, as f is new to ~5, A/t t ~ VXl,... ,Xn3yO. Now by construction of f f i ' we have AJ' ~ ~ A (3y(~ A O) D ~ { y / f ( x l , . . . , Xn)}) and thus by [,emma l, fl,t' ~ ~ [ T r / ~ { y / f ( x l , . . . ,Xn)}]. In addition, 34' ~ VXl,... ,xn
270
HANS JI~IRGENOHLBACH AND CHRISTOPHWEIDENBACH
3y(9 A (3y (9 D (9{y/f (xl,..., Xn)}) and thus again by Lemma 1, .h4'
Vxl,..., xne{y/f(zl,..., Xn)). "~"Assume M ~ '~[Tr/~{y/f(xl,...,Xn)}] A V x l , . . . , X n ( 9 ( y / f ( x l , • .., Xn)}. Then we have: A// ~ ~ { y / f ( x l , . . . , Xn)} D 3y(~ A (9) by choosing y/fM(al,... ,an) for y in 3y(~' A (9) and any assignment hi,... ,an of the X l , . . . , xn. Now by Lemma 1 we conclude .A4 ~ iI~. EXAMPLE 3. We apply the optimized skolemization to Pelletier's [6] problem no. 29:
(1)
?x F(z),
(Z)
3x o(z), ~[(Vx(F(x) D H(x)) AVx(G(x) D J(x))) =--Vx, y((F(x) A G(y)) D (H(x) A J(y)))].
(3)
Elimination of the equivalence symbol in (3) by -~[~ - ~] iff (q?V~)A(--,ffV-,~) gives:
Vx(F(x) D H(x)) AVx(C(x) D J(x))) VVx, y((F(x) A C(y)) D (H(x) A J(y))), (5) "-,(Vx(F(x)D H(x)) A Vx(G(x) D J(x))) V ~Vx, y((F(x) A G(y)) D(H(x) AJ(y))). (4)
For the purpose of readability we move the negation symbols and quantifiers occurring in (5) inside: (6)
3x(F(x) A ~H(x)) V 3x(G(x) A ~J(x)) V3x(F(x) A 3y(G(y) A (-,H(x) V ~J(y)))).
There are the following occurrences of existential formulae:
(6)11 (6)lz (6)13 (6)1312
= = = =
3x(F(z) A-~H(x)), 3x(G(x) m--,J(x)), 3x(F(x) m 3y(G(y) A (~H(x) V ~J(y)))), ~y(G(y) A (~H(x) v-~f(y))).
Theorem 2 is applicable to all these occurrences, because all occurrences have polarity 1 and the formulae (1), (2) guarantee the seriality condition for the atoms of the form F(x), G(x), which we want to move outside. Thus we get after optimized skolemization:
A NOTE ON ASSUMPTIONS ABOUT SKOLEM FUNCTIONS
(7) (8)
271
~H(a) V -~J(b) V ~H(c) V ~ J ( d ) , F(a) A G(b) A F(c) A G(d).
It is obvious that a refutation of the formulae (1), (2), (4), (7), (8) is much simpler than refuting (1), (2), (4), (5). In fact, OTTER [3] (version 3.0, auto mode) needed half of the time and clauses to refute the formulae with optimized skolemization compared to the formulae translated with OTTER's standard skolemization procedure. In addition, the optimized skolemization proof is shorter and has a lower proof complexity. As pointed out by a reviewer, our skolemization technique is not new in the sense that it can be simulated by standard skolemization and equivalence preserving transformations. The formula ~ of Theorem 2 is equivalent to the formula
VXI,...,Xn~y(T2[7"g'/(~T~A e)] AVXl,...,xn~yO
(5)
because ~ ¢, D V X l , . . . , xn3y 0. Now (5) is equivalent to the formula 'V'Xl, . . . , X n ~ y
[~[7"C/ T~] /% OJ.
(6)
This can be proved by techniques similar to those used to prove Theorem 2. Eventually standard skolemization yields V X l , . . . , x~ ['I*[~r/'.~{y/f(xl,..., xn)}] A e { y / f ( x l , . . . , z~)}]
(7)
which is exactly the result of Theorem 2 if the universal quantifiers are moved inside. However, we prefer the formulation of Theorem 2, because we interpret V x l , . . . , xn @{y/f(xl,...,Xn)} as a (stronger) definition of the Skolem function f . In addition, the formulation of Theorem 2 is compatible with the usual techniques for clause normal form, e.g. anti-prenexing, whereas the above argumentation requires to move quantifiers outwards.
3. Ddja Vu The optimized skolemization has been used implicitly in some other systems. 3.1. SORTEDLOG1C
The fact that information about Skolem functions can be moved from a local context to the top-level has been implicitly exploited in sorted logic. Consider the formula (b D 3xB C(xB), which is the sorted formalization of • ~ ~x(B(x) A C(x)). In sorted logics, where all sorts are a priori assumed nonempty, it gets skolemized to • D C(a) and the sort declaration B(a) is added to the toplevel sort declarations [8, 9]. Thus, the sort declaration for a does not depend
272
HANS JORGEN OHLBACH AND CHRISTOPH WEIDENBACH
on the condition • anymore. That means global sort declarations about Skolem functions implicitly apply the optimized skolemization. Weidenbach [10, 11] shows how sorted skolemization is applied if the sorts are not a priori assumed nonempty. Then it is only possible to move the sort declarations of Skolem functions outside, if the sort of the existential variable can be proved nonempty. Otherwise the sort declaration remains inside the formula and makes a more general approach to sorted reasoning necessary. The example of Section 2 is an instance of Weidenbach's approach to sorted logic. The unary predicates can be translated into sorts. This enables further simplifications of formula (4): (4')
(VxFH(xg) A VyGJ(yG)) VVXF, ya(H(xF) A J(YG)).
Since the sorts F and G are non-empty (see (1), (2)), (4') is further simplified to (4")
VxF H ( x F ) AVya
J(Yc).
Now a refutation of (1), (2), (4"), (7), (8) by resolution extended with sorts [l l] yields no search anymore. Every possible resolution step contributes to the proof. 3.2. MODAL LOGIC Modal Logic is an extension of predicate logic with the two operators [] and <) [1]. The standard Kripke semantics of normal modal systems interprets the Q-operator as a universal quantification over accessible worlds and the <)-operator as an existential quantification over accessible worlds. This semantics can be exploited to define a "relational" translation from modal to predicate logic. For example ~ © P is translated into Vw(R(o, w) D 3v(R(w, v) A P(v))). R denotes the accessibility relation and o some initial world. Notice that the translation of the <)-operator has the typical pattern where our optimized skolemization is applicable - provided the accessibility relation is serial, i.e. we have modal systems above D. The overall effect of the optimized skolemization is that the conditions on R coming from <)-operators become positive unit clauses. From the []-operator we obtain only negative literals in the clauses. Then the negative R literals can be viewed as constraints over the theory consisting of the positive R-unit clauses and the formulae of the specific modal logic. A similar, but weaker approach has been studied by Scherl [7]. EXAMPLE 4. We show the power of the optimized skolemization by an example taken from modal logic KD45 [1]. In modal logic KD45 the formula Ot3P = <)[~<)C]P is a theorem. The theorem can be translated into first-order logic by introducing an accessibility relation R. Then the theorem is:
273
A NOTE ON ASSUMPTIONS ABOUT SKOLEM FUNCTIONS
3~(R(o,x)
3x(R(o,~)
A
A
Vy(R(x,y) ~ P(y)))
vv(r~(x,v) ~ ~z(R(v,~)
A W(R(z,
~) D P(~))))).
Here o names the initial world and R(x, y) means that world y is accessible from world x. The properties of R in modal logic KD45 are expressed by the following formulae: (1)
Vx3yR(x,y),
(2)
w , v, z (R(~, v) A R(V, ~) ~ R(~, z)),
(3)
Vx, y, z (/~(x, y) A R(x, z) D R(y, z)).
Now we apply Theorem 2 to the negated theorem. In order to get positive polarities for the existential subfomlulae, we eliminate the equivalence symbol by ~[~ --= ~] iff (~ V '/I) A ( - ~ V --~). For better readability we move the negation sign inside. The result is: (4) (5)
3x(R(o,x) A Vy(R(x,y) D P(y))) V3x(R(o,x) A Vy(R(x,y) D 3z(R(y,z) A Vu(R(z,u) D P(u))))), Vx(R(o,x) D 3y(.R(x,y) A = P ( y ) ) ) VVx(R(o,x) D 3y(R(x,y) A Vz(R(y,z) D 3u(R(z,u) A --P(u)))).
There are the following occurrences in (4) and (5) which name existentially quantified subformulae:
A vv(R(x,v) D P(V))), A vv(n(x,v) z 3z(n(V,z)A vu(R(z, u) z P(~))))), (4)[21212 = 3z(R(y,z) A Vu(R(z,u) D P(u))), (5)1112 = 3v(R(~,y) A ~P(y)), (5)1212 = 3v(R(x,y) A Vz(R(ytz) D 3u(R(z,u) A --,P(u)))), (4)[1 (4)12
= --
(5)12121212 =
A -,P(~)).
All occurrences have polarity 1. They are either of the form 3w(R(o, w) A ~) or 3w(R(v, w) A '~), where v, w are variables. In order to apply Theorem 2 and to move the formula R(o, w) (R(v, w)) outside, we must prove the seriality condition ~ ((1) A (2) A (3) A (4) A (5)) D 3wR(o,w) (~ ((1) A (2) A (3) A (4) A (5)) D 3wR(v, w)). The proof is trivial, since (1) already implies the seriality of R. Thus Theorem 2 is applicable to all occurrences of the existential quantifiers. For example we start with the occurrence 1 of (4). We introduce a new constant a, replace ~x(R(o, x)AVy(R(x, y) D P(y))) with Vy(R(a, y) D P(y))) and add R(o, a) as a conjunct to (4). The procedure can be repeated for the other occurrences. Eventually, we get
274
HANS JURGEN OHLBACH AND CHRISTOPH WEIDENBACH
(6) Vy(R(a,y) D P(y))) VVy(R(b,y) D Vu(R(h(y),u) D P(u))), (7) R(o, a) A R(o, b) A Vy R(y, h(y)), (8) W(R(o, ~) ~ -~P(~(~)))v Vx(l:t(o,x) D Vz(R(f(x),z) D-~P(g(x,z)))), (9) VxR(x,i(x)) A VxR(x,f(x)) A Vx, zR(z,g(x,z)), where (6) is the optimized skolemization of (4), (8) is the optimized skolemization of (5), (7) are the R atoms moved outside (4) and (9) are the R atoms moved outside (5). The obvious advantage of this skolemization technique is the stronger definitions for the Skolem functions (constants). Using standard skolemization these definitions usually occur in disjunctions with other literals from the theorem. This makes a proof of the theorem more complicated. The theorem prover OTTER (version 3.0, auto mode) proves the theorem with optimized skolemization in less than one minute; i.e. it refutes the formulae (1), (2), (3), (6), (7), (8), (9). Although we tried various parameter settings, OTTER did not find a proof of the theorem in the version with standard skolemization, i.e. OTTER fails to refute the formulae (1), (2), (3), (4), (5) using its standard skolemization. However, it should be noted that the special translation techniques developed for modal logic (e.g., see the work of Nonnengart [4] or the work of Ohlbach [5]) are still more powerful than our optimized skolemization, because they also eliminate the formulae coming from the specific modal logics (in our case the formulae (1), (2), (3)). Translating the above example using Nonnengart's approach we get
(1')
Vx, y, z R(x, y: z),
(4')
Vx(R(o:a,x) ~ P(x)) VVy(R(o:b,y) D Yz(R(y:h(y),z) D P(z))),
(5') vx(R(o,x) ~p(x:~(x)))v Vy(R(o,v) D Vz(R(v: f(y),z) D P(z:g(v,z)))), where ":" is a new two-place function symbol written in infix notation. The formula (1') is the translation of (1), (2), (3), (4') is the translation of (4) and (5') is the translation of (5). The formulae (1'), (4'), (5') are refuted by OTTER in less than one second.
4. Summary We have presented an optimized skolemization of existential quantifiers which moves information about the Skolem function from the local context of the occurrence of the existential quantifier to the top-level of the formula. Instances
A NOTE ON ASSUMPTIONSABOUTSKOLEMFUNCTIONS
275
of this optimized skolemization have been used implicitly or explicitly in special applications. We have defined it now in such a way that it can be used as a general method for arbitrary formulae. However, the proof of the seriality condition may be as complex as the proof of the input formula, in general. Therefore optimized skolemization requires a more sophisticated implementation concept than standard skolemization. Nevertheless there are many examples where the seriality condition can be easily proved (e.g., see the examples above, other problems of the Pelletier collection) and then optimized skolemization avoids duplication of literals, yields shorter clauses, shorter and less complex proofs and a smaller search space. In some cases optimized skolemization makes a proof possible where proof procedures using standard skolemization fail.
References 1. Chellas, B. E: Modal Logic, Cambridge Univ. Press, Cambridge, 1980. 2. Loveland, D.: Automated Theorem Proving: A Logical Basis, Vol. 6 of Fundamental Studies in Computer Science, North-Holland, Amsterdam, 1978. 3. McCune, W.: Otter 2.0, in lOth Int. Conf. on Automated Deductiom CADE-IO, Vol. 449 of LNCS, Springer, 1990, pp. 663-664. 4. Nonnengart, A.: First-order modal logic theorem proving and functional simulation, in Proc. of 13th Int. Joint Conf. on Artificial Intelligence, IJCAI-93, Morgan Kaufmann, 1993, pp. 80-85. 5. Ohlbach, Hans Jtirgen: Semantics-based translation methods for modal logics, J. Logic and Computation 1(5) (1991), 691-746. 6. Pelletier, E J.: Seventy-five problems for testing automatic theorem provers, J. Automated Reasoning 2(2) (1986), 191-216, Errata: J. Automated Reasoning 4(2) (1988), 235-236. 7. Scherl, R. B.: A constrained logic approach to automated modal deduction, Internal Report UIUCDCS-R-93-1803, University of Illinois at Urbana-Champaign, May 1993. 8. Schmidt-Schaul3,M.: Computational Aspects of an Order Sorted Logic with Term Declarations, Vol. 395 of LNAL Springer, 1989. 9. Walther, C.: A Many-sorted Calculus based on Resolution and Paramodulation, Research Notes in Artificial Intelligence, Pitman Ltd., 1987. 10. Weidenbach, C.: A sorted logic using dynamic sorts, MPI-Report MPI-I-91-218, Max-PlanckInstitut fur Informatik, Saarbriacken, December 1991. 11. Weidenbach, C.: Extending the resolution method with sorts, in Proc. of 13th Int. Joint Conf. on Artificial Intelligence, IJCAI-93, Morgan Kaufmalm, 1993, pp. 60-65.