J Intell Inf Syst DOI 10.1007/s10844-017-0447-6
A multi-strategy approach to structural analogy making Fabio Leuzzi1 · Stefano Ferilli1,2
Received: 23 June 2016 / Revised: 1 November 2016 / Accepted: 13 January 2017 © Springer Science+Business Media New York 2017
Abstract Analogy is the cognitive process of matching the characterizing features of two different items. This may enable reuse of knowledge across domains, which can help to solve problems. Indeed, abstracting the ‘role’ of the features away from their specific embodiment in the single items is fundamental to recognize the possibility of an analogical mapping between them. The analogical reasoning process consists of five steps: retrieval, mapping, evaluation, abstraction and re-representation. This paper proposes two forms of an operator that includes all these elements, providing more power and flexibility than existing systems. In particular, the Roles Mapper leverages the presence of identical descriptors in the two domains, while the Roles Argumentation-based Mapper removes this limitation as well. For generality and compliance with other reasoning operators in a multistrategy inference setting, they exploit a simple formalism based on First-Order Logic and do not require any background knowledge or meta-knowledge. Applied to the most critical classical examples in the literature, they proved able to find insightful analogies.

Keywords Analogy · Argumentation · Multi-strategy · First-order logic
1 Introduction and related work

Analogy is the cognitive process of matching the characterizing features of two items (subjects, objects, situations, etc.). It allows one to reuse knowledge from a known item or
Fabio Leuzzi
[email protected]

Stefano Ferilli
[email protected]

1 Dipartimento di Informatica, Università di Bari, Bari, Italy
2 Centro Interdipartimentale per la Logica e sue Applicazioni, Università di Bari, Bari, Italy
domain (called the base) to an unknown one (called the target), without having to learn from scratch. Analogical reasoning is essential for producing new conclusions that are helpful to solve a problem (Gentner 1998). Almost all high-level and advanced mental activities in humans strongly rely on analogy, including art and scientific research: (1) in the study of learning, analogies are important in the transfer of knowledge and inferences across different concepts, situations, or domains; (2) analogies are often used in problem solving and reasoning; (3) analogies can serve as mental models to understand new domains; (4) analogy is important in creativity (e.g., it was a frequent mode of thought for such great scientists as Faraday, Maxwell, and Kepler); (5) analogy is used in communication and persuasion; (6) analogy (and similarity) underlies many other cognitive processes. In some sense, similarity is a (simpler) kind of analogy. While similarity maps exactly the same features in two items, an analogy may map a feature in one item onto a completely different feature in the other, provided that they both play in some sense ‘the same role’. So, analogy is much related to abstraction, while similarity is more related to generalization. Indeed, abstracting the ‘role’ of the features away from their specific embodiment in the single items is fundamental to recognize possible analogies between them. There is also some connection between analogy and metaphors, in that the latter are one-way mappings: the features of one item can be mapped onto those of the other, but the reverse mapping would not make sense. It is clear that analogy has a tight relationship to semantics, because the mappings are based on the role and meaning of the features in the two items, which goes beyond simple syntactic association (as for similarity). 
In fact, after finding the analogy on some (fundamental) features, the association can be extended to further features, for which experience has not yet discovered the association of roles. Analogical reasoning involves an inductive step, which hypothesizes the presence of an analogy between two contexts, and a deductive step, which performs truth-preserving reasoning based on the inductively inferred knowledge (Michalski 1993). However, induction, defined as the process of inferring general knowledge from specific observations, does not fit reasoning by analogy. While induction is based on generalization, we would rather describe analogy as an intuition that suggests possible correspondences between contexts. So, analogy would be better described as an intuition, since it is not truth-preserving, nor does it perform generalizations. It generates mappings that are specific to the involved domains. Two aspects support this claim. First, a qualitative evaluation is involved, since a good analogy satisfies a need (while generalizations are usually evaluated by their accuracy). Second, induction leverages common features in different items, while analogy may map different concepts depending on the given goal or perspective. When the context, goal or perspective changes, the same analogy loses its original meaning and usefulness. Furthermore, the outcome of reasoning by analogy might be inconsistent with previous knowledge. “If the inferences mandated by an analogy contradict a fundamental belief, especially one that has accrued many consequent implications, then resolving this contradiction might well involve the shock and amazement of transformational creativity” (O’Donoghue and Keane 2012). However, a set of analogical mappings can be used to identify a recurring meta-pattern (i.e. a common network of roles). When accomplishing a task, this makes it possible to use the knowledge present in a more familiar domain to enrich the knowledge of the target domain.
The recurring meta-pattern can be seen as an abstract description of the schema shared by the domains under consideration (see Subsection 3.5). Unfortunately, the identification of an abstract theory describing both domains is quite complex, since abstractions based on syntactic transformations only might generate an inconsistent theory, even if the ground one is consistent (Giordana et al. 1991). Thus, the matching mechanism of analogy alone does not seem sufficient to obtain abstraction. Cooperation strategies between different reasoning operators
(like analogy and abstraction) are needed, in order to obtain a more complex result and to allow the reasoner to take advantage of possible common sense. Analogical reasoning can be seen as a 5-step process (Gentner 1998). (1) Retrieval finds the best base domain that may help to solve the task in the target domain. (2) Mapping searches for a mapping between the base and target domains. (3) Evaluation provides criteria to evaluate candidate mappings.1 (4) Abstraction shifts the representation of both domains to their roles’ schema, converging to the same analogical pattern. (5) Re-representation adapts one or more pieces of the representation to improve the match. Several works about analogy applied numerical or probabilistic approaches, or otherwise used non-relational representations: for instance, Latent Relational Analysis (LRA) (Turney 2005), Turney’s work on pairs of words (Turney 2008), Structured Tensor Analogical Reasoning (STAR) versions 1 and 2 (Halford et al. 1994; Wilson et al. 2001), Bayesian Analogy with Relational Transformations (BART) (Lu et al. 2012), and Copycat (Hofstadter and Mitchell 1994). Although several critical issues could be discussed, this would be out of scope here, because we want to focus on symbolic relational representations, in order to gain a better understanding of the phenomenon that will hopefully allow us to exploit the large knowledge bases that are now available (e.g., the Google Knowledge Graph).2 In fact, many works on analogy operators adopt formal (Propositional or First-Order) Logic as the most suitable representation for describing high-level cognitive and reasoning processes. The main contributions include the Structure Mapping Engine (SME) (Gentner 1983; Falkenhainer et al. 1989; Gentner and Markman 1997) and its derivatives (de los Angeles and Forbus 2012; Baydin et al.
2012), the Analogical Constraint Mapping Engine (ACME) (Holyoak and Thagard 1989), Learning and Inference with Schemas and Analogies (LISA) (Holyoak and Hummel 2001), Discovery Of Relations by Analogy (DORA) (Doumas et al. 2008), and Heuristic-driven Theory Projection (HDTP) (Schwering et al. 2009). Most of them leverage the use of identical descriptors to assume that roles are analogous. This is often insufficient to deal with complex cases and, anyway, is more related to similarity than to analogy. Indeed, the canonical definition of analogy simply requires a correlation between roles. Recent approaches can map objects and relations having different names (i.e., representing different concepts in the respective domains), but use a knowledge representation formalism that is too complex for automatically modifying the knowledge base (e.g., SME and HDTP formalize types of objects, or functions, in dedicated sections of knowledge). Interestingly, HDTP relaxes the assumption that predicates and terms having the same name play the same role in different domains. Given the relevance of analogy, the objective of this work is to define novel strategies for automated analogical reasoning, overcoming some problems that are present in what is currently provided by the literature. This effort resulted in the definition of two analogy operators, where an analogy operator is a procedure that can be applied to two descriptions and returns possible analogies between them. Both are based on a syntactic logical approach. The former, named Roles Mapper, works in a more traditional setting, and aims
1 E.g. (Gentner 1998): structural soundness holds if the alignment and the projected inference are structurally consistent; factual validity is needed to check if the projected inference preserves truth (i.e. it does not violate any constraint), which is not ensured because analogy is not a deductive mechanism (but just a hypothesis) —note that factual correctness is not guaranteed by structural consistency; relevance for current goal holds if and only if the produced inference moves the knowledge towards the goal, making structural consistency and factual validity just preconditions. 2 https://www.google.com/intl/es419/insidesearch/features/search/knowledge.html
at solving the most critical and classical examples of analogies available in the literature, but requiring less background knowledge. The latter, named Roles Argumentation-based Mapper, is an evolution of the former that adds more flexibility by using an argumentation reasoning operator to relax the assumption that predicates and terms having the same name play the same role in different domains, as in HDTP. The discovered analogy can be provided to humans or to automatic decision support systems as a further aid to their problem solving processes, or can be exploited in a multistrategy learning setting to enrich the observations of the world that they take as input. In particular, analogy might be applied to conceptual graphs instead of specific situations, this way allowing an ontological mapping. The remainder of this paper is organized as follows. Section 2 gives some preliminaries, and Section 3 presents the proposed mapping procedures in detail. The approach is evaluated in Section 4, and the paper is concluded in Section 5.
2 Preliminaries

In our approach, each piece of knowledge is formally represented as a Horn clause (Lloyd 1987), i.e. a disjunction of literals involving at most one positive literal, where a literal is a (possibly negated) atom. An atom is a predicate applied to its arguments, which are terms (in our case, only constants). A predicate p requiring n arguments is denoted as p/n. Implicitly assuming the inclusive disjunction operator, a clause can also be seen as a set of literals {l0, ¬l1, . . . , ¬ln}. The ProLog representation of a clause is:

l0 :- l1, . . . , ln

where, in the usual interpretation, l0 is the head (i.e., the conclusion of an implication) and l1, . . . , ln is the body (i.e., the conjunction of the premises of the implication). We may extract the predicate on which an atom l = p(t1, . . . , tn) is built using the function predicate(l) = p/n, and the terms of l using the function terms(l) = {t1, . . . , tn}. These functions may be straightforwardly extended to literals, while for a clause C we define predicates(C) = ∪l∈C {predicate(l)} and terms(C) = ∪l∈C terms(l). For our purposes, clauses are not interpreted as implications: they just provide a suitable formalism for expressing the data. The li’s express properties of, or relationships among, (the objects denoted by) their arguments. Analogical mappings are to be found in the body; the predicate in the head labels the situation that is being described in the body. The heads may be used to provide a preferred focus, i.e. a specific perspective for which the analogical mapping is sought. Although optional, this feature can be very useful to reduce the search space and direct the operations toward a particular goal of interest. The ‘preferred perspective’ is enabled when the arity of the predicates in the heads is the same, in which case the system is bound to establish an analogy between corresponding arguments. Using 0-ary predicates in the heads disables this feature.
In other words, given a predicate p/n, if n ≥ 2 then p is a relation, if n = 1 then p is an attribute, otherwise n = 0 and p is just a label providing no information about the structure of the description. So, if any term is specified, it is used to start the mappings from the terms in the heads. In the following, we will refer to each ProLog clause that is input to an analogy as a description. Furthermore, mapping and association will be used as synonyms to indicate a matching between portions of descriptions. Let us show the formalism by introducing a running example that will be used throughout the paper to illustrate the various steps of the procedure. To be able to show interesting manipulations for every step of the procedure, we adapted the descriptions, even at the cost of introducing a few oddities.
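The clause formalism and the predicate(l), terms(l), predicates(C) and terms(C) functions above can be sketched in Python. This is a hypothetical illustration of the definitions, not the authors' implementation; all names are ours:

```python
from typing import NamedTuple

class Atom(NamedTuple):
    predicate: str   # predicate symbol, e.g. "wants"
    args: tuple      # constant terms, e.g. ("fox", "grape")

class Clause(NamedTuple):
    head: Atom       # labels the described situation
    body: tuple      # conjunction of body atoms

def predicate(l: Atom):
    """predicate(l) = p/n, encoded as a (symbol, arity) pair."""
    return (l.predicate, len(l.args))

def terms(l: Atom):
    """terms(l) = the set of constants appearing in l."""
    return set(l.args)

def clause_predicates(c: Clause):
    """predicates(C) = union of predicate(l) over all literals l in C."""
    return {predicate(l) for l in (c.head, *c.body)}

def clause_terms(c: Clause):
    """terms(C) = union of terms(l) over all literals l in C."""
    out = set()
    for l in (c.head, *c.body):
        out |= terms(l)
    return out
```

For instance, a fragment of the fairy-tale description would be `Clause(Atom("fairy_tale", ("fox", "grape")), (Atom("wants", ("fox", "grape")), Atom("is", ("fox", "sour"))))`, from which `clause_predicates` returns {fairy_tale/2, wants/2, is/2} and `clause_terms` returns {fox, grape, sour}.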
Example 1 (Fairy tale and life context – 1) The fairy tale The fox and the grapes3 goes: “This fox has a longing for grapes: he jumps, but the bunch still escapes. So he goes away sour; and, ’tis said, to this hour declares that he’s no taste for grapes.” It would be formalized as:

fairy_tale(fox, grape) :-
  wants(fox, grape), cannot_reach(fox, grape, fox_does_not_reach_grape),
  is(fox, sour), has(fox, bad_opinion),
  cause(fox_does_not_reach_grape, bad_opinion),
  says(fox, grape_is_no_taste), says(fox, it_is_smart),
  is(grape, not_ripe, grape_is_no_taste).
It means that people tend to belittle things that they would like to obtain but that they cannot, e.g., as in a situation where “John loves Carla but cannot have her, so he spreads a bad opinion about her”:

situation(john, carla) :-
  loves(john, carla), cannot_have(john, carla, john_cannot_have_carla),
  says(john, carla_is_bad), says(john, he_is_a_charming_man),
  is(carla, bad, carla_is_bad), feels(john, sad).
The heads in these descriptions suggest that we want to establish an analogy between the fairy tale and the given situation, in which John plays the role of the fox, and Carla plays the role of the grape. Note that the atoms says(fox, it_is_smart), says(john, he_is_a_charming_man) and feels(john, sad) have been added to emphasize the peculiarities of our approach. In fact, the first two, in particular, are clearly useless for the analogy. An analogy can be cast as a mapping of predicates and terms that play the same role in two descriptions. So, before providing a formal definition of analogy, we must formally define what a role is and what role (the object represented by) a term plays in a description. This allows switching from the specific objects/terms that appear in a description to more abstract representatives thereof (the roles).

Definition 1 (Roles) A role is a predicate p/n or its r-th argument position, denoted p/n.r; from this we define:

– the set of roles of a literal l such that predicate(l) = p/n as R(l) = {p/n} ∪ {p/n.i}i=1,...,n;
– the set of roles of a clause C as R(C) = ∪l∈C R(l).

Concerning a specific term t, we have:

– given a literal l such that predicate(l) = p/n, the literal roles of a term t ∈ terms(l) are defined as R(t, l) = {p/n.i | t is the i-th argument of l};
– given a clause C, the clause roles of a term t ∈ terms(C) are defined as R(t, C) = ∪l∈C R(t, l);
– a role of a term t in C is any element of R(t, C).
So, any predicate p/n defines n + 1 roles: p/n and p/n.i (i = 1, . . . , n). Informally, a role represents a kind of interaction. These interactions are associated with the properties

3 Walter Crane’s version, in Baby’s Own Aesop, 1887 (https://en.wikipedia.org/wiki/The_Fox_and_the_Grapes).
and relationships expressed by the predicates in a description. The set of roles of a term expresses the allowed interactions of that term.

Definition 2 (Analogy) Given a pair of descriptions D′, D′′ and the associated sets of roles R(D′) and R(D′′), an analogy is a one-to-one mapping f : R′ → R′′ for R′ ⊆ R(D′) and R′′ ⊆ R(D′′). If such an f exists, D′ and D′′ are analogous.

In other words, an analogy is a mapping of roles across two domains. When one has a task to accomplish, one must focus on the objects and relationships that are necessary and sufficient to carry out that task. Sometimes some of these objects or relationships are missing. In such cases, if one can establish an analogy that aligns the current task with a past experience, one may try to find the missing elements in the past experience and transpose them to the current task, in order to solve the problem. Task is here used as an informal term, intended in its everyday meaning, to denote the motivation by which an analogy is sought. If the user is missing some knowledge (in the form of objects and/or relationships) that is necessary to carry out the current task, then he may try to infer it by finding an analogy between his (partial) knowledge about the current task and the (more complete) knowledge he has about another task. If he succeeds in mapping his partial knowledge of the current task onto part of the knowledge he has about the other task, then he may try to transpose the remaining knowledge about the other task to the current one, in order to extend it and hopefully be able to solve his current task. In this context we define:

Definition 3 (Analogical perspective) Given a pair of descriptions DB, DT, representing respectively the base and target domains, and an analogy f : KB ⊆ R(DB) → KT ⊆ R(DT) between DB and DT, let ST and SB denote the knowledge that solves a given task in DT and the analogous task in DB, respectively. f satisfies an analogical perspective P = ⟨(DB, SB), (DT, ST)⟩ if

SB ⊆ (KB ∪ t_f^{-1}(IT)) ∧ ST ⊆ (KT ∪ t_f(IB))

where: KB and KT denote the knowledge aligned by f; IB ⊆ R(DB) \ KB and IT ⊆ R(DT) \ KT are the non-aligned but transposable knowledge from the base and target domain, respectively; and the function t_f : R(DB) → R(DT) transposes knowledge from DB to DT based on f.

So, the two domains may cross-fertilize, providing each other pieces of knowledge that allow the described situation to be better understood and help to accomplish tasks (e.g., making comparisons, solving problems, etc.) in them. To build an analogy that satisfies an analogical perspective, an analysis of the relationships in which the objects in the description are involved is fundamental.
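The role sets of Definition 1, on which the above definitions of analogy rest, can be sketched as follows (a hypothetical Python illustration under our own encoding: a literal is a (predicate symbol, argument tuple) pair, and a role p/n.i is encoded as the triple (p, n, i)):

```python
def roles_of_literal(lit):
    """R(l) = {p/n} ∪ {p/n.i | i = 1..n} (Definition 1)."""
    p, args = lit
    n = len(args)
    return {(p, n)} | {(p, n, i) for i in range(1, n + 1)}

def roles_of_term(t, lit):
    """R(t, l): the positional roles p/n.i at which t occurs in l."""
    p, args = lit
    n = len(args)
    return {(p, n, i + 1) for i, a in enumerate(args) if a == t}

def clause_roles_of_term(t, body):
    """R(t, C) = union of R(t, l) over the literals of C's body."""
    out = set()
    for lit in body:
        out |= roles_of_term(t, lit)
    return out
```

For wants(fox, grape), for example, R(l) = {wants/2, wants/2.1, wants/2.2}, and R(fox, l) = {wants/2.1}: the predicate wants/2 defines 2 + 1 roles, and fox fills the first argument position.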
3 The analogical reasoner

This paper deals with all five steps of analogy, inspired by the typical processing carried out by the brain in working memory. It is not concerned with the semantics of the goal for which the analogy is sought, nor with possible ways to formalize it. The goal may even be unknown. So, our operators must be general, and suggest several possible analogies between the given descriptions. As usual in the literature, we assume that the descriptions are encoded using
the same abstraction level, since the application of abstraction operators is beyond the scope of this work. That is, since the analogy maps roles one-to-one, we exclude the cases in which n roles in one description express a concept that is analogous to a concept expressed by m roles in the other description, with n ≥ 1, m ≥ 1 and n ≠ m. Rather than on roles directly, we focus on the predicates and terms that play those roles. This raises a question about how the same predicates and terms, when found in different descriptions, have to be handled. If the source of the descriptions is the same, it is plausible that it used the same descriptors for expressing the same concepts, i.e. they carry the same meaning. In such a case, it may be safely assumed that they represent potential points of contact between the descriptions, which may be leveraged to build part of the analogical mapping. In other cases this assumption does not hold, and would be misleading if adopted. We will provide solutions for both cases. So, in our setting, an analogy must map the predicates and terms that play the same role in two clauses C′ and C′′. More formally, an analogy is a pair ⟨θP, θT⟩, i.e. it consists of two mappings:

– a predicate mapping θP ⊆ predicates(C′) × predicates(C′′), that maps analogous predicates between C′ and C′′, and
– a term mapping θT ⊆ terms(C′) × terms(C′′), that maps analogous terms between C′ and C′′.
Each element of θP or θT is called a predicate association or a term association, respectively. A term (resp., predicate) mapping is consistent if it is a one-to-one mapping; it is inconsistent otherwise. Two term (resp., predicate) mappings θ′ and θ′′ are compatible if θ′ ∪ θ′′ is consistent, incompatible otherwise. Two analogies are compatible if both their term and their predicate mappings are compatible, incompatible otherwise. Now let us see when and how an analogy between two atoms may take place:

Definition 4 (Atomic analogy) Given two atoms having the same arity n > 0, l′ = p′(t′1, . . . , t′n) and l′′ = p′′(t′′1, . . . , t′′n), we define their atomic analogy as the pair a(l′, l′′) = ⟨aP(l′, l′′), aT(l′, l′′)⟩ where:

– aP(l′, l′′) = {(p′/n, p′′/n)} (called the predicate analogy between l′ and l′′);
– aT(l′, l′′) = {(t′1, t′′1), . . . , (t′n, t′′n)} (called the term analogy between l′ and l′′);

if aT(l′, l′′) is a consistent term mapping. In all other cases it is undefined. We also need to define when a predicate or term mapping can be added to an existing partial analogy (i.e., an analogy still under construction).

Definition 5 (Candidate term association) Given two clauses C′ and C′′, a consistent term mapping θT ⊆ terms(C′) × terms(C′′), and two sets of literals L′ ⊆ C′ and L′′ ⊆ C′′; a pair of terms (t′, t′′) is a candidate term association for L′, L′′ w.r.t. θT iff θT ∪ {(t′, t′′)} is consistent, and

– ∀(l′, l′′) ∈ L′ × L′′ : R(t′, l′) = R(t′′, l′′) ≠ ∅; or
– t′ = t′′ (= t) ∧ ∀(l′, l′′) ∈ L′ × L′′ s.t. t ∈ terms(l′), t ∈ terms(l′′) : R(t, l′) = R(t, l′′).

In other words, either terms t′ and t′′ are different and always appear at the same positions in all the literals in L′ and L′′ (where L′ and L′′ are chosen using a procedure reported in the following sections), or they are the same term, in which case it suffices that it appears at the same positions in all the literals in which it appears. The set of candidate term
associations for L′, L′′ w.r.t. θT is denoted cta(L′, L′′, θT). ncta(L′, L′′, θT) = {(t′, t′′) ∈ cta(L′, L′′, θT) | (t′, t′′) ∉ θT} are the new candidate term associations.

Definition 6 (Candidate predicate association) Given two clauses C′ and C′′, a consistent predicate mapping θP ⊆ predicates(C′) × predicates(C′′), a consistent term mapping θT ⊆ terms(C′) × terms(C′′), and two sets of literals L′ ⊆ C′ and L′′ ⊆ C′′ where ∀l′ ∈ L′ : predicate(l′) = p′/n, and ∀l′′ ∈ L′′ : predicate(l′′) = p′′/n; then the (singleton) set of pairs {(p′/n, p′′/n)} is the candidate predicate association for L′, L′′ w.r.t. θP, denoted cpa(L′, L′′, θP), if θP ∪ {(p′/n, p′′/n)} is consistent, and

– p′ = p′′; or
– an atomic analogy aT(l′, l′′) is defined s.t. aT(l′, l′′) ⊆ θT.

Otherwise cpa(L′, L′′, θP) = ∅. That is, all literals in each of the two sets L′ and L′′ must be built on the same predicate (implying that they have the same arity), or all the arguments of the literals in which they appear must have been already mapped. This definition complies with the fact that reasoning by analogy cannot be reduced to just looking for equal predicates across domains, because this would be just a generalization and the derived inference could be useless or trivial. Even worse, the same predicate, used in different domains, might carry a different meaning, which would be misleading for the analogy operator.4

Definition 7 (Deterministic association) Given two clauses C′ and C′′ and a cta(L′, L′′, θT) for which ∀(l′, l′′) ∈ L′ × L′′ : R(t′, l′) = R(t′′, l′′) ≠ ∅ holds, we recall that R(t′, l′) = {p′/n.i} and R(t′′, l′′) = {p′′/n.i} by Definition 1. Then ∀l′ ∈ L′ and ∀l′′ ∈ L′′, l′ and l′′ have arity n, and t′ and t′′ appear in the i-th position. We say that (t′, t′′) is a deterministic association.

Example 2 (Candidate predicate and term associations) Suppose θT = {(t′, t′′)}:

– L′1 = {f(t′, h, r), g(t′, h, w)}, L′′1 = {o(t′′, q, y), u(t′′, q, e)}: (h, q) is a candidate term association because it is deterministic (since h and q always appear at the same positions in the literals in L′1 and L′′1); the other terms are not deterministic.
– L′2 = {f(t′, h, r)}, L′′2 = {o(t′′, q, y)}: (h, q) and (r, y) are candidate term associations because they are deterministic; adding them to θT, all terms in L′2 and L′′2 become mapped, and hence {(f/3, o/3)} becomes a candidate predicate association and can be added to θP.
– L′3 = {f(t′, w, r), g(t′, h, v)}, L′′3 = {o(t′′, q, r)}: there are no deterministic associations, but r appears at the same position in a literal from each set, and hence (r, r) is a candidate term association.
– L′4 = {f(t′, w, r), g(t′, h, v)}, L′′4 = {o(t′′, q, y)}: no candidate term associations are available, thus no mapping can take place.
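The deterministic condition of Definition 5 can be sketched as follows (a hypothetical Python check, not the authors' implementation; following Example 2, role equality across the two domains is taken over argument positions, i.e. arity and index, since the predicate symbols themselves generally differ):

```python
def positions(t, lit):
    """Positions (arity, index) at which term t occurs in literal lit = (p, args)."""
    p, args = lit
    return {(len(args), i + 1) for i, a in enumerate(args) if a == t}

def is_deterministic(t1, t2, L1, L2):
    """First condition of Definition 5: t1 and t2 occupy the same, non-empty
    set of positions in every pair of literals drawn from L1 x L2."""
    return all(
        positions(t1, l1) == positions(t2, l2) != set()
        for l1 in L1 for l2 in L2
    )
```

On the first case of Example 2, `is_deterministic("h", "q", L1, L1'')` holds because h and q both sit in the second position of every literal, while `is_deterministic("r", "y", L1, L1'')` fails because r does not occur in g at all.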
3.1 Retrieving previous useful knowledge

Given a reference description (target) T, it must be compared to the descriptions available in the knowledge base B to retrieve potentially analogous past experiences (candidate
4 An example of this situation is the classical Solar system/Rutherford atom problem, which will be shown in the following.
bases). While the search may be guided by the shared predicates among the descriptions, considering only occurrences of the same predicates is not enough: the whole network of relationships in the descriptions must be considered, in order to identify common substructures that might suggest a relevant association. This can be done using a structural similarity evaluation, which should increase the possibility of retrieving surprising and creative source domains (O’Donoghue and Keane 2012). In O’Donoghue and Keane (2012), the experiences are projected into an n-dimensional vector space, where each dimension represents a particular topological quality of the domain, but in this way the relational information is lost. To overcome this limitation, we adopt the structural similarity measure proposed in Ferilli et al. (2009). It is based on the following formula:

sf(i′, i′′) = sf(n, l, m) = (l + 1)/2 · (1/(l + n + 2) + 1/(l + m + 2))

that evaluates the similarity between i′ and i′′ based on:

– n: the amount of information carried by i′ but not by i′′;
– l: the amount of common information between i′ and i′′;
– m: the amount of information carried by i′′ but not by i′.
For two terms t′, t′′ (in our setting, constants that represent entities), sf(t′, t′′) is evaluated based on their properties (expressed by unary predicates) and roles. For two atoms a′, a′′ built on n-ary predicates, sf(a′, a′′) is evaluated based on the multiset of predicates corresponding to atoms directly linked to them in the clause body, and on the average similarity of their arguments. For two sequences of atoms S′, S′′, sf(S′, S′′) is based on the length of their compatible initial subsequence and on the average similarity of the atoms appearing in such a subsequence. Finally, the similarity fs(C′, C′′) of two clauses C′, C′′ is computed according to their least general generalization, considering how many literals and terms they have in common and their corresponding lower-level similarities. fs(T, B) tends to return high scores for short similar descriptions T and B. In these cases, if T is a target and B ∈ B is a candidate base, little knowledge can actually be used for inference in an analogy. To avoid obvious or useless analogies in these cases, we smooth fs(T, B) by a multiplicative factor that rewards longer descriptions:

mf(T, B) = avg({|T|, |B|}) / L = (|T| + |B|) / (2 · L)

where | · | is the cardinality (i.e., the number of literals) of a clause, and L = max{|C| : C ∈ B} normalizes mf(·, ·) into [0, 1]. This trade-off between similar structure and large cardinality should help to find candidate bases that are both similar enough to the target to satisfy the mapping phase, and rich in further elements from which to infer novel, potentially useful knowledge.
3.2 ROles mapper

In a nutshell, our analogical engine, presented in Leuzzi and Ferilli (2013), initializes the analogy mapping and then progressively expands it, guided by linkedness (i.e., term sharing) among literals. New term or predicate associations are added according to Definitions 5 and 6, which ensure the overall consistency of the mapping. The core of the strategy is the ROle Mapper (ROM) described in Algorithm 1. It involves three levels of nested iterations: the loop at line 6 encloses the whole life cycle of the analogical mapping; the loops at lines 9 and 18 are aimed at extending the analogy with mappings between predicates and between terms; finally, the loop at line 7 checks whether any of the loops at lines 9 and/or 18
produced novel mappings or not. In the following, the various passages will be explained and commented on, with the help of the running example.
3.2.1 Initializing an analogy

When looking for an analogy between two descriptions, the starting points of the comparison, i.e. the first items to try to put in correspondence, must be determined. Formally, we have:

Definition 8 (Starting point) Given two clauses C′ and C′′, a starting point is a partial analogy on C′, C′′.

Initially, the global predicate mapping is empty (θP = ∅). Then, if an explicit goal is expressed by the heads, the mapping of the heads’ arguments is taken as a starting point,
using the interpretation of the heads as the preferred perspective according to which the analogies between descriptions are to be found. Suppose that the two clauses C′ and C′′ input to Algorithm 1 have heads l′0 and l′′0, respectively. If a(l′0, l′′0) is defined, the global term mapping is initialized as θT = aT(l′0, l′′0). Otherwise, an empty global term mapping θT = ∅ is initially set, and candidate starting points must be identified by evaluating the shared knowledge between the two descriptions, which provides potential points of contact between them.

Example 3 (Fairy tale and life context – 2) Since the heads have the same arity, their terms are mapped, yielding the starting point

θT = {(fox, john), (grape, carla)}        θP = ∅
3.2.2 Extending an analogy Following the Algorithm 1, the loop 9 locates and maps shared predicates in (L , L ), in which all literals l ∈ L ⊆ C and l ∈ L ⊆ C are built on the same predicate p/n. Since no argument in L and L has been mapped yet, deterministic mappings are searched for. At the following runs, the loop at line 9 goes on with mappings collecting the pairs of subsets of C and C such that, for each pair (L , L ), all literals l ∈ L ⊆ C and l ∈ L ⊆ C are built on the same predicate p/n, and their arguments have already been partially (and consistently) mapped by θT . In such a case, if {(p/n, p/n)} is compatible with θP and the new candidate term associations for L and L are compatible with θT , then they are integrated in the previous partial analogy. Example 4 (Fairy tale and life context – 3) Looking for a deterministic alignment between literals built on the same predicate, ROM finds the sets of literals {says(fox, grape is no taste), says(fox, it is smart)} and {says(john, carla is bad), says(john, he is a charming man)}. The partial analogy is updated as follows: θP ← θP ∪ {(says/2,says/2)}
θT ← θT ∪ ∅
Nothing can be done to map terms, because there is no deterministic alignment. In the next step, i.e. considering the next L′ and L′′, a consistent atomic analogy is found between {is(grape, not ripe, grape is no taste)} and {is(carla, bad, carla is bad)}, which is also compatible with the current partial analogy. So, the update is: θP ← θP ∪ {(is/3,is/3)}
θT ← θT ∪ {(not ripe,bad), (grape is no taste,carla is bad)}
Now a consistent atomic analogy is found between the previous literals {says(fox, grape is no taste)} and {says(john, carla is bad)}, since all their components have been aligned. Once this atomic analogy is established, also the analogy between the remaining atoms {says(fox, it is smart)} and {says(john, he is a charming man)} becomes deterministic, from which: θT ← θT ∪ {(it is smart,he is a charming man)}
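The deterministic-extension step just traced can be sketched as follows. This is a simplified, hypothetical rendering of a single atomic-analogy check between two same-predicate literals against the current θT; it ignores the set-based bookkeeping of Algorithm 1.

```python
def extend_deterministic(lit1, lit2, theta_T):
    """Try an atomic analogy between two literals with equal
    predicate and arity. Return the new term pairs it would add if
    every already-mapped argument position agrees with theta_T,
    otherwise None (inconsistent, hence rejected)."""
    (p1, args1), (p2, args2) = lit1, lit2
    if p1 != p2 or len(args1) != len(args2):
        return None
    mapped = dict(theta_T)
    new_pairs = set()
    for a, b in zip(args1, args2):
        if a in mapped:
            if mapped[a] != b:
                return None        # a already mapped to a different term
        elif b in mapped.values():
            return None            # b already the image of another term
        else:
            new_pairs.add((a, b))
    return new_pairs

theta_T = {("grape", "carla"), ("fox", "john")}
pairs = extend_deterministic(
    ("is", ["grape", "not_ripe", "grape_is_no_taste"]),
    ("is", ["carla", "bad", "carla_is_bad"]),
    theta_T)
# pairs == {("not_ripe", "bad"), ("grape_is_no_taste", "carla_is_bad")}
```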
In ROM, the loop at line 18 (Algorithm 1) expands the current partial analogy using literals built on different predicates but in which entities playing an analogous role can be found, according to Algorithm 2. First (lines 1 to 5), the literals whose term analogy has already been entirely found in the current partial analogy are selected, and their predicate mapping, if deterministic, is added to the current partial analogy (note that, in this case, both L′ and L′′ must be singletons). Then (lines 6 to 10), the literals having only some mapped terms in the same positions are considered, and all of their not yet mapped arguments that are candidate term associations (i.e., that have deterministic bindings, or are equal and always appear in the same positions) are added to the current partial analogy. Both options use the reliability score of the pairs of sets of literals to decide which extensions are to be tried first. Indeed, each performed extension establishes associations that may prevent subsequent ones, and hence it is advisable to try the most reliable first. The search for predicate and term associations continues until no new mapping is found.

Example 5 (Fairy tale and life context – 4) At a certain point of the analogy-making procedure we have {(grape,carla), (fox,john)} ⊂ θT, which means that the arguments of the two sets of literals {wants(fox,grape)} and {loves(john, carla)} are completely and consistently mapped. This yields a candidate predicate association, which allows us to expand the partial analogy built so far: θP ← θP ∪ {(wants/2,loves/2)}
θT ← θT ∪ ∅
A partial consistent alignment is also present in {cannot reach(fox, grape, fox does not reach grape)} and {cannot have(john, carla, john cannot have carla)}, from which an atomic analogy between these two literals can be derived and used to further expand the partial analogy. First, the candidate term association is applied: θT ← θT ∪
{(fox does not reach grape,john cannot have carla)}

This completes the mapping of the arguments and allows applying the candidate predicate association: θP ← θP ∪ {(cannot reach/3,cannot have/3)}
In Algorithm 1, the loop at line 7 ensures that loops at lines 9 and 18 are repeated while any of them extends the mapping. Then, the loop at line 7 performs a further extension by adding all deterministic associations that emerge based on the previous ones.
3.2.3 Ranking hypotheses

It may happen that shared descriptors are not present, or cannot be used as starting points, or that deterministic mappings are not available. In these cases the search for analogical associations would get stuck, and an effective strategy for suggesting suitable atomic analogies that provide new starting points becomes essential. Since new analogical associations are partly determined by previous ones, wrong choices may negatively affect subsequent mappings, and the accumulation of these errors may tamper with the whole reasoning. In fact, how to rank candidate analogical hypotheses is a well-known and awkward problem. ROM tackles this problem by exploiting a structural similarity function to obtain a reliability score for candidate atomic analogies. Based on this score, it selects the most reliable pair of (not already mapped) literals for performing the next atomic analogy.

Definition 9 (Reliability Score) Given a clause C, the reliability score of a literal l = p(t1, …, tn) ∈ C is obtained as:

rf(l) = (Σ_{i=1}^{n} f(t_i)) / (|C| · n)

where f(·) returns the number of literals of C in which the term t appears. Consider now two clauses C′ and C′′. Given a pair of literals having the same arity (l′, l′′) ∈ C′ × C′′, the reliability score for the hypothesis of atomic analogy a(l′, l′′) is given by:

rs(l′, l′′) = rf(l′) · rf(l′′)
Given a pair of sets of literals having the same arity (L′, L′′) ∈ 2^{C′} × 2^{C′′}, the reliability score for the hypothesis of extending an analogy using the pair (L′, L′′) is given by:

re(L′, L′′) = (Σ_{l′ ∈ L′} rf(l′) / |L′|) · (Σ_{l′′ ∈ L′′} rf(l′′) / |L′′|)

Intuitively: rf(l) averages the relative frequencies of the terms in l, and thus is an indicator of the centrality of l in C; rs(l′, l′′) rewards mappings that maximize the coverage of literals in both clauses; re(L′, L′′) combines the average centrality of the literals in the two sets.

When the loops at lines 7, 9 and 18 in Algorithm 1 are ineffective due to the absence of deterministic mappings, the loop at line 6 tries to restart the mapping expansion. This attempt can be seen as the last opportunity to further extend the mapping before the algorithm terminates. rs(·, ·) is applied to all pairs (l′, l′′) ∈ C′ × C′′ not fully mapped by the current analogy s.t. a(l′, l′′) is (defined and) compatible with it. Then, the mapping hypothesis having the highest reliability score is used to update the analogical mapping. This may enable the loops at lines 9 and 18 to find further deterministic associations that were previously not exploitable. It may happen that several hypotheses share the maximum score; this occurs exactly when the Reliability Score alone does not carry enough information to single out the next atomic analogy. Conversely, when sufficient conditions to carry out the analogy are available, either a single maximum exists or the score is never needed at all. In other words, the possibility of multiple maxima mirrors the natural condition of lacking information in human analogical reasoning.
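A possible reading of Definition 9 in code, with illustrative names and literals again represented as (predicate, argument-list) tuples (an assumption of ours, not the authors' data structure):

```python
def rf(literal, clause):
    """Average relative frequency of a literal's terms within its
    clause: an indicator of the literal's centrality."""
    _, args = literal

    def f(t):
        # number of literals of the clause in which term t appears
        return sum(1 for (_, a) in clause if t in a)

    return sum(f(t) for t in args) / (len(clause) * len(args))

def rs(l1, c1, l2, c2):
    """Reliability of the atomic-analogy hypothesis a(l1, l2)."""
    return rf(l1, c1) * rf(l2, c2)

C1 = [("says", ["fox", "grape_is_no_taste"]),
      ("says", ["fox", "it_is_smart"]),
      ("wants", ["fox", "grape"])]
score = rf(("wants", ["fox", "grape"]), C1)
# f("fox") = 3 and f("grape") = 1, so score = (3 + 1) / (3 * 2) = 2/3
```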
Example 6 (Fairy tale and life context – 5) The mapping obtained so far cannot be further extended leveraging deterministic associations. The pair is(fox, sour) and feels(john, sad) has a reliability score of 0.016, which suggests an atomic analogy between them. Since (fox,john) is already in θT, the other components are mapped: θP ← θP ∪ {(is/2, feels/2)}
θT ← θT ∪ {(sour, sad)}
which leads to the overall mapping in Table 1. This is the final analogy, because no further candidate association can be found in the next iteration.
3.3 A multi-strategy approach: Roles Argumentation-based Mapper

The authors of STAR-2 (Wilson et al. 2001), discussing Rutherford's analogy between the Solar system and the atom, pointed out that "irrelevant details were added to both the solar system and the atom in the form of the temperature difference of the sun and planets and the mass difference of the electron and the nucleus". This causes SME (and ROM) to map the mass difference of the sun and the planets with the mass difference of the electron and the nucleus, producing a wrong analogy. In ROM (Algorithm 1), this is due to the 'same predicate ⇒ same meaning' assumption embedded in the loop at line 9. To overcome this problem, we devised another operator, the Roles Argumentation-based Mapper (RAM), shown in Algorithm 3, in which the mapping of equal predicates and terms is not mandatory. Instead of using a (possibly wrong) measure to solve non-deterministic mappings, a multi-strategy approach based on argumentation is adopted. Argumentation makes it possible to recognize the consistent subsets of a set of contrasting statements.
3.3.1 Use of the abstract argumentation framework

RAM exploits abstract argumentation theory (Dung 1995). An argumentation framework is a pair AF = ⟨AR, AT⟩, where AR is a set of arguments and the binary relation AT ⊆ AR × AR represents attacks between arguments. Then:
– A set S ⊆ AR of arguments is conflict-free if there are no arguments A, B ∈ S such that (A, B) ∈ AT (i.e., A attacks B).
– An argument A ∈ AR is acceptable wrt a set S of arguments iff for each argument B ∈ AR: if B attacks A then B is attacked by (an element of) S.
Table 1 Analogy between fairy tale and life context

Mapped Predicates                      Mapped Terms
Base clause       Target clause        Base clause                Target clause
(fairy tale)      (life context)       (fairy tale)               (life context)
says/2            says/2               fox                        john
is/3              is/3                 grape                      carla
wants/2           loves/2              grape is no taste          carla is bad
cannot reach/3    cannot have/3        not ripe                   bad
is/2              feels/2              fox does not reach grape   john cannot have carla
                                       sour                       sad
                                       it is smart                he is a charming man
– A conflict-free set of arguments S ⊆ AR is admissible iff each argument in S is acceptable wrt S (we denote the set of admissible sets of AF as adm(AF)).
Specifically, RAM obtains from each pair of literals having the same arity, used as a possible starting point, a mapping that becomes an argument in an argumentation framework. So, all possible mappings suggested by RAM play the role of arguments, and an argument attacks another argument if they are incompatible. Then, argumentation identifies sets of consistent partial mappings, which RAM translates back into analogy hypotheses. As for ROM, such analogies must satisfy the perspective indicated by the user in the heads (if any). RAM uses complete extensions5 as the consistent subsets of mappings that suggest an analogy; their computation is not expensive (it remains within the complexity class P (Egly et al. 2010)). A complete extension of an argumentation framework AF = ⟨AR, AT⟩ is a set C s.t. C ∈ adm(AF) and ∀a ∈ AR defended by C in AF, a ∈ C holds. For the sake of clarity, a is defended by C if ∀b ∈ AR s.t. (b, a) ∈ AT, ∃c ∈ C s.t. (c, b) ∈ AT.
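For small frameworks, the complete extensions can even be enumerated by brute force, as in the following illustrative sketch (the paper delegates this computation to ASPARTIX; all names here are ours):

```python
from itertools import chain, combinations

def complete_extensions(AR, AT):
    """Enumerate the complete extensions of AF = <AR, AT> by checking
    every subset of AR. Exponential, but fine for tiny frameworks."""
    attacks = set(AT)

    def attacked_by(S, a):          # some member of S attacks a
        return any((s, a) in attacks for s in S)

    def acceptable(a, S):           # every attacker of a is attacked by S
        return all(attacked_by(S, b) for (b, a2) in attacks if a2 == a)

    def conflict_free(S):
        return not any((x, y) in attacks for x in S for y in S)

    exts = []
    for S in chain.from_iterable(combinations(AR, r)
                                 for r in range(len(AR) + 1)):
        S = set(S)
        admissible = conflict_free(S) and all(acceptable(a, S) for a in S)
        defended = {a for a in AR if acceptable(a, S)}
        # complete: admissible and contains every argument it defends
        if admissible and defended <= S:
            exts.append(S)
    return exts

# two mutually attacking arguments, as in Example 8
AR = {"a16", "a14"}
AT = {("a16", "a14"), ("a14", "a16")}
# complete_extensions(AR, AT) yields the empty set, {"a16"} and {"a14"}
```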
3.3.2 Computing an analogy from several perspectives

RAM will be described in the following using again the running Example 1. The core of RAM is the extension loop starting at line 4 of Algorithm 3. Given two clauses C′ and C′′, RAM collects all (l′, l′′) ∈ C′ × C′′ for which an atomic analogy exists, and passes each of them to this loop, which uses the atomic analogy as a starting point α = a(l′, l′′) = ⟨θP, θT⟩ to build an analogy from it.
Example 7 (Fairy tale and life context – 6) Some starting points are provided by the pairs of literals having arity 3:
is(grape, not ripe, grape is no taste) – cannot have(john, carla, john cannot have carla)
is(grape, not ripe, grape is no taste) – is(carla, bad, carla is bad)
...

5 An extension determines which arguments are reliable in an argumentation framework. Some extensions are tighter, some others are looser in selecting 'reliable' arguments.
Others are provided by the pairs of literals having arity 2:
wants(fox, grape) – loves(john, carla)
wants(fox, grape) – says(john, carla is bad)
wants(fox, grape) – feels(john, sad)
...

RAM collects the set of hypotheses of analogy computed by the loop at line 4 in Algorithm 3, according to Algorithm 2, and considers it as a set of arguments for an argumentation framework. The set of attacks is built by identifying all pairs of incompatible hypotheses of analogy. Note that (in)compatibility is a symmetric relationship, so if (α′, α′′) is an attack in AT, then (α′′, α′) is in AT as well. Intuitively, in Algorithm 3, each starting point determined in the loop at line 2 represents a particular perspective, and the loop at line 4 finds an analogy that satisfies it. Each perspective is considered as an argument claiming a partial analogy. Thus, arguments associated to incompatible mappings attack each other. This argumentation framework is then provided to the system ASPARTIX (Egly et al. 2010), which computes its complete extensions. If an atomic analogy a(l0′, l0′′) between the heads of C′ and C′′ is defined, the complete extensions that are incompatible with the corresponding term mapping aT(l0′, l0′′) are filtered out. The final choice of the analogy that best fits the task for which the analogy was sought is left to the user.

Example 8 (Fairy tale and life context – 7) An analogy is computed for each pair found in Example 7 by expanding the corresponding starting point mappings. In this example, the superscripts identify the partial analogies computed.
Consider, for instance, the following two pairs:
(wants(fox, grape), loves(john, carla)) yields α16 = ⟨θP16, θT16⟩ with
θP16 = {(cannot reach/3,cannot have/3), (is/2,feels/2), (wants/2,loves/2), (says/2,says/2), (is/3,is/3)}
θT16 = {(sour,sad), (fox,john), (grape is no taste,carla is bad), (not ripe,bad), (grape,carla), (fox does not reach grape,john cannot have carla)}
(wants(fox, grape), says(john, carla is bad)) yields α14 = ⟨θP14, θT14⟩ with
θP14 = {(wants/2,says/2)}
θT14 = {(grape,carla is bad), (fox,john), (john cannot have carla,fox does not reach grape)}

When looking for the pairs of mappings that are not compatible, in order to associate to each of them a mutual attack in the argumentation framework, it is easy to see that α16 is not compatible with α14, since predicate wants/2 is mapped onto different predicates in α16 and α14. This generates two attacks:
(α16, α14)
(α14, α16)

Computing the complete semantics on the resulting argumentation framework yields several sets, among which {α16, α3, α5, α10, α15} and {α12, α14}. In the former set, the unions of all predicate mappings and of all term mappings yield the result in Table 1. Conversely, the latter would yield the associations:
θP = {(is/2, loves/2), (wants/2, says/2)}
θT = {(sour, carla), (john cannot have carla, fox does not reach grape), (grape, carla is bad), (fox, john)}
that do not make sense.
Example 8 shows that RAM can reach the same results as ROM, but is also open to different possibilities, because it is not bound to associate equal predicates in the two descriptions. This feature is useful in the tricky case of Rutherford's analogy, where STAR-2 highlighted that the presence of relations having the same name is not a hint of their playing analogous roles. It is also useful when no shared descriptors are present in the two descriptions, in which case ROM would fail. Compared to Algorithm 1, Algorithm 3 replaces the step that maps equal relations across domains with the use of all possible starting points, so that all perspectives can be explored. Also, the reliability assessment used to restart the mapping when no deterministic mappings are available has been replaced by the more objective argumentation system, which provides the set of consistent partial analogies.
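The attack relation between hypotheses reduces to a pairwise compatibility test on their mappings: no predicate or term may be mapped onto two different counterparts, in either direction. A hypothetical sketch (mappings represented as sets of pairs; names are ours):

```python
def compatible(theta_a, theta_b):
    """Two (predicate or term) mappings are compatible iff no element
    is mapped onto two different counterparts in either direction."""
    fwd, bwd = {}, {}
    for x, y in theta_a | theta_b:
        if fwd.setdefault(x, y) != y or bwd.setdefault(y, x) != x:
            return False
    return True

def attacks(analogy_a, analogy_b):
    """Incompatible hypotheses attack each other (symmetric)."""
    (pa, ta), (pb, tb) = analogy_a, analogy_b
    return not (compatible(pa, pb) and compatible(ta, tb))

a16 = ({("wants/2", "loves/2")}, {("fox", "john"), ("grape", "carla")})
a14 = ({("wants/2", "says/2")},  {("fox", "john"), ("grape", "carla_is_bad")})
# attacks(a16, a14) is True: wants/2 is mapped onto different predicates
```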
3.4 Inference

One-to-one alignment of analogous roles for entities and relationships across domains has primary importance, because it ensures that part of the structural consistency is verified (Falkenhainer et al. 1989). For this reason, it is taken as the basis for the inference step, aimed at transferring missing knowledge across the domains. The literals that are completely mapped (i.e., having their predicate and all of their arguments mapped by the analogy), but whose counterpart is not present in the other description, can be immediately projected onto the other domain by taking their analogous counterparts. Moreover, inference can be extended to partially mapped literals, introducing new names for the missing elements (representing new knowledge that can be hypothesized in the other domain). We identify these names with the 'skolem ' prefix. Of course, the more skolem elements in a projection, the less reliable that projection.

Formally, consider an analogical perspective P = ⟨(D′, S′), (D′′, S′′)⟩, an analogy α : K′ → K′′ satisfying P (with α = ⟨θP, θT⟩), and an atom l′ = p′(t1′, …, tn′) ∈ D′ \ K′. If ∃(p′/n, p′′/n) ∈ θP ∧ ∀i = 1, …, n : ∃(ti′, ti′′) ∈ θT, then αt({l′}) = {p′′(t1′′, …, tn′′)}. If ∄(p′/n, p′′/n) ∈ θP ∨ ∃i = 1, …, n : ∄(ti′, ti′′) ∈ θT, then αt({l′}) = {p(t1, …, tn)} where: if ∃(p′/n, p′′/n) ∈ θP then p = p′′, else p = skolem p′; and ∀i = 1, …, n : if ∃(ti′, ti′′) ∈ θT then ti = ti′′, else ti = skolem ti′. The same strategy allows transposing an atom l′′ = p′′(t1′′, …, tn′′) ∈ D′′ \ K′′ to D′ \ K′.

Example 9 (Fairy tale and life context – 8) The inference hypotheses from the fairy tale to the life context are:
1: skolem cause(john cannot have carla, skolem bad opinion)
2: skolem has(john, skolem bad opinion)
The former means that the fact that John cannot have Carla may somehow play the role of a ‘cause’, as in the Fairy Tale, and that the effect of this cause may be something analogous to the ‘bad opinion’ that the fox has of the grape in the Fairy Tale because he is not able to reach it. The latter means that John is likely related to this kind of ‘bad opinion’ by something analogous to ‘having’ it, as the fox in the Fairy Tale. In other words, the inference proposes the following knowledge that is missing in the life context: “it can be reasonably assumed that John has something like a bad opinion about Carla, due to the fact that he cannot have her”, which is indeed the correct explanation we were expecting.
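The skolem-based projection of Section 3.4 can be sketched as follows. Function and identifier names are hypothetical, and underscores stand in for the spaces in the paper's identifiers.

```python
def project(literal, theta_P, theta_T):
    """Transpose a base-domain atom to the target domain, introducing
    'skolem_'-prefixed names for unmapped predicates and terms, as in
    the formalization of Section 3.4."""
    p, args = literal
    pmap = dict(theta_P)
    tmap = dict(theta_T)
    p2 = pmap.get(p, "skolem_" + p)                     # predicate side
    args2 = [tmap.get(t, "skolem_" + t) for t in args]  # term side
    return (p2, args2)

theta_P = {("cannot_reach/3", "cannot_have/3")}
theta_T = {("fox_does_not_reach_grape", "john_cannot_have_carla")}
hyp = project(("cause", ["fox_does_not_reach_grape", "bad_opinion"]),
              theta_P, theta_T)
# hyp == ("skolem_cause", ["john_cannot_have_carla", "skolem_bad_opinion"])
```

This reproduces the shape of hypothesis 1 in Example 9: the mapped term is transposed, while the unmapped predicate and term get skolem names.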
3.5 Meta-pattern formalization

Although analogy in itself does not perform generalization, the analogical mappings found between pairs of descriptions representing experiences, contexts or concepts can be used to obtain more general analogical schemes (Leuzzi and Ferilli 2016). Specifically, each analogy can be 'condensed' into a pattern, which in turn can be used for searching further analogies with other experiences. Such a pattern takes place in the long-term memory, allowing self-improvement and growth.

Formally, given an analogy α = ⟨θP, θT⟩ between two clauses C′ and C′′, a generalized pattern C with |C| ≤ min(|C′|, |C′′|) is built as follows. ∀(l′, l′′) ∈ C′ × C′′, l′ = p′(t1′, …, tn′), l′′ = p′′(t1′′, …, tn′′), for which ∃a(l′, l′′) s.t. aP(l′, l′′) ⊆ θP ∧ aT(l′, l′′) ⊆ θT : l = p(t1, …, tn) ∈ C, where: if p′ = p′′, then p = p′ = p′′, otherwise p is a new predicate (p ∉ predicates(C′) ∪ predicates(C′′)); ∀i = 1, …, n : if ti′ = ti′′, then ti = ti′ = ti′′, otherwise ti is a new term (ti ∉ terms(C′) ∪ terms(C′′)).

Example 10 (Fairy tale and life context – 9) The mapping in Table 1 aligns the literals as shown in Table 2, yielding the following pattern:
pattern(fairy tale(fox, grape), situation(john, carla)) :–
  wants OR loves(fox OR john, grape OR carla),
  cannot reach OR cannot have(fox OR john, grape OR carla, fox does not reach grape OR john cannot have carla),
  says(fox OR john, grape is no taste OR carla is bad),
  says(fox OR john, it is smart OR he is a charming man),
  is(grape OR carla, not ripe OR bad, grape is no taste OR carla is bad).

We represent the generic element (predicate or term) obtained from the mapping using a syntax that chains the predicate (respectively, term) of the base description to the analogous predicate (respectively, term) of the target one, inserting ' OR ' in the middle. This syntax is just a way to keep the pattern understandable, so that it can be evaluated at a glance.
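The ' OR '-chaining used when condensing an analogy into a pattern can be rendered by a small merge function. This is an illustrative sketch, with '_OR_' replacing the spaced separator and our own function names:

```python
def merge(name1, name2):
    """Generic element for a mapped pair: equal names survive as-is,
    different ones are chained to keep the pattern readable."""
    return name1 if name1 == name2 else name1 + "_OR_" + name2

def pattern_literal(l1, l2):
    """Condense two aligned literals into one pattern literal."""
    (p1, a1), (p2, a2) = l1, l2
    return (merge(p1, p2), [merge(x, y) for x, y in zip(a1, a2)])

lit = pattern_literal(("wants", ["fox", "grape"]),
                      ("loves", ["john", "carla"]))
# lit == ("wants_OR_loves", ["fox_OR_john", "grape_OR_carla"])
```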
As in usual generalization, patterns can be used to find analogies with other descriptions (or even with other patterns). If such analogies do not fully map the pattern components, new (and more general) patterns (actually, meta-patterns) can be generated. Unlike in usual generalization, new patterns do not replace the patterns from which they originated, because each analogy is motivated by a specific perspective, and hence the corresponding
Table 2 Literal mappings between fairy tale and life context

Fairy tale                                  Life context
fairy tale(fox, grape)                      situation(john, carla)
says(fox, grape is no taste)                says(john, carla is bad)
says(fox, it is smart)                      says(john, he is a charming man)
is(grape, not ripe, grape is no taste)      is(carla, bad, carla is bad)
wants(fox, grape)                           loves(john, carla)
cannot reach(fox, grape,                    cannot have(john, carla,
  fox does not reach grape)                   john cannot have carla)
is(fox, sour)                               feels(john, sad)
pattern must be preserved as a representative of that perspective. Then, the same (meta-)pattern may give rise to several meta-patterns, based on different combinations of specific domains. This reflects the fact that different aspects of a given situation may have different analogies with other experiences according to different perspectives. Note that, as (meta-)patterns are progressively generalized (or used with full analogies) against new domains, the surviving elements in the resulting meta-pattern (the predicates or terms that do not contain ' OR ') are more and more supported and confirmed, and hence are likely to represent common-sense knowledge. In particular, if specific predicate or term names survive many refinements, they are likely to represent fundamental concepts. Hence, we propose to learn from the mapping phase as well as from the inference phase. The formalization of meta-patterns takes advantage of the classical generalization procedure, becoming a multi-strategy approach.

Example 11 (Pattern and job change desire) Tom needs a new job; since there are no open positions for the job he wants, he takes a job that he dislikes. Tom does not like to show that he has been unable to reach the desired job, so he denigrates the desired position, giving a good opinion of the job that he found with regret. Its formal description:
job change(tom, job) :–
  needs(tom, job),
  cannot find(tom, job, tom cannot find job),
  tells(tom, job is bad),
  is(job, bad, job is bad),
  uses(envious, badness),
  accepts(tom, other job),
  dislike(tom, other job),
  tells(tom, other job is good),
  is(other job, good, other job is good).
has a structure that recalls the analogy between the fairy tale and the life context.
Computing the analogy between our pattern and this further description (which we skip for the sake of brevity) we obtain a meta-pattern:
pattern(pattern(fairy tale(fox, grape), situation(john, carla)), job change(tom, job)) :–
  wants OR loves OR needs(fox OR john OR tom, grape OR carla OR job),
  cannot reach OR cannot have OR cannot find(fox OR john OR tom, grape OR carla OR job, fox does not reach grape OR john cannot have carla OR tom cannot find job),
  says OR tells(fox OR john OR tom, grape is no taste OR carla is bad OR job is bad),
  is(grape OR carla OR job, not ripe OR bad OR bad, grape is no taste OR carla is bad OR job is bad).
Note that we used a plain description for simplicity, even if this computation can be carried out between two patterns as well. As we can see, the only common-sense concept that survived is is/3. The other elements are just placeholders, strategically named so as to remain interpretable. As can be seen, the aligned concepts regard the denigration of something that the protagonist was unable to reach. The concepts left outside of the mapping are perfectly coherent with the other stories: in fact, projecting those concepts onto the base descriptions that generated the pattern, we find that the fox could provide a positive opinion about any other food that it is able to reach, just like John could do with any woman other than Carla who falls in love with him. The history of a pattern can be traced by recording the origin of each predicate/term in the pattern using 4-tuples of the form: (head, type, pattern name, original name)
where head is the head of the original clause, type indicates if the record concerns a predicate or a term, pattern name represents the name reported in the pattern and original name reports the name in the original clause.
4 Evaluation

We evaluated our proposal on several cases of different complexity, most of which are drawn from other works. All were solved in a fraction of a second. As usual in the literature, we propose qualitative evaluations, except for the retrieval step, which admits a quantitative evaluation. To further clarify our approach, we use analogies that may give an insight into the behavior of our operators.
4.1 Retrieval of potential analogies

The background knowledge contains the following stories.
1. Eagle Karla: eagle Karla donates a tailfeather to a sportsman, asking him to promise that he would never attack eagles. One day the man meets the eagle, and attacks it.
2. Country Zerdia: a country is attacked by another, named Gagrach. The attack fails since the missiles are guided by old computers. So Zerdia offers Gagrach some machines, asking in exchange the promise that Gagrach will never attack Zerdia again.
3. Hawk Zerdia: a hunter has no feathers on his arrows; hawk Zerdia gives him some feathers, asking in exchange the promise to never attack Zerdia, but in return the hunter shoots it.
4. Conquer fortress: a General wants to conquer a fortress placed in the center of a village.
5. Heal tumor: a doctor wants to defeat a tumor that a patient has inside healthy tissue.
6. Get medicine: a client of a pharmacy does not have enough money to buy a medicine. He has several other clients around him having money.
7. Solar system: planets revolve around the sun. Revolution is caused by gravity, due to the difference in masses.
8. Rutherford atom: electrons revolve around the nucleus. Revolution is caused by their having opposite charge.
9. Maternity: a child is attracted to his mother, due to the affective force, and to the difference in experience.
10. Water flow: water passes from one beaker to another due to the amount of water producing a pressure, until the level of water in both beakers is the same.
11. Heat flow: heat is transferred from the hottest object to another while the temperature is different.
12. John-Carla affair: John cannot have Carla, then he spreads a bad opinion about her.
13. Fox-grape fairy tale: when a fox cannot reach the grapes, it says they are not ripe.
14. Addition: description of some properties of addition.
15. Union: description of some properties of set union.
16. Boy: a poor description about a boy.
17. Dog: a poor description about a dog.
They have already been presented in the literature (Falkenhainer et al. 1989; Veloso and Carbonell 1993; Wilson et al. 2001; Holyoak and Thagard 1989; Gick and Holyoak 1980), with the addition of descriptions 6 and 9. The set is not very large because, as long as
analogical meta-patterns are obtained (and saved), the size of the background knowledge grows significantly. The retrieval strategy presented in Section 3.1 was used to select the most promising pairs of descriptions in which an analogy could be found. Table 3 reports all the pairs of descriptions whose score is different from zero. This confirms that our approach produced only plausible pairs. Descriptions 14, 15, 16 and 17 are not involved in any of these pairs, because all these cases completely lack shared knowledge. All the pairs have been elaborated by ROM and RAM, highlighting that RAM is able to find useful alignments in all such cases.
Table 3 Ranked list of potential analogies

#    Pair                                          Score fs(·) · mf(·)
1    Hawk Zerdia – Eagle Karla                     3.6
2    Country Zerdia – Eagle Karla                  3.577
3    Hawk Zerdia – Country Zerdia                  2.917
4    Conquer fortress – Heal tumor                 2.342
5    Solar system – Rutherford atom                1.357
6    Conquer fortress – Get medicine               1.025
7    Solar system – Maternity                      1.022
8    Heal tumor – Get medicine                     0.957
9    Rutherford atom – Maternity                   0.946
10   Fairy tale fox-grape – Situation John-Carla   0.624
11   Water flow – Heat flow                        0.608

4.2 Solar system and Rutherford's atom

The analogy between the Solar system and the Rutherford atom has been widely used in the literature (e.g., in Falkenhainer et al. (1989) and Veloso and Carbonell (1993)). Expressed in our formalism, we have the following descriptions:
solar system(sun, planet) :–
  inanimate(sun), inanimate(planet),
  mass(sun, mass sun), mass(planet, mass planet),
  greater(mass sun, mass planet, major mass),
  attraction(sun, planet, attracts),
  revolve around(planet, sun, revolve),
  attraction mass(major mass, attracts, major mass attracts),
  cause(major mass attracts, revolve, cause revolve),
  temperature(sun, temp sun), temperature(planet, temp planet),
  greater(temp sun, temp planet, major temp),
  gravity(mass sun, mass planet, force gravity),
  cause(force gravity, attracts, why attracts).
rutherford atom(nucleus, electron) :–
  inanimate(nucleus), inanimate(electron),
  mass(nucleus, mass n), mass(electron, mass e),
  greater(mass n, mass e, major mass),
  attraction(nucleus, electron, attracts),
  revolve around(electron, nucleus, revolve),
  charge(electron, q electron), charge(nucleus, q nucleus),
  opposite sign(q nucleus, q electron, major charge),
  cause(major charge, attracts, why attracts).
Since the heads have the same arity, their arguments are mapped as starting points, obtaining θT = {(planet,electron), (sun,nucleus)}. Then, ROM selects the literals built on
the same predicate in both clauses, in order to search for a deterministic alignment. This yields {inanimate(sun), inanimate(planet)} and {inanimate(nucleus), inanimate(electron)}, from which the association (inanimate/1, inanimate/1) is established and added to θP. Going on, the sets {mass(sun, mass sun), mass(planet, mass planet)} and {mass(nucleus, mass n), mass(electron, mass e)} are selected; in all these literals the first term is already mapped, but the second is not deterministic according to Definition 7. So, the terms cannot be associated, but the predicates can, and (mass/2, mass/2) is added to the current predicate mapping θP. Then, since (sun, nucleus) ∈ θT, trying to expand the partial analogy one gets the pair of sets {temperature(sun, temp sun)} and {charge(nucleus, q nucleus)}. Since the first term of all literals is already mapped and (temp sun, q nucleus) is deterministic according to Definition 7, this term association is added to θT. Now all terms in the previous sets have been mapped, and so the predicate association (temperature/2, charge/2) can be added to θP as well. This allows the selection of {temperature(planet, temp planet)} and {charge(electron, q electron)}, which yields the new association (temp planet, q electron). At this point, no new deterministic association can be found, and hence the analogy expansion gets stuck. The similarity function is used to break the deadlock, and suggests as plausible atomic analogies {opposite sign(q nucleus, q electron, major charge)} and {gravity(mass sun, mass planet, force gravity)}. However, mass sun is already associated to mass n, so the mapping phase terminates with no further action (see Table 4). The expected explanation of the physical phenomenon is, more or less, that "the difference in masses, together with the mutual attraction of the nucleus and the electron, causes the electron to revolve around the nucleus".
The inference hypotheses from the Solar system to the Rutherford atom are:
1: greater(q nucleus, q electron, skolem major temp)
2: skolem attraction mass(major mass, attracts, skolem major mass attracts)
3: cause(skolem major mass attracts, revolve, skolem cause revolve)
4: skolem gravity(mass n, mass e, major charge)
where 1, 2 and 3 fully satisfy this interpretation. Additionally, hypothesis 4, obtained by transposing gravity to the electromagnetic force, explains that there is a force based on the charge of the particles.

Table 4 Solar system and Rutherford atom mapping provided by ROM

Mapped predicates                      Mapped terms
Base clause        Target clause       Base clause      Target clause
temperature/2      charge/2            temp planet      q electron
mass/2             mass/2              temp sun         q nucleus
greater/3          greater/3           mass sun         mass n
cause/3            cause/3             mass planet      mass e
attraction/3       attraction/3        force gravity    major charge
revolve around/3   revolve around/3    revolve          revolve
inanimate/1        inanimate/1         major mass       major mass
                                       why attracts     why attracts
                                       attracts         attracts
                                       planet           electron
                                       sun              nucleus
As said, Wilson et al. (2001) show that the correct analogy is different, since the temperature difference between sun and planets and the mass difference between electrons and nucleus are noise. To identify many plausible analogies we used RAM. It returned 6 plausible analogies, among which we considered the 3 having the largest number of mapped relations. The one reported in Table 5 proves that RAM is able to avoid the traps due to relations that have the same name but are not analogous in the considered context. Indeed, the only relation that has not been mapped compared to the outcome of STAR-2 is the noisy one. This analogy reveals that the reason why the electron revolves around the nucleus is already expressed in the atom description, since the cause of the attraction is the opposite charge, recognized as being analogous to the difference in masses between the sun and the planets in the Solar system. The other two results were wrong: one mapped the masses just like ROM, while the other mapped the noisy relation (temperature/2, mass/2) instead of the correct one (mass/2, charge/2). This shows that RAM can reach the same results as STAR-2, but can also do more. These three results represent three different perspectives, which suggests the presence of a point of strong ambiguity. The common part of the three results identifies a stable portion of the analogy, while the other associations that characterize each mapping represent different alternatives among which the system cannot choose, due to lack of knowledge (a situation in which humans too can make mistakes).
4.3 Military and medical strategy Gick and Holyoak (1980) adopt a cognitive psychology perspective. To study the process of analogical reasoning in humans, two stories, representing the base and the target domain, were submitted to a group of subjects, who had to complete the latter based on the former, trying to recognize the knowledge that solves the problem in the target story. Without any suggestion, only 57% of the subjects provided a complete solution to the analogy, whereas ROM directly provides the correct analogical solution. Let us give details about the stories and the corresponding expected solution. The base story follows.
Table 5 Solar system and Rutherford atom mapping provided by RAM

Mapped predicates                          Mapped terms
Base clause        Target clause           Base clause      Target clause
cause/3            cause/3                 why attracts     why attracts
gravity/3          opposite sign/3         force gravity    opposite charge
greater/3          greater/3               major temp       major mass
attraction/3       attraction/3            attracts         attracts
revolve around/3   revolve around/3        temp planet      mass e
inanimate/1        inanimate/1             temp sun         mass n
mass/2             charge/2                revolve          revolve
                                           planet           electron
                                           mass planet      q electron
                                           mass sun         q nucleus
                                           sun              nucleus
A fortress was located in the center of the country. Many roads radiated out from the fortress. A general wanted to capture the fortress with his army. The general wanted to prevent mines on the roads from destroying his army and neighboring villages. As a result the entire army could not attack the fortress along one road. However, the entire army was needed to capture the fortress. So an attack by one small group would not succeed. The general therefore divided his army into several small groups. He positioned the small groups at the heads of different roads. The small groups simultaneously converged on the fortress. In this way the army captured the fortress.

It is translated into the following Horn clause, where each numbered row encodes a sentence in the story.

conquer(fortress) :-
1. located(fortress,center), partof(center,country),
2. radiated(oneroad,fortress), radiated(roads,fortress), partof(oneroad,roads),
3. capture(general,fortress), use(general,army),
4. prevent(general,mines), located(mines,oneroad), located(mines,roads), destroy(mines,army), destroy(mines,villages),
5. couldnotuse(army,oneroad),
6. capture(army,fortress),
7. couldnotuse(subgroup,oneroad),
8. splittable(army,subgroups), partof(subgroup,subgroups), partof(subgroups,army), destroy(mines,subgroup), notenough(subgroup),
9. distribute(subgroups,roads),
10. converge(subgroups,fortress),
11. capture(subgroups,fortress).
The target story follows.

A tumor was located in the interior of a patient's body. A doctor wanted to destroy the tumor with rays. The doctor wanted to prevent the rays from destroying healthy tissue. As a result the high-intensity rays could not be applied to the tumor along one path. However, high-intensity rays were needed to destroy the tumor. So applying one low-intensity ray would not succeed.

The Horn clause representing the target story is:

heal(tumor) :-
1. located(tumor,interior), partof(interior,body),
2. defeat(doc,tumor), use(doc,rays),
3. prevent(doc,healthytissue), located(healthytissue,oneslit), located(healthytissue,slits), aredestroyed(healthytissue,rays),
4. couldnotuse(rays,oneslit),
5. defeat(rays,tumor),
6. couldnotuse(ray,oneslit),
7. [additional] radiated(oneslit,tumor), radiated(slits,tumor), partof(oneslit,slits), splittable(rays,ray), partof(ray,subrays), partof(subrays,rays), aredestroyed(healthytissue,ray), aredestroyed(healthytissue,subrays), notenough(ray).
Item 7 encodes implicit knowledge, stating that: the cancer can be reached from one or many directions; a slit is one of many slits; the rays can be split; and healthy tissue can
be damaged and/or destroyed by rays or sub-rays. The expected result must include the missing knowledge of the target story, that is: "The doctor therefore divided the rays into several low-intensity rays. He positioned the low-intensity rays at multiple locations around the patient's body. The low-intensity rays simultaneously converged on the tumor. In this way the rays destroyed the tumor." Given the mapping obtained by ROM (see Table 6), the inference hypotheses from base to target domain are:
1: defeat(subrays,tumor)
2: splittable(rays,subrays)
3: skolem distribute(subrays,slits)
4: skolem converge(subrays,tumor)
5: aredestroyed(healthytissue,skolem villages)
Hypothesis 1 can be identified as the goal of the problem; it is made up of fully mapped components, making it a conclusive inference. Hypotheses 2, 3 and 4 reveal the procedure to reach the goal; for non-mapped predicates, the skolem prefix was added by the projection step. These inferences are to be considered contingent rather than conclusive. Hypothesis 5 does not make sense (it is a case of fake inference). It is noteworthy that all these statements were absent from the target domain, and have been obtained using the base domain. The consistency with the expected analogical reasoning is evident. In order to assess the quality of the relational structure, we evaluated the similarity between the clauses before and after the inference. The sf score (Ferilli et al. 2009), normalized in ]0, 1[, was 0.65 for the original clauses, and 0.85 after the alignments. So, the alignment allowed the recognition of an additional 20% of the structure, a portion of the clauses that appeared unrelated before applying ROM.
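The projection step just described can be sketched as follows (an illustrative reconstruction under assumed data structures, not the authors' code): base literals are transposed into the target vocabulary through the mapping, any symbol without a counterpart receives the skolem prefix, and literals already present in the target are dropped.

```python
# Hedged sketch of the projection step. Literals are (predicate, args)
# tuples; pred_map and term_map come from the analogical mapping; the
# 'skolem_' prefix marks non-mapped symbols introduced by projection.

def project(base, target, pred_map, term_map):
    hypotheses = []
    for pred, args in base:
        new_pred = pred_map.get(pred, "skolem_" + pred)
        new_args = tuple(term_map.get(a, "skolem_" + a) for a in args)
        lit = (new_pred, new_args)
        if lit not in target and lit not in hypotheses:
            hypotheses.append(lit)
    return hypotheses

# Hypothetical fragment of the military/medical mapping:
pred_map = {"capture": "defeat", "splittable": "splittable"}
term_map = {"subgroups": "subrays", "fortress": "tumor",
            "army": "rays", "roads": "slits"}
base = [("capture", ("subgroups", "fortress")),
        ("splittable", ("army", "subgroups")),
        ("distribute", ("subgroups", "roads")),
        ("converge", ("subgroups", "fortress"))]
target = []  # none of these literals exist yet in the medical story

for h in project(base, target, pred_map, term_map):
    print(h)  # defeat/splittable fully mapped; distribute/converge skolemized
```

On this fragment the sketch reproduces hypotheses 1-4 above: the fully mapped literals yield conclusive inferences, while the skolemized ones remain contingent.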
4.4 Patterns assessment The pattern generation approach was evaluated by carrying the running example of analogical reasoning between the military and medical domains, described in Section 3, over to a third domain, referred to as Pharmaceutical. It tells a story about the purchase of a medicine, for which a collection of money is needed.
Table 6 Military and medical mapping outcomes by ROM

Mapped predicates                  Mapped arguments
Base clause     Target clause      Base clause    Target clause
destroy/2       aredestroyed/2     country        body
capture/2       defeat/2           center         interior
partof/2        partof/2           roads          slits
couldnotuse/2   couldnotuse/2      subgroups      subrays
splittable/2    splittable/2       oneroad        oneslit
use/2           use/2              army           rays
radiated/2      radiated/2         mines          healthytissue
prevent/2       prevent/2          general        doc
located/2       located/2          subgroup       ray
notenough/1     notenough/1        fortress       tumor
A medicine is located in the warehouse of a pharmacy. A patient needs to purchase the medicine. The patient must face the problem that his money is not enough. As a result the patient cannot purchase the medicine paying the total price. However, the total amount is needed to purchase the medicine. So applying a partial amount cannot succeed.

The clause encoding such concepts is:

get(medicine) :-
1. located(medicine, warehouse), located(warehouse, pharmacy),
2. purchase(patient, medicine), use(patient, medicine total amount),
3. mustface(patient, not enough money),
4. couldnotuse(medicine total amount, one money source),
5. couldnotuse(partial amount, one money source),
6. purchase(medicine total amount, medicine),
7. [additional] splittable(medicine total amount, partial amount), partof(partial amount, many partial amounts), partof(many partial amounts, medicine total amount).
ROM finds the mapping in Table 7. Analogous domains can be found in the stored mappings, as reported in Table 8. While predicates use/2, located/2, couldnotuse/2 and partof/2 introduce no novelty, predicates mustface/2 and purchase/2 are more interesting, because the analogy indicates that mustface/2 is the 'difficulty' to be overcome, whereas purchase/2 stands for the main action on which the Pharmaceutical story is built. The term mapping suggestions involve no shared knowledge, so each term is traced to different terms in the base domains. For instance, patient is traced to the main actors in the other domains (general and doc); medicine, which is the target object in the story, is traced to fortress and tumor; and so on. As for the analogy between the Military and Medical domains, the gain in the portion of aligned structure before and after the analogy was evaluated. The normalized sf score is 0.4 for the original clauses, and 0.74 after applying ROM. So, 34% more structure has been aligned thanks to the analogy. The evaluation of the inference step is important as well. Since the projection of knowledge is not empty, the analogy with the Pharmaceutical domain might be represented by a new meta-pattern.

Table 7 Pattern and Pharmaceutical mapping outcomes by ROM

Mapped predicates                         Mapped arguments
Base clause          Target clause        Base clause              Target clause
capture OR defeat/2  purchase/2           country OR body          pharmacy
partof/2             partof/2             subgroups OR subrays     many partial amounts
couldnotuse/2        couldnotuse/2        subgroup OR ray          partial amount
located/2            located/2            center OR interior       warehouse
use/2                use/2                oneroad OR oneslit       one money source
prevent/2            mustface/2           army OR rays             medicine total amount
                                          mines OR healthytissue   not enough money
                                          general OR doc           patient
                                          fortress OR tumor        medicine
Table 8 Suggestions from original domains

Source             Mapped predicates                Mapped arguments
                   Pharmaceutical  Mapping          Pharmaceutical         Mapping
Military, Medical  mustface/2      prevent/2        medicine               fortress, tumor
Military, Medical  use/2           use/2            patient                general, doc
Military, Medical  located/2       located/2        not enough money       mines, healthytissue
Military, Medical  couldnotuse/2   couldnotuse/2    medicine total amount  army, rays
Military, Medical  partof/2        partof/2         one money source       oneroad, oneslit
Military, Medical  purchase/2      capture/2,       warehouse              center, interior
                                   defeat/2
Military, Medical  –               –                partial amount         subgroup, ray
Military, Medical  –               –                many partial amounts   subgroups, subrays
Military, Medical  –               –                pharmacy               country, body
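The merging of stored mappings into the 'OR' patterns shown in Tables 7 and 8 can be sketched as follows (an illustrative reconstruction under assumed data structures, not the paper's code):

```python
# Hypothetical sketch of cross-domain pattern generalization: stored
# mappings from earlier analogies are merged into a meta-pattern whose
# entries list the alternatives from each source domain (rendered here
# with 'OR', as in Table 7).

def generalize(mappings):
    """mappings: list of dicts {shared target symbol: source symbol}."""
    pattern = {}
    for m in mappings:
        for key, sym in m.items():
            pattern.setdefault(key, [])
            if sym not in pattern[key]:
                pattern[key].append(sym)  # keep each alternative once
    return {k: " OR ".join(v) for k, v in pattern.items()}

# Fragments of the stored Military and Medical mappings:
military = {"purchase/2": "capture/2", "medicine": "fortress", "patient": "general"}
medical  = {"purchase/2": "defeat/2",  "medicine": "tumor",    "patient": "doc"}

print(generalize([military, medical]))
# each entry collects the alternatives, e.g. 'capture/2 OR defeat/2'
```

A new domain can then be matched against the generalized pattern instead of against each stored domain separately.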
5 Conclusions Analogy is a fundamental inference mechanism for transposing knowledge from known to unknown domains, producing new conclusions that may help to solve a problem. This work proposes a structural approach to hypothesizing analogous concepts among descriptions, by comparing the structure of their relational networks. Specifically, we propose two operators. The former (ROM) is bound to map identical descriptors, and expands the mapping to other descriptors that play the same role in the two domains. The latter (RAM) can try all possible perspectives and suggest several consistent analogies. The proposed approach is multi-strategic, since RAM exploits argumentation to detect consistent mappings. Multi-strategy is also involved in that we also propose a way to find patterns that generalize analogies across several domains (induction), and to transpose missing knowledge across domains (abduction). Application of the proposed operators to standard analogy problems of different complexity, taken from the literature, proved that they are effective in finding the correct solution even when other approaches could not. As future work, we plan to apply our analogical engine to recognize analogies between the networks learned from different text corpora by the ConNeKTion tool (Rotella et al. 2015). This will involve further study of the relationships with abduction, to check whether the inferred knowledge is consistent with the constraints of the target domain, and with abstraction, to find a proper alignment of the representations. A further interesting direction will be the use of a probabilistic approach to assess the reliability of the mappings.
References

de los Angeles, C.M., & Forbus, K.D. (2012). Using quantitative information to improve analogical matching between sketches.
Baydin, A.G., de Mántaras, R.L., & Ontañón, S. (2012). Automated generation of cross-domain analogies via evolutionary computation. CoRR abs/1204.2335.
Doumas, L.A.A., Hummel, J.E., & Sandhofer, C.M. (2008). A theory of the discovery and predication of relational concepts. Psychological Review, 115(1), 1–43.
Dung, P.M. (1995). On the acceptability of arguments and its fundamental role in nonmonotonic reasoning, logic programming and n-person games. Artificial Intelligence, 77, 321–357.
Egly, U., Gaggl, S.A., & Woltran, S. (2010). Answer-set programming encodings for argumentation frameworks. Argument & Computation, 1(2), 147–177.
Falkenhainer, B., Forbus, K.D., & Gentner, D. (1989). The structure-mapping engine: Algorithm and examples. Artificial Intelligence, 41, 1–63.
Ferilli, S., Basile, T., Biba, M., Mauro, N.D., & Esposito, F. (2009). A general similarity framework for Horn clause logic. Fundamenta Informaticae, 90(1–2), 43–66.
Gentner, D. (1983). Structure-mapping: a theoretical framework for analogy. Cognitive Science, 7(2), 155–170.
Gentner, D. (1998). Analogy. A companion to cognitive science (pp. 107–113).
Gentner, D., & Markman, A.B. (1997). Structure mapping in analogy and similarity. American Psychologist, 52, 45–56.
Gick, M.L., & Holyoak, K.J. (1980). Analogical problem solving. Cognitive Psychology, 12(3), 306–355.
Giordana, A., Saitta, L., & Roverso, D. (1991). Abstracting concepts with inverse resolution. In L. Birnbaum & G. Collins (Eds.), ML, Morgan Kaufmann (pp. 142–146).
Halford, G.S., Wilson, W.H., Guo, J., Gayler, R.W., Wiles, J., & Stewart, J. (1994). Connectionist implications for processing capacity limitations in analogies. Advances in Connectionist and Neural Computation Theory, 2(1–2), 363–415.
Hofstadter, D.R., & Mitchell, M. (1994). The copycat project: A model of mental fluidity and analogy-making. In Advances in connectionist and neural computation theory. Norwood, N.J.: Ablex Publishing Corporation.
Holyoak, K.J., & Hummel, J.E. (2001). Understanding analogy within a biological symbol system. In Gentner, D., Holyoak, K.J., & Kokinov, B.N. (Eds.), The analogical mind. Cambridge, MA: The MIT Press.
Holyoak, K.J., & Thagard, P. (1989). Analogical mapping by constraint satisfaction. Cognitive Science, 13, 295–355.
Leuzzi, F., & Ferilli, S. (2013). Reasoning by analogy using past experiences. In Proceedings of the 28th Italian conference on computational logic (CILC 2013), CEUR-WS.org (Vol. 1068, pp. 115–129).
Leuzzi, F., & Ferilli, S. (2016). Generalizing patterns for cross-domain analogy. In New frontiers in mining complex patterns: 4th international workshop, NFMCP 2015, held in conjunction with ECML-PKDD 2015, Porto, Portugal, September 7, 2015, revised selected papers (pp. 147–162). Cham: Springer International Publishing.
Lloyd, J.W. (1987). Foundations of logic programming, 2nd Edn. Springer.
Lu, H., Chen, D., & Holyoak, K.J. (2012). Bayesian analogy with relational transformations. Psychological Review, 119(3), 617–648.
Michalski, R.S. (1993). Inferential theory of learning: Developing foundations for multistrategy learning. In Machine learning: a multi-strategy approach (Vol. 4, pp. 3–62). Morgan Kaufmann Publishers.
O'Donoghue, D., & Keane, M.T. (2012). A creative analogy machine: Results and challenges. In Proceedings of the 3rd international conference on computational creativity (pp. 17–24). Dublin, Ireland.
Rotella, F., Leuzzi, F., & Ferilli, S. (2015). Learning and exploiting concept networks with ConNeKTion. Applied Intelligence, 42(1), 87–111.
Schwering, A., Krumnack, U., Kühnberger, K.U., & Gust, H. (2009). Syntactic principles of heuristic-driven theory projection. Cognitive Systems Research, 10(3), 251–269.
Turney, P.D. (2005). Measuring semantic similarity by latent relational analysis. CoRR abs/cs/0508053.
Turney, P.D. (2008). A uniform approach to analogies, synonyms, antonyms, and associations. CoRR abs/0809.0124.
Veloso, M.M., & Carbonell, J.G. (1993). Derivational analogy in PRODIGY: Automating case acquisition, storage, and utilization (pp. 55–84). Boston, MA: Springer US.
Wilson, W.H., Halford, G.S., Gray, B., & Phillips, S. (2001). The STAR-2 model for mapping hierarchically structured analogs. The analogical mind (pp. 125–159).