AI & Soc (1990) 4:291-313 © 1990 Springer-Verlag London Limited
AI & SOCIETY
Interactive Fiction: Artificial Intelligence as a Mode of Sign Production
Peter Bøgh Andersen and Berit Holmqvist
Department of Information and Media Science, University of Aarhus, DK-8200 Århus N, Denmark
Abstract. Interactive media need their own idioms that exploit the characteristics of the computer-based sign. The fact that the reader can physically influence the course of events in the system changes the author's role, since he no longer creates a linear text but a narrative space that the reader can use to generate stories. Although stories are not simulations of the real world, they must still contain recognizable parts where everyday constraints of time and space hold. AI techniques can be used to implement these constraints. In fact, we suggest that AI is probably best seen as an aesthetic phenomenon.
Keywords: Artificial intelligence; Interactive fiction; Semiotics; Narrativity; Interactive media
Introduction
VENUS is an acronym of the Danish equivalents of Video, Computers, Narrativity and Educational systems. The four words are keywords in a project that started in January 1990 at the Institute of Information and Media Science, University of Aarhus. Its purpose is to develop narrative techniques for multimedia systems and apply them to educational systems. This paper reports on a sub-project that focuses on narrative techniques for interactive media. At the moment, we are building an interactive fiction, and later we will try to transfer the techniques to a museum system where we want to use quasi-fiction to tell visitors in the local museum about the history of Denmark. The reason for starting with fiction is that fiction gives you an opportunity for playing and experimenting with the code. We feel that computer aesthetics sorely needs the artistic labor which other media benefit from. A work of art,
"... compels one to reconsider the usual codes and their possibilities. Every [aesthetic] text threatens the codes but at the same time gives them strength; it reveals unsuspected possibilities in them, and thus changes the attitude of the user toward them [...] the aesthetic experience challenges the accepted organization of the content and suggests that the semantic system could be differently
ordered, had the existing organization been sufficiently frequently and persuasively challenged by some aspect of the text." (Eco, 1976, 274)
The effect of aesthetic labor is to continually challenge existing codes, opening new meaning potentialities, and keeping codes from stagnation and death. We think the utilitarian and slightly insipid discourse of the computer medium would benefit from provocations from a serious computer art.
The Narrative Techniques Project
As mentioned above, we are developing a short story. In order to "challenge the accepted organization" of the computer medium, the story is about love and eroticism, computers normally being viewed as the opposite of emotions and sensuality. The story is about a woman called Eva who breaks out of her current dismal conditions and starts a journey to find a new life. She seeks a new context. On her travels, she passes through different settings, meets different persons and becomes involved in events she cannot control herself. She wanders around in a world of deceit and stubbornly seeks the "true" happiness. The system is mainly based on pictures, but uses text and sound as auxiliary codes.
The first introductory part of the story was finished in summer 1989 (apart from some stylistic flaws). This part explores the graphical possibilities as means of sign production and is described in Bøgh Andersen and Holmqvist (1990). That part went smoothly without too much frustration. However, when we began to work on the next part, where interaction was to be the most important means of expression, we got stuck. We discovered that this part of the project contained far more practical and theoretical difficulties than we had ever dreamt of, and it took half a year before we had sorted things out well enough to get going again. There were three interrelated problems:
1. The balance between reader and author shifts, since the reader must perform some of the functions previously allotted to the author. Who is responsible for getting a satisfactory experience out of the product? The reader or the author? What is the best balance between the two?
2. As a consequence of this, the size of the narrative controlled by the author shrinks, so that the author no longer plans and constructs a 300-page story. Rather, he constructs small narrative pieces that must be combinable in many ways, much like a construction set. What should these narrative pieces look like?
3. How should one compose interactive fiction? We know by tradition how to write a text, but which techniques are suitable for developing a product that is not one, but many texts?
These problems are not special to our project, but are general problems of interactive media, as several researchers in the field have already noticed. (See Yellowlees Douglas (1990), Moulthrop (1989), and Bolter and Joyce (1987) for interactive fiction, and Marshall and Irish (1989) for hypertext teaching systems.) If we define interactive fiction as:
a piece of fiction in which the physical movements of the reader are an intended and integrated part of the aesthetic experience by influencing the course of events in the fiction,
we immediately see that the problems are inherent in the very notion. The reader is no longer an interpreter of a fixed product, but can act physically on it and change it. Maybe the solution is to change our view of the product. Instead of seeing it as a collection of narratives which the reader can read in different order, we should see the product as a narrative world! The reader can act in this world, and it is the author's responsibility to design the world so that using it gives rise to exciting and emotional experiences. Ideas of this kind are suggested by Krueger (1983), who calls it "artificial reality", by Laurel (1986, 1989), and by Smith and Bates (1989), who use the term "synthetic reality". We do not like the term "reality" since it seduces us to see the product as a simulacrum, as a model that somehow resembles reality. This view of art is very narrow, being associated with the literary schools of realism and naturalism. There are many other aesthetic schools (surrealism, expressionism, modernism, etc.), and in order to make room for them too, we shall use the term narrative space. The term "space" (instead of text or drama) signals that we are not creating a collection of independent texts or plays but rather a space in which the reader can act and experience; the adjective "narrative" reminds us that this space, like other media such as film and books, has but one function: it is a means for the reader to generate experiences, emotion and insight, a "machine for generating interpretations" (Eco).
Now, how do we think of and organize such narrative spaces, when we can no longer partition them into a linear sequence of chapters and paragraphs? One way is to envision such a space as populated by different creatures, among which there may be human beings. The dispositions and actions of these humans must be described in computational form, and one place to look for ways of doing it is in artificial intelligence (Laurel (1989) and Smith and Bates (1989)).
Artificial Intelligence
Artificial Intelligence (AI) can be defined as:
Machine behavior that gives a human audience the impression that the machine performs cognitive processes similar to human communication, understanding, reasoning, planning, and feeling.
Since its birth, there have been two rather different views on AI, the ontological and the pragmatic view:
1. The ontological: There are non-trivial similarities between machine and human behavior, and one can learn about one by studying the other.
2. The pragmatic: AI techniques can provide useful and/or exciting interfaces in specific circumstances, but, like all other interface types, they are in principle illusions created by skilled exploitation of aesthetic techniques.
In our project we adopt the pragmatic view: computer systems are viewed as a medium in analogy with books, films, theater, and pictures, and AI techniques are basically seen as aesthetic techniques for producing a particular kind of signs that are placed on the scene of the screen in order to convey specific information and emotions to the "reader". Signs are associations of contents (the signified) to expressions (the signifier). These associations are culturally determined, based on a code shared among a group of sign users. Language is one example, pictorial codes another one. Computers are also interpreted by means of codes, known as interface standards. The computer-based sign uses sound (ranging from the beep to human speech) as well as states and systematic changes of screen pixels as its means of expression (Fig. 1). In this view, AI provides techniques for specifying roles (not human minds) and staged actions and emotions (not real actions and emotions) that aim at creating real experiences and emotions in a real human audience.
Expression | The signifier: sound and screen pixels
Content | The signified: the meaning assigned to sound and pixels
Fig. 1. The computer-based sign.
A Short Classification of Computer-Based Signs
Thus, AI is seen as a special type of computer-based sign. The point in doing this is that we are invited to view system components, including AI components, according to their contribution to the total creation of meaning. A system component can only be justified if it itself has a visual expression or exerts perceptible and interpretable influences on the visual expression of other signs. This point of view has methodological consequences. It makes one skeptical about one aspect of the method proposed by Smith and Bates (1989), namely to provide the basic machinery of the narrative space first and patch staging devices on afterwards. The danger in this method is a divorce of form and content that runs the risk of creating material for plots that are either not exciting or cannot be expressed in the computer medium. In our opinion, the method is based on a problematic conception of staging and content, namely that staging is subordinate to content. If design is based on the sign concept that contains both the signified (the content, the meaning) and the signifier (the staging, the means of expression) as indispensable parts, as two sides of the same coin, one will use a different method, involving a close dialectic between plot writing and visualization. We have described AI as a technique for building computer-based signs; what, then, is a computer-based sign?
In the following we give a short definition and classification of computer-based signs according to the roles they can play in the creation of meaning taking place in the interface. The classification is based on Bøgh Andersen (1990), which also provides a general framework for a semiotic treatment of computer systems and their use context. The signifier of the prototypical computer-based sign is composed of three classes of features:
1. A handling feature is produced by the user and includes key-press, mouse and joystick movements that cause electrical signals to be sent to the processor.
2. A permanent feature is generated by the computer. It is a property of the sign that remains constant throughout the lifetime of a sign token, serving to identify the sign by contrasting it to other signs, e.g. icons in picture-based systems, or letter sequences in textual systems.
3. A transient feature is also generated by the computer, but unlike permanent features, it changes as the sign token is used. It does not contrast primarily to other signs, but only internally in the same sign, symbolizing the different states in which the sign referent can be, e.g. location or highlighting.
The following classification of computer-based signs is based on two criteria: Which features does the sign possess? Can the sign perform actions that affect features of other signs?
Sign type | Permanent | Transient | Handling | Action
Interactor | + | + | + | +
Actor | + | + | - | +
Object | + | + | - | -
Button | + | - | + | +
Controller | + | - | - | +
Layout | + | - | - | -
Ghost | - | - | - | +
Fig. 2. Classification of computer-based signs.
Interactors
Interactors are unique to the computer medium. They exploit features from all three dimensions. The sign token is distinguished from the other signs by permanent features such as size and shape, and during its lifetime it can change transient properties such as location and color, these changes being functionally dependent upon its handling features. In most cases it can perform actions that change transient features in other signs. The heroes of video games, like the lonely soldier penetrating the enemy camp
in the action game Commando (Elite, no year), are nearly always signified by interactors, which provide a very strong identification for the "reader".
Fig. 3. Interactor.
The hero of Dark Castle (Silicon Beach Software Inc., 1986) is a very impressive example. He is able to change shape to simulate walking, fighting, throwing, jumping, climbing, falling, etc., and within the individual actions like jumping there are several varieties. Besides changing his own features, he can act on the enemy signs: kill rats, bats and vultures with stones, hit a henchman with a mace, etc. In many systems, the cursor is the main interactor. In tool-applications, the semantic relationship between the interactive cursor and the sign it affects is often that of instrument to object 1, as in painting programs, where I can paint the "paper" (object) with the "brush" (instrument). In video games the interactor signifies a human 1st person agent (viz. the reader) pitted against hostile agents denoted by actor signs: the hero (agent) fights the goblin (object).
Actors
Another common type of sign lacks handling features but still has actions associated with it. Such signs are able to change position and/or shape on the screen and to influence other signs, but they cannot be influenced directly by the reader, although they may adapt their behaviour according to the way the reader manipulates his interactor. We call them actors. Enemies in action games are normally actors. Dark Castle presents the following horrifying cast of bad guys:
Fig. 4. Actors.
Each exhibits its own pattern of behavior, carefully contrasted to the others. While bats start by hanging for a moment from the ceiling and then flap irregularly down towards the hero, the vultures attack him in one streamlined dive. In tool-applications, actors represent automatic processes that cannot be
interfered with once started. Word processing programs may contain a Repaginate command that inserts new page numbers in the document, an Index that produces a list of indexed terms, complete with the number of the pages on which they occur, and a Table of Contents that generates an automatic table of contents. In the Macintosh interface standard, the permanent clock icon is the standard sign for actors, meaning that the machine is busy and cannot be interfered with.
Controllers
Some signs change other signs although they do not change their own visual appearance. The actions belonging to them are presented indirectly by clearly influencing the behavior of other signs. Controllers are signs that only change properties of other objects, not of themselves.
Fig. 5. Controller.
Non-fiction applications use controllers to divide the screen into work areas, which influence the mouse cursor. The rectangle enclosing the "paper" is a controller, since it changes the cursor when it moves across the border. It follows from the definition that controllers are only fully realized in connection with another sign, in this case the interactive cursor. Their permanent features can always be seen, but the actions associated with them are only perceived through the transient features of the interactive sign. Dark Castle exploits controllers in a very brilliant way. Its scenes depict caves in a castle, and express three modalities: obstacles that prevent movement, paths that allow reader controlled movement, and abysses forcing movement without reader control.
Objects
Object signs possess permanent and transient features, but no handling features: they cannot influence other signs, but can themselves be influenced. Often the user handles an interactor denoting a tool to influence an object sign denoting a work object. The "paper" in a word processor is an example: it is an object sign
that can be modified by means of the interactive text cursor whose permanent features include shape (I) and whose transient features include location.
Fig. 6. Object and interactor.
The text cursor is handled by pressing a key, causing the rest of the text to be pushed one character to the right and leaving a new character in the available space.
Layouts
Layouts lack transient and handling features, and have no function vis-à-vis other signs. They serve as mere decoration and are quite similar to conventional paper-based signs.
Ghosts
Ghosts are signs that lack both permanent and transient features. They are not represented by icons or other identifiable graphical elements, and they cannot be manipulated directly. However, they do have a function in relation to other signs. Like controllers, they show their existence by influencing the behavior of other non-ghost signs. Ghosts are common in games, where they are used aesthetically. For example, some maze games have hidden traps: they cannot be seen, but they cause the protagonist to fall down if he steps on them.
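The classification can be read as a lookup from feature bundles to sign types. The following is a minimal sketch of that reading, not part of the system described in this paper: the names `Features`, `SIGN_TYPES` and `classify` are hypothetical, and the bundle given for the button is an assumption taken from its position in Fig. 2, since the prose does not discuss buttons.

```python
from typing import NamedTuple


class Features(NamedTuple):
    permanent: bool  # constant identifying trait, e.g. an icon's shape
    transient: bool  # state that changes in use, e.g. location
    handling: bool   # driven by the user's key presses or mouse movements
    action: bool     # can change features of other signs


# Feature bundles as read from the descriptions in the text.
# "button" is not discussed in the prose; its bundle is an assumption
# based on its position in Fig. 2.
SIGN_TYPES = {
    Features(True, True, True, True): "interactor",
    Features(True, True, False, True): "actor",
    Features(True, True, False, False): "object",
    Features(True, False, True, True): "button",
    Features(True, False, False, True): "controller",
    Features(True, False, False, False): "layout",
    Features(False, False, False, True): "ghost",
}


def classify(features: Features) -> str:
    """Return the sign type for a feature bundle, or 'unclassified'."""
    return SIGN_TYPES.get(features, "unclassified")


# The hidden maze trap: invisible, unchanging, not handled by the user,
# yet it acts on the protagonist.
trap = Features(permanent=False, transient=False, handling=False, action=True)
print(classify(trap))  # ghost
```

Bundles outside the seven listed types fall through to "unclassified", which leaves open whether such combinations occur in practice.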
AI and Computer-Based Signs
Now, where in this scheme do AI techniques belong? It seems to us that they belong to the category of ghosts. In some cases in a very literal way, since the effect is that the user perceives a kind of "ghost in the machine" (whence the name). The Eliza program is probably the most famous system of this type. Other systems contain large hidden parts whose working is not shown directly, but is used to create relationships between other signs, thus tying them together into a composite sign. Consider for example a database system that allows the user to enter queries in English and uses the database to produce not only literal answers, but answers such as a normal intelligent clerk would have produced, given the same information. One way of analyzing this is as follows: the question and the answer are separate signs (sentences), just as in language. However, if they are related in a specific way, these two signs come to establish a composite sign, often called an exchange in discourse analysis. The exchange itself is an "invisible" relation between two visible sentences, and the program that specifies
the relation is only perceptible through its effect on the component sentences. Therefore, the question-answering algorithm of the program is a ghost.
Interpretation of Computer-Based Signs
The above typology is based on expression features, but what about the meanings of the signs? A preliminary hypothesis would be that it is not possible to assign a fixed meaning to each sign type, but rather a limited range of semantic possibilities. The choice of an interpretation within this range depends upon the genre of computer application we are looking at. Applications can be grouped into genres characterized by a common semantic system ("a universe of discourse"), and within such genres one finds relatively stable mappings from the sign types to the semantic units. In games, the main semantic distinction is between animate and inanimate entities. Animateness is coded by presence, inanimateness by absence, of transient features. On the one hand, we have the hero and the villains coded as interactors and actors, on the other the tools, nourishments, valuables, paths, obstacles and abysses that are often all coded as controllers. Semantically, hero and villains differ in person, since the hero is first person "I", and villains third persons, "they". The person distinction is coded by presence and absence of handling features. As mentioned above, many inanimates are coded as controllers, and the different subtypes can be distinguished according to the nature of the features they influence. Spaces have a function in relation to the locomotion of hero and villains, and can be divided into paths that enable locomotion, obstacles that prevent it, and abysses that make a certain locomotion mandatory. In the Dungeons and Dragons variety, cave walls are obstacles and tunnels are paths. Ghosts are used to code traps. They cannot be seen, but influence the hero's vitality; in fact they often kill him. Finally, most games have layouts that have no function in relation to other signs, for example trees and clouds to indicate a landscape.
In the next section we shall elaborate the idea of AI techniques as aesthetic techniques, and since the genre is fiction, we employ the game interpretation of the computer-based signs.
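The game interpretation described above amounts to two binary codings: transient features code animateness, and handling features code grammatical person. A minimal sketch of that mapping follows; the function name `game_reading` and the returned labels are hypothetical and serve only to illustrate the coding.

```python
def game_reading(transient: bool, handling: bool) -> str:
    """Semantic reading of a sign under the game genre.

    Transient features code animateness; among animate signs,
    handling features code first versus third person.
    """
    if not transient:
        return "inanimate"
    if handling:
        return "animate, first person"
    return "animate, third person"


# Hero (interactor), villain (actor), and cave wall (controller).
print(game_reading(transient=True, handling=True))    # animate, first person
print(game_reading(transient=True, handling=False))   # animate, third person
print(game_reading(transient=False, handling=False))  # inanimate
```

The sketch ignores the further subdivision of inanimates into paths, obstacles and abysses, which depends on the features they influence in other signs rather than on their own features.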
The Drama in the Pub
The easiest way to illustrate the idea is by example. We build the interactive novel by first writing non-interactive episodes in a literary form, then cutting the episode into minimal sign-tokens, and finally defining the sign-types of the tokens by extracting their combinatorial possibilities. The sign-types can then be combined to yield other episodes, and we know that they will yield at least one interesting one, namely the episode they were abstracted from. Here is a little piece of our narrative where the protagonist Eva has entered a pub together with her boyfriend Adam. The names Adam and Eva are there for the convenience of the system designers but are not known by the "reader".
It was an odd pub I entered. It was very quiet and very dark. There were several guests but they were frozen stiff. It reminded me of the set for a modern piece of theater. You can faintly see the actors on
the dark stage but you don't know which roles are attached to them. I sat down at a table and waited for the waiter. While waiting I searched the room at random. My eyes slowly got used to the darkness and I started to discern persons and groups. I could see there was a bar but there was no bartender. My eyes fell on a pint of beer standing on the counter. Was it for me? Where was the bartender? What if I just walked up there and took the beer? I looked around again. Not very far from me a man and a woman both dressed in white were sitting, absorbed in each other. After having looked at the man for a while I noticed him moving nervously on the chair. Was it the bartender who had left his post at the bar for a moment? I was thirsty and hoped to be able to hypnotize him back to where he belonged. As if he could feel my eyes burning his neck he suddenly extricated himself from the woman and slowly walked towards the bar. I followed him, convinced that I would finally get my beer. I turned my head and caught sight of the woman who was now left alone at the table. She followed the man with her eyes. They were wide open but I could not make out if they were expressing yearning or frustration. For a moment my heart was touched. I tried to catch her eyes but she did not see me. She was concentrating on the man as I had been a couple of seconds before. Everything grew mysterious. I went up to the bar to order my beer. My beer! The man was not a bartender. He just stood there leaning against the bar with a pint of beer in his hand. Probably the same one that had been left on the counter because it was not there anymore. Why did he stand there? Had he lost interest in the woman? A husky laughter from the end of the bar drew my attention. A sinful beauty dressed in black and a huge man almost two meters tall with a black beard stood with their arms around each other scornfully leaned towards the counter.
The woman half turned against the man in white, the man half turned towards the dark interior of the pub. So that was why the man in white was stuck to the bar! Captured in the field of beauty. Before my inner eye I saw the image of a spider silently waiting for the innocent fly to be captured in the net. The woman had an unlit cigarette dangling between her lips. The white man looks at the woman and shows her a lighter. The woman gets free from the giant and turns towards the man in white with a provocative smile. For a second I look at the woman at the table. Her eyes tell a story of despair. I am furious. I want to stop the performance. But the man in white is spellbound and when the witch raises her hand and strokes his hair he turns his back to the room and his bride. They embrace each other and I am helpless. A bartender suddenly shows up and places another beer on the bar. I hurry to grab it. I want to offer it to the woman in white. To comfort her and divert her attention from the provocative scene, I put the glass on the table in front of her and receive a little smile in return, but I can't catch her eyes. She is like in a trance waiting to be awakened. I am standing there and in my mind I stroke her transparent innocence when a shadow from a black cloud falls over the décolletage of her white dress. I see the giant rising behind her. He must have come from the inner darkness of the room with evil motives. The woman sits quietly without any suspicion. If she could see me I could warn her. She does not. I want to stop the giant and throw myself in between them and force him back. With all my strength I press my fist into his stomach and for a moment he staggers a few steps backwards but gains his balance again and comes back to the woman. Once more I try but this time the villain takes my arm and wrestles me away from the table. I cannot prevent what is going to happen. He leans over the woman and puts his big hairy hands over her round breasts.
She opens her mouth as to cry but her voice is silent. Horrified she tries to turn herself out of his grasp that is growing tighter and tighter. I can't stand looking at it. But I cannot get close. I get the idea of drawing the white man's attention to what happens. When I look towards the bar I can see that it is too late. The white man is out of control. He is lying on his knees in front of the black woman with his arms around her hips and his head against her bosom. He cannot see her mischievous smile or feel that the hand over his neck is cold and calculating, not warm and tender. It strikes me that her flirt with the man has all been a scheme to make the innocent woman a victim to the evil plans of the giant.
In order to translate these verbal signs into computer-based signs, we simplify them and segment them into 7 parts:
When we enter the pub, we can see several persons. Eva and Adam, both clad in white, are sitting at a table, whereas the demonic Lea and Jason are standing at the bar (of course in black clothes). Adam is moving nervously, then he rises, walks to the bar, and buys a beer (1). Lea turns towards Adam with an unlit cigarette in her mouth. He lights her cigarette, and she smiles provocatively at him (2). From her table, Eva looks at them with a lost and despairing expression (3). Lea strokes Adam lightly over the chest, and he turns his back to the room (4). In the meantime, Jason has moved down to Eva (5), and in spite of her resistance, drags her out on the floor to dance (6). Lea smiles mischievously; her flirt with Adam has all been a scheme to make Eva victim to Jason's evil plans (7).
Narratives like this are bound together by cohesions on many levels that can be made visible if we change the text and note what happens. Sometimes a change only gives a different story, and sometimes it turns it into nonsense. Take, for example, episode (1):
Adam is moving nervously, then he rises, walks to the bar, and buys a beer.
Many changes of sequence or specific events produce deviant and strange episodes, as for example:
Adam is moving nervously, then he sits down, walks to the bar, and buys a beer (wrong sequence).
Adam is moving nervously, then he rises and walks to the bar, and buys a car (wrong object).
Cohesions like these should be controllable by the author, which again means that they should be represented in the system. The reason is not that we want to spend all our programming time simulating behavior in a bar; the reason is that good fiction lives by breaking cohesions like these in a calculated, purposeful fashion. But in order to do that, the cohesions must be under author control. Otherwise, the author has no option but to create narratives that constantly break rules outside his control. In our narrative, we at the moment distinguish between three levels of cohesions:
The Physical Level: Level 1 Signs
The cohesions on the physical level include the contiguities of space, time, and causality. In order for Adam to buy a beer at the bar, he must be there, and this he achieves by walking to the bar, but in order to walk, he must rise from the chair. Similarly, in order for him to light Lea's cigarette, she must have an unlit cigarette in her mouth, and he must be close to her. Deviation from these constraints is a deviation from the normal world of the reader, and should therefore be used for an aesthetic purpose, not forced randomly upon the reader without reason. AI techniques for describing purposeful actions can be applied here. Table 1 shows four actions articulated into preconditions that must be satisfied before the action can take place, a method for performing the action (which in our case is simply a description of how to move the graphics on the screen and change their shape), and a list of results that are obtained when the action is successfully accomplished. 2
Schemata like these can be used in various ways. One common way to use them is to assign a goal to an actor, e.g. to have a beer. The actor selects the action that contains the goal in its result list, and checks its preconditions. If a precondition is not fulfilled, the actor selects a new action (a means) that fulfills the precondition (e.g., walking to the bar will fulfill the precondition of being at the bar of the beer-buying action). This can go on until the actor hits upon an action whose preconditions are all fulfilled. Then the constructed plan is executed. The schemata must be interpreted by the actors, so that one of the results
is designated as the goal, viz. the state the actor wants to achieve. For example, walking to C is distinguished from leaving B in that the first action interprets being at C as the goal, while the latter sees not being at B as the goal. Similarly, buying and selling differ only in that selling assumes the goal of the seller (to get a sum of money), while buying presents the goal as the acquisition of the commodity. Table 1 shows some of the actions behind our short episode.

Table 1. Level 1 signs

| Level 1 signs | Preconditions | Method | Results |
|---|---|---|---|
| A rises | A is sitting | A changes position to upright | A is standing |
| A walks to C | A at B; A not at C; A is standing | A is moving in the direction of C | A not at B; A at C; A is standing |
| A buys B from C | A at C; C has B; A has money | C hands B to A, and A hands money to C | A at C; C has money; A has B |
| A lights a cigarette for B | A has a lighter; B has a cigarette; the cigarette is unlit; A is at B | A applies lighter to cigarette | A has a lighter; B has a cigarette; the cigarette is lit; A is at B |
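The goal-driven use of the schemata described above can be sketched as a small backward-chaining planner. This is only an illustration under stated assumptions, not the implementation used in the system: the fact strings, the `removes` field (Table 1 lists only results, so the facts that stop holding are our guess), and the function name `plan` are all hypothetical, and negative preconditions such as "A not at C" are left out for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    name: str
    preconditions: frozenset
    removes: frozenset  # assumed: facts that no longer hold after the action
    results: frozenset


# Three rows of Table 1, instantiated for Adam, the table, and the bar.
ACTIONS = [
    Action("Adam rises",
           frozenset({"Adam is sitting"}),
           frozenset({"Adam is sitting"}),
           frozenset({"Adam is standing"})),
    Action("Adam walks to the bar",
           frozenset({"Adam at table", "Adam is standing"}),
           frozenset({"Adam at table"}),
           frozenset({"Adam at bar", "Adam is standing"})),
    Action("Adam buys a beer from the bartender",
           frozenset({"Adam at bar", "bartender has beer", "Adam has money"}),
           frozenset({"bartender has beer", "Adam has money"}),
           frozenset({"Adam at bar", "bartender has money", "Adam has beer"})),
]


def plan(goal, state, actions=ACTIONS, depth=10):
    """Return (steps, resulting_state) that achieves goal, or None."""
    if goal in state:
        return [], state
    if depth == 0:
        return None
    for action in actions:
        if goal not in action.results:
            continue  # this action does not produce the goal
        steps, current, feasible = [], state, True
        for pre in sorted(action.preconditions):
            sub = plan(pre, current, actions, depth - 1)  # find a means
            if sub is None:
                feasible = False
                break
            sub_steps, current = sub
            steps += sub_steps
        # All preconditions must still hold after the sub-plans have run.
        if not feasible or not action.preconditions <= current:
            continue
        current = (current - action.removes) | action.results
        return steps + [action.name], current
    return None
```

Asking for the goal "Adam has beer" from a state in which Adam sits at the table with money yields the three-step plan of the episode: rise, walk to the bar, buy the beer.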
We believe that the aesthetic interpretation of AI can be generalized, although we will not elaborate the argument here. In our interpretation the standing discussion for and against AI is pointless, and both parties are wrong. Adherents of AI are wrong because they cannot recognize an illusion when they see one, mistaking the light spots on the screen for the real Paul Newman. Opponents recognize the illusion and despise it, but they have no reason for this contempt; all artists work with illusions but still take their job deadly seriously.
The Cognitive and Social Level: Level 2 Signs

But control of the physical cohesions is not enough, since humanly interpreted behavior is an indispensable ingredient in stories. In our story, the reader and Adam and Eva make one interpretation of the actions of Jason and Lea, namely that Lea is interested in Adam, but this interpretation turns out to be a deceit; their real purpose is to divert Adam and make Eva vulnerable to Jason's advances. What we need is second level signs that interpret first level signs and add a new content to them. Thus, whereas the physical signs use screen displays as their expression and denote fictive physical actions, the second level signs use the first level signs as their expression, and denote psychological and social relations. Signs with this structure are often called connotation signs.

The main theme in the story is the protagonist's search for a context she feels at home in, so dissolving and establishing social bonds are important processes. Important states are being alone versus being together. In the example, a bond is dissolved when Adam abandons Eva, and new bonds are created when Adam contacts Lea.
As shown below, one can abandon someone by walking to a different place, and a male can make contact with a female by lighting her cigarette.

Table 2. Level 2 signs

| Level 2 signs | Preconditions | Method | Results |
|---|---|---|---|
| A abandons B | A and B are together; C is different from B | A walks to C | A is alone; B is alone |
| A contacts B | A is alone; B is alone; A is male; B is female | A lights a cigarette for B or A speaks to B or... | A and B are together |
The schemata show how first level signs like "A walks to C" and "A lights a cigarette for B" are part of the method, and thereby of the expression, of the second level signs. However, not any walk will count as an abandonment, and not any cigarette lighting as a proposal for contact. The two level 1 signs must be designed in a special way. In the first case, we might show Eva's despairing expression, in the second Lea's provocative smile, in order to provide the extra connotations.
The Narrative Level: Level 3 Signs

So far, an episode must obey the constraints of both the physical and the social world. In other words, its signs must be well-formed both as level 1 and as level 2 signs. But there is more to narrativity than that. Our story is built around two poles: harmony and disharmony. The protagonist of course searches for harmony, but it is not easy to get, and as a part of her quest she has to open herself to experiences involving the opposite, disharmonic pole. Harmony is characterized by security and normality but runs the risk of boredom; disharmony, on the other hand, is a world of unpredictable events, which may turn out to be dangerous. The oscillation between harmony and disharmony is the basic theme of the story, but how do we get these abstract concepts to interact with the more concrete happenings on the screen? What kind of system structure will support us?

Maybe the answer is that the basic system structure should be identical to a literary analysis. At the moment, we are experimenting with using the so-called semiotic quadrant (Greimas, 1966, 1970) as our basic narrative structure. In the example, the harmony part is associated with groups of two people of different sex, but of the same color (thus Adam and Eva, dressed in white, are a harmonic group, and so are the black-clad Lea and Jason), while disharmony, its antonym, involves people of different colors (Eva and Jason form a disharmonic couple). In addition, in harmony the everyday constraints hold, while disharmony is a carnival, where everything can be turned upside down.

According to semiotic narratology, a story can never send a protagonist directly from one antonym to the other. There must be "stepping stones", namely the negatives of the two poles. We are fairly sure the negative of harmony should be a vulnerable state, where the person is alone, while the negative of disharmony probably should be a state of invulnerability or liberation.
The quadrant defines the global narrative routes of the story. Below is shown the popular butterfly route:

Harmony = Everyday. Number: two; Gender: different; Color: same
Disharmony = Carnival. Number: two; Gender: different; Color: different
Non-harmony = Vulnerability. Number: one
Non-disharmony = Invulnerability. Number: one

(The arrows between the poles carry the names of narrative operations.)

Fig. 7. The semiotic square.
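The routes defined by the square can be sketched as a small transition table in which a protagonist may only move along an arrow, never directly between the two antonyms. The names "exposure", "absorption", and "withdrawal" are the operations the text assigns to the arrows; "return" is a hypothetical label for the last step back to harmony, and the function names are our own.

```python
# Poles of the semiotic square and the narrative operations between them.
ROUTE = {
    ("harmony", "exposure"): "non-harmony",          # vulnerability
    ("non-harmony", "absorption"): "disharmony",     # carnival
    ("disharmony", "withdrawal"): "non-disharmony",  # invulnerability
    ("non-disharmony", "return"): "harmony",         # "return" is a placeholder name
}


def available_operations(pole):
    """List the narrative operations that may start from this pole."""
    return [op for (source, op) in ROUTE if source == pole]


def apply_operation(pole, operation):
    """Move along one arrow of the square, or fail if no such arrow exists."""
    try:
        return ROUTE[(pole, operation)]
    except KeyError:
        raise ValueError(f"{operation!r} is not possible from {pole!r}") from None
```

With this table, a story in harmony can only offer an exposure next; an attempt to apply "absorption" directly from harmony is rejected, which mirrors the stepping-stone requirement.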
The labels of the arrows denote global narrative operations, and in our case, they are level 3 signs that are expressed by level 2 signs. For example, moving Eva from harmony to vulnerability is called "exposure", which is expressed by the social level 2 episode of "abandoning". Driving her on to the disharmony pole is called "absorption", and will be done by a contact-creating action of Jason, namely forcing himself upon her against her will. In the continuation of the example we could have chosen to let the narrator succeed in wrestling Jason down to the floor. Thereby Eva would again be alone, this time not abandoned but liberated. This would be "withdrawal" to the non-disharmony pole; and Adam could return to the table, take Eva by the hand, and leave the pub together with her, bringing us back to the harmony pole. If we again look at the little episode, we can see a complicated three-level sign structure governing the events: the actions must satisfy constraints from all three levels: the physical level, the social level, and the narrative level:
[Figure: the level 3 signs (exposure, absorption) are expressed by level 2 signs (e.g. A abandons E, L contacts A), which are in turn expressed by level 1 signs (e.g. A at table, A at bar).]

Fig. 8. The three-level sign structure of the story.
The story works as follows: if the story is in a state of harmony, then the narrative constraints for displaying an exposure are satisfied. One possibility is to manifest this as a process where Adam abandons Eva, but this is only possible if they are together at the moment. If not, exposure could be manifested as, e.g., Eva getting rid of a protection she had, or launching herself into a dangerous journey. If the exposure is realized as an abandonment, this again can be manifested in various ways: one possibility is the actual one, namely that Adam walks to the bar, which again requires the physical condition that Adam and Eva are at the table. Had they not been physically together, the same effect could be achieved by means of a letter or a phone call.

These constraints define the combinatorics of the narrative signs. Since the narrative is interactive, we cannot know beforehand which state will obtain, so the system must be able to process the constraints in order to determine what will happen next, i.e. the type of sign to display next.

There are still parts of the literary version that are not accounted for yet, namely the remarks on the author's interpretation of what he experiences in the pub and what kind of actions he takes, e.g. "Her eyes tell the story of despair. I am furious. I want to stop the performance. [...] I put the glass on the table in front of her." In a written text the author can be present, and we use this presence to indicate possible interpretations and possible actions from the reader of our interactive narrative.
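The constraint processing described above, in which a level 3 sign is expanded into a level 2 sign and then into a level 1 sign that fits the current state, can be sketched as follows. This is a minimal illustration, not the authors' system: the fact strings, the table names `LEVEL3_CONDITIONS` and `REALIZATIONS`, and the fact "Eva has a protection" are assumptions made for the example.

```python
# Level 3 operations and the narrative state they require.
LEVEL3_CONDITIONS = {"exposure": {"story in harmony"}}

# For each sign: possible expressions one level down, with required facts.
REALIZATIONS = {
    "exposure": [
        ("Adam abandons Eva", {"Adam and Eva are together"}),
        ("Eva gives up a protection", {"Eva has a protection"}),
    ],
    "Adam abandons Eva": [
        ("Adam walks to the bar", {"Adam and Eva at the table"}),
        ("Adam sends a letter", set()),  # works without physical proximity
    ],
}


def realize(sign, state):
    """Expand a sign top-down into the chain of signs that express it."""
    options = REALIZATIONS.get(sign)
    if options is None:
        return [sign]  # no further expansion known: treat as displayable
    for expression, required in options:
        if required <= state:
            chain = realize(expression, state)
            if chain is not None:
                return [sign] + chain
    return None  # no expression fits the current state


def next_signs(state):
    """Return the chain of signs to display next, or [] if none applies."""
    for operation, required in LEVEL3_CONDITIONS.items():
        if required <= state:
            chain = realize(operation, state)
            if chain is not None:
                return chain
    return []
```

When Adam and Eva sit together at the table in a harmonious story, the sketch selects exposure, realized as abandonment, realized as the walk to the bar; if they are together but not at the table, the letter is chosen instead.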
AI Can Provide the Substance for Artistic Form
Let us now try to generalize a little from this example. Art consists in giving form to a substance; the sculptor gives form to his stone, the painter to his colors, the dancer to body movements, and the poet to linguistic material.³

"We may define as invention a mode of production whereby the producer of the sign-function chooses a new material continuum not yet segmented for that purpose and proposes a new way of organizing (or giving form to) it in order to map within it the formal pertinent elements of a content-type." Eco (1977: 245)
What is the substance of a narrative space? We have chosen to let the recognizable features of the readers' everyday life, signified by signs of level 1 and 2, be its substance. These features must be built into narrative space, not because we want to make a model of reality, but because we need some recognizable stuff, some material, we can form into an aesthetic experience, much as the sculptor needs a stone, and a painter paint and canvas.

The above description is based on semiotic, mostly literary, theory, but similar ideas have been proposed from a drama point of view by Laurel (1989), who bases her arguments on Aristotle's philosophy: to the substance, consisting of everyday experiences that are moulded by dramatic form, corresponds her material cause:

"Material cause: The material cause of a thing is the force exerted on it by what it's made of. So, to pursue the architecture example, the material cause of a building includes stones or concrete or wood, glass, nails, mortar, and whatever else it's made of ... the material cause [of drama] is the stuff it's made up of - namely, the sayings and goings of the characters and the affordances of the dramatic situations." Laurel (1989: 8-9)

and the form, imposed by the connotative signs, corresponds to her formal cause:

"Formal cause: The formal cause of a thing is the force exerted on it by what it's trying to be ... So in drama, the formal cause of a play is the playwright's notion of what it will be when it's done (which probably includes a notion of genre and style as well as story or plot)." Laurel (1989: 8-9)
We believe that AI techniques could be helpful in creating this stuff for narrative spaces. The restaurant script programmed by Schank and Abelson (1977) seems to be able to account for at least signs of level 1 and 2: it describes the everyday rules one applies subconsciously when dining at a restaurant. But there is one further point related to the form/substance distinction to make. The substance of narrative spaces differs from the stone manipulated by the sculptor by consisting of signs. For example, the action of fetching a beer will be constructed out of a sequence of graphics shown after each other at different locations on the screen.
Expression: Graphics shown at different locations
Content: Fetching a beer

Fig. 9. Level 1 sign.
The connotative sign uses this sign as an expression substance for a new sign. The context in which the beer-fetching occurs, the speed with which it occurs, and the way the graphics are drawn can be used to associate a new, specific significance to the action of beer-fetching: in the example it counts as an act of desertion and betrayal on the part of the man, exposing the female hero to the dangers of the two demonic characters.
Expression:
    Expression: Graphics shown at different locations
    Content: Fetching a beer
Content: Betrayal, exposure, desertion

Fig. 10. Level 2 sign.
However, the process in which the connotative sign uses the object sign to signify a new content does not consist of just taking over the object sign as it is. As said above, it involves using the object sign as substance for a new form, and when a substance is given a new form, new distinctions are introduced into it.
Viewed merely as an object sign, the beer-fetching episode classifies some properties of the graphic sequence as distinctive form features that must be present in order for the program execution to count as signifying beer-fetching: for example, the initial and final locations of the graphic must be different, otherwise we would not use the term fetch. Also, a beer-graphic - and not a picture showing a car - must be associated with the actor-graphic in the final state, otherwise it would not be beer-fetching. However, the facial expression of the actor, his speed, and much of the context do not constitute distinctive features of form, but are only variants. Faster and slower movements would still signify beer-fetching.

When we use this sign as stuff for a connotation sign, however, we introduce new distinctive form elements into it that were absent in the level 1 sign. Now the context may be significant, or the speed, or the facial expression of the actor. Apart from moulding the lower level signs in a new way, other things may happen as well: the aesthetic form may, for example,

1. add a new significant syntagmatic structure (the three trials of fairy-tales, antithesis, parallelisms);
2. add new paradigmatic structure (in fairy-tales like Cinderella, the notion of "old" is associated with "bad" since the elder sisters are "bad", while "young" is connected to "good", the youngest child always being "good" to animals in distress. These couplings do not exist in the language that provides the substance for the fairy-tales - they are added by their aesthetic structure);
3. loosen constraints: although language separates humans from non-humans so that some verbs can only be used about one of them, birds normally not being able to "talk", narrative structure can loosen these constraints: in fairy-tales, inanimates can talk and horses fly;
4. invert real constraints: although the peasant in feudal society must stay in his situation, he can become a king in the fairy-tales.
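The contrast between distinctive features and variants discussed above can be sketched with two predicates over the same graphic sequence. The dictionary keys and the rule that a companion's despairing expression turns beer-fetching into desertion are hypothetical choices made for this illustration; the text only says that context, speed, or facial expression may become significant at level 2.

```python
def is_beer_fetching(seq):
    """Level 1: only the change of location and the beer graphic are distinctive."""
    return seq["start"] != seq["end"] and seq["carried_at_end"] == "beer"


def connotes_desertion(seq):
    """Level 2: a feature that was a mere variant becomes distinctive (assumed rule)."""
    return is_beer_fetching(seq) and seq["companion_expression"] == "despair"


walk = {
    "start": "table",
    "end": "bar",
    "carried_at_end": "beer",
    "speed": "slow",                    # a variant at level 1: any speed will do
    "companion_expression": "neutral",  # ignored by the level 1 sign
}
```

Changing `speed` leaves both readings untouched, whereas changing `companion_expression` to "despair" leaves the level 1 reading as it is but adds the level 2 content.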
Implementation Issues

How could these ideas be implemented on a computer? How can the actors, including the user/reader, with their movements, goals and emotions on the one hand, and the aesthetic ideas of the author on the other hand, be made to play together? Laurel (1989) contains a sketch of a possible program architecture, parts of which we plan to borrow in our system. Our system will contain actors that are programmed to have "normal" reactions, intentions and feelings. All the time they figure out what steps they would take next in a normal world, but they are not allowed to take them immediately. They send their goals and desires to the controlling dramatic episodes, asking: "I want to do this or this or this. Can I?". The episode receives these suggestions; if one of them can be exploited aesthetically, the actor is allowed to do it. If none can be used, the episode replies: "No, you cannot have your way. Since this is fiction and you are only an actor, I want you to do this instead".
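The negotiation between actors and dramatic episodes described above can be sketched as a simple message exchange. The class and method names (`Episode`, `Actor`, `decide`, `act`) and the example actions are hypothetical; the sketch only assumes that an episode knows which actions it can exploit aesthetically and which action to impose otherwise.

```python
class Episode:
    """Dramatic episode: grants a proposal only if it serves the drama."""

    def __init__(self, wanted, fallback):
        self.wanted = set(wanted)  # actions the episode can exploit aesthetically
        self.fallback = fallback   # action imposed when no proposal fits

    def decide(self, proposals):
        for action in proposals:
            if action in self.wanted:
                return action      # "Yes, you may do this."
        return self.fallback       # "I want you to do this instead."


class Actor:
    """Character with 'normal' desires that must be cleared by the episode."""

    def __init__(self, name, desires):
        self.name = name
        self.desires = list(desires)  # what the actor would do in a normal world

    def act(self, episode):
        return f"{self.name}: {episode.decide(self.desires)}"
```

For instance, an episode that wants Adam at the bar lets him walk there when he proposes it, while an actor whose proposals are all useless to the episode receives the fallback action.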
The present version of our system contains the following kinds of objects:
[Figure: messages pass between the paths, obstacles, and abysses (controller), the character, and the reader.]

Fig. 11. General message-passing in the system.
There are three visible signs: controllers denoting the parts of the landscape that influence the characters of the play, the characters themselves, implemented as actors, and the reader representative, which is of course an interactor. In addition to these visible signs, there are three types of ghost signs that only show their existence by influencing the visible signs. The actor's mind is programmed to generate behavior we can interpret as everyday sensible goal-directed behavior, and the dramatic episode adds aesthetic requirements to this behavior in the manner described above. However, in a good narrative, the actors do not only have conscious goals and plans, but also subconscious urges, impulses and passions. This is what the emotional relations object is used for. Emotional relations are expressed by vectors between persons, and between persons and objects. Below is shown a situation where a woman is repelled by the man in the black sweater who is attracted to the woman.
Fig. 12. Vector based interaction.
On the screen display the vectors are of course invisible, and can only be seen through their effects on man and woman. The length of the arrows symbolizes the force of the vector, and the circles their range. Thus, the negative vector from the woman towards the man in the black sweater is not active at the moment, since the man is outside its range, while the man's attraction towards the woman is operative, because she is inside the range of the vector. The system works by moving the persons stepwise according to the vectors that influence them, so the man will slowly move towards the woman, while the woman will not do anything. However, when the man comes inside the small circle, the woman's negative vector begins to work ("she discovers him"), and she backs away while he follows.

Vectors are useful for expressing concepts like ambiguous feelings (positive and negative vector towards the same person) and conscious goals that are hidden from the reader/user. In our example, the narrator interpreted the situation with Jason and Eva as a dangerous one, wanting to protect Eva. Let us say that Jason is implemented as an interactor and that the user drags him away from Eva. But the reader has misinterpreted the situation, since Eva does not want to be rescued, so she follows and approaches Jason. This can be achieved by a positive vector from Eva to Jason. Vector signs are ghost signs; there is no screen element that denotes a vector, it is only indirectly manifested in the movements and postures of the actors.

The dramatic episode is the third object; it gives dramatic form to the physical and psychological material. Summarizing, we can see that the actor is caught in a net of possibly conflicting demands:

1. from his conscious plans (be rational; do this in order to achieve that)
2. from his emotional relations (avoid or approach a person)
3. from the narrative event (be entertaining)
4. from the reader (move there or notice this).
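The vector mechanism described above can be sketched as a stepwise simulation. This is an illustration under our own assumptions: each vector has a signed force (the step length per tick, positive for attraction and negative for repulsion) and a reach (the radius of the circle), and the names `Vector`, `Person`, and `step` are hypothetical.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Vector:
    target: "Person"
    force: float  # assumed: step length per tick; positive attracts, negative repels
    reach: float  # the vector is only active within this distance


@dataclass
class Person:
    name: str
    x: float
    y: float
    vectors: list = field(default_factory=list)


def step(people):
    """Move every person one tick according to the vectors acting on them."""
    moves = []
    for p in people:
        dx = dy = 0.0
        for v in p.vectors:
            dist = math.hypot(v.target.x - p.x, v.target.y - p.y)
            if 0 < dist <= v.reach:  # target is inside the circle
                dx += v.force * (v.target.x - p.x) / dist
                dy += v.force * (v.target.y - p.y) / dist
        moves.append((p, dx, dy))
    for p, dx, dy in moves:  # apply all moves at once, after computing them
        p.x += dx
        p.y += dy
```

With a long-reach attraction from the man and a short-reach repulsion from the woman, the man approaches while the woman stands still; once he enters her small circle she starts to back away while he follows.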
Interaction and Enactment

Finally we want to present some loose thoughts on the problem of interaction and enactment. When we are talking about human-machine interaction we are always referring to a physical process, whereas when we are talking about fiction, be it drama, literature or film, interaction refers to mental processes. In interactive fiction these two modes of interaction have to play together.
To Read with Your Body

Rhythm plays a crucial role in all forms of physical interaction (Buxton, 1986) and there is no reason to disregard it in narrative systems. As said before, we employ the game interpretation of the computer based signs since it is fiction we are building. It is true that the fascination and enactment in computer games is built upon the winner/loser paradigm, but that is just one side of the coin; the other side is the physical interaction and its rhythm. To win you have to master your body movements in a meaningful way. In many games the mastery of physical movements seems to be the superior source of enactment, but very seldom do these movements have any deeper connection to the rhythm in the real world. Even if rhythm is exciting in itself, we want it to be interpretable within the fiction, and we want to exploit tensions between rhythm and content, as we see it in verse.

Since there is no standard code for interpretation of interactive rhythm available, the best solution seems to be to design rhythm as a motivated sign that the reader can interpret by means of her experiences from the real world. We must give the audience a point of departure in its own daily life experiences that enables it to give meaning to the events in the story. If you want to catch a butterfly with your bare hands in the real world you have to move very quietly and close your hands around it with tenderness so that it does not escape and its wings are not hurt. A double click with the mouse won't do in real life. These real life experiences could be used to make the physical interaction a meaningful part of the interactive narrative.

In our narrative we have in fact programmed a butterfly so that it can read the speed of the mouse. If the user is too fast the butterfly flies away. If he moves the mouse quietly the butterfly will sit still, but in order to finally catch it the user has to press the mouse button and hold it down for a while. If he lets go of the mouse button too early the butterfly will escape.
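The butterfly's behavior described above can be sketched as a small state machine driven by mouse speed and button state. The thresholds `max_speed` and `hold_time`, the state names, and the `update` method are assumptions for this illustration; the text gives no concrete values.

```python
from dataclasses import dataclass


@dataclass
class Butterfly:
    max_speed: float = 50.0  # assumed threshold, pixels per second
    hold_time: float = 1.0   # assumed: seconds the button must stay down
    state: str = "sitting"   # "sitting", "caught", or "escaped"
    held_for: float = 0.0

    def update(self, mouse_speed, button_down, dt):
        """Advance by dt seconds given the current mouse speed and button state."""
        if self.state != "sitting":
            return self.state            # caught or escaped: nothing more happens
        if mouse_speed > self.max_speed:
            self.state = "escaped"       # the user approached too fast
        elif button_down:
            self.held_for += dt
            if self.held_for >= self.hold_time:
                self.state = "caught"    # held gently for long enough
        elif self.held_for > 0:
            self.state = "escaped"       # button released too early
        return self.state
```

A slow approach followed by a sustained press ends in "caught", while a fast approach or an early release ends in "escaped".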
To be a Part of the Narrative
In addition to the physical body rhythm, we have to look at the narrative role of the user. In Bøgh Andersen and Holmqvist (1990) we use the classical actant model of Propp and Greimas to define the roles in a narrative:
Fig. 13. The actant model.
There is one important point to be made in this connection. In non-interactive media, it makes no sense to enter the reader directly into the actant schema, but since our narrative is interactive, we have to assign an actant role to the reader as well. Many computer games are based on the Aristotelian ideal of first-persons with the hero as the subject. By making the hero the only interactive sign the
reader is forced to identify with him and take the same role, but in our case the role of the subject won't do since it would create pornography. But enactment does not necessarily mean that you act as the main character. On the contrary, enactment can mean that you want to warn or defend the main character. Therefore, we give the user the roles of helper or antagonist, which in turn partitions the interactive actions into two main classes: helping somebody to do something or preventing it.

The role you assign to the user has nothing to do with whether he is visible or not, but the whole concept of interaction somehow contradicts the idea that the user is represented by an interactor on the screen. Again we refer to real life experiences. In real life you do not pick yourself up by your bootstraps or take yourself by the hand for a walk. Therefore we want the cursor to be his only representative. We do not think this will disturb the striving for enactment. The important thing is the way it is done. In the real world you cannot just move people around as you want. You can tell them to move or you can in other ways draw their attention to things and events. In our narrative we use spotlights as a means to draw the actor's attention to events in another place of the narrative space, and instead of directly moving an actor you can try to persuade him by moving his shadow.
To be Both Actor and Director
The user's actions are not just a part of the story on the narrative level. By acting in the story he acts on it at the same time. He produces meta signs because he is the subject in the meta-narrative about instructing the actors and setting the stage. The object he desires is a good story. This duality makes it very difficult to maintain an Aristotelian ideal of enactment. We think that the computer medium invites a form of story telling somewhere in between Aristotle and Joyce, possibly in line with the reflective Brechtian tradition of "Verfremdung", where the audience is constantly forced back to reality when the risk of absorption is greatest.

In modern experimental literature this tradition of mental interaction is encouraged and made explicit in different ways. One example is The French Lieutenant's Woman by John Fowles, where three different views on a certain historical period are presented and mingled: one level at which the story takes place, a fictive love story; one level where the author comments on the Victorian age, trying to explain the difference in moral attitudes, natural science and structure of society between the 19th and 20th century; and one level where he comments on what it is to write and read a novel. The interplay between the different views forces the reader to continuously make reinterpretations of what she has just read, thereby being mentally interactive.⁴

This type of staging is not linear as the classical drama but a spiral.⁵ It does not follow a cause and effect logic but tries to give many different views on the same problem. It has a plot and a progression in time, but the different layers make you stop the clock and add some new pieces to the interpretation. The enactment relies more on the intellectual challenge of penetrating a general problem than on the direct sensing of one individual's problem.
Notes

1. The case theory of Fillmore (1968, 1977) was originally developed to account for the syntax of verbal language, but turns out to be easily adaptable to describe the syntax of computer-based signs.
2. See Black and Bower (1980) and Meehan (1977) for a treatment of narratives as problem solving.
3. Although the idea of giving form to a substance may at first sight seem a strange way of describing programming, some authors, e.g. Nygaard and Sørgaard (1986, p. 380), in fact have described programming as the imposition of a form (which they call structure) upon a substance (which they call a process).
4. We owe this description to our student Søren Aalykke.
5. We owe this idea to our colleague Jørgen Bang.
References

Andersen, P. Bøgh and B. Holmqvist (1990). Narrative computer systems. The dialectics of emotion and formalism. Paper presented at the conference Computers and Writing III, Edinburgh. (Swedish translation in J. F. Jensen (ed) (1990). Computer-kultur - computer-medier - computersemiotik [Computer culture - computer media - computer semiotics]. Nordic Summer University.)
Andersen, P. Bøgh (1990). A Theory of Computer Semiotics. Semiotic Approaches to Construction and Assessment of Computer Systems. Cambridge University Press. Cambridge.
Bates, J. (no year). Oz Project. Overview and Schedule 1989-1992. School of Computer Science, Carnegie Mellon University. Pittsburgh.
Black, J. B. and G. H. Bower (1980). Story understanding as problem solving. Poetics. 9. 223-250.
Bolter, J. D. and M. Joyce (1987). Hypertext and creative writing. Hypertext '87 Proceedings. 41-51. The ACM: New York.
Eco, U. (1976). A Theory of Semiotics. Indiana University Press: Indiana.
Buxton, W. (1986). Chunking and phrasing and the design of human-computer dialogues. Proc. of the IFIP World Comp. Congress. 59-64. Dublin, Ireland, Sept 1-5, 1986.
Fillmore, Ch. J. (1968). The case for case. In E. Bach and R. T. Harms (eds) Universals in Linguistic Theory. Holt, Rinehart and Winston: London, New York, Sydney, Toronto.
Fillmore, Ch. J. (1977). The case for case reopened. In P. Cole and G. M. Sadock (eds) Syntax and Semantics 8. Grammatical Relations. Academic Press. New York.
Greimas, A. J. (1966). Sémantique Structurale. Larousse. Paris.
Greimas, A. J. (1970). Du Sens. Essais Sémiotiques. Éditions du Seuil. Paris.
Joyce, M. (1987). Afternoon. Jackson. Michigan.
Krueger, M. W. (1983). Artificial Reality. Addison-Wesley Publ. Comp. Reading, Mass.
Laurel, B. K. (1989). Interactive fantasy: a dramatic model. Interactive arts lecture series. Carnegie Mellon University. Pittsburgh.
Laurel, B. K. (1986). Interface as mimesis. In D. A. Norman and S. W. Draper (eds) User Centered System Design. Lawrence Erlbaum. Hillsdale. New Jersey.
Marshall, C. C. and Irish, P. M. (1989). Guided tours and on-line presentations: how authors make existing hypertext intelligible for readers. Hypertext '89 Proceedings. The ACM. New York. 15-26.
Meehan, J. R. (1977). TALE-SPIN, an interactive program that writes stories. In 5th International Joint Conference on Artificial Intelligence. 91-98. MIT. Cambridge, Mass.
Moulthrop, S. (1989). Hypertext and "the Hyperreal". Hypertext '89 Proceedings. 259-367.
Nygaard, K. and Sørgaard, P. (1987). The perspective concept in informatics. In Bjerknes, G. et al. (eds) Computers and Democracy. Avebury. Aldershot.
Propp, V. (1975). Morphology of the folktale. Univ. of Texas Press. Austin and London.
Schank, R. and Abelson, R. (1977). Scripts, Plans, Goals and Understanding. Erlbaum. Hillsdale, New Jersey.
Smith, S. and Bates, J. (1989). Towards a theory of narrative for interactive fiction. CMU-CS-89-121. School of Computer Science, Carnegie Mellon University. Pittsburgh.
Yellowlees Douglas, J. (1990). Is there a reader in this labyrinth? Paper presented at the conference Computers and Writing III, Edinburgh.
Correspondence and offprint requests to: Peter Bøgh Andersen, Associate Professor, Department of Information and Media Science, University of Aarhus, Niels Juelsgade 84, DK-8200 Aarhus N, Denmark. Email:
[email protected].