REVIEWS

English Words Mirrored

J. L. DOLBY and H. L. RESNIKOFF, The English Word Speculum, vols. I-V, The Hague, Mouton & Co., 1967.
By W. Nelson Francis

THIS SET of five volumes of about 350 pages each, produced by photographing double-column computer printout, with eleven pages of letterpress introduction, comprises six different sortings of all or part of a list of 73,582 English words containing all "left-justified bold-face words [i.e. main entries] in the Shorter Oxford Dictionary" (I.iii). Each sorting was derived from a "master list" containing more information than appears in any of the individual lists. Specifically, the 79-character line for each entry of the master list consists of the following:
1. The Main Entry, 20 characters (only five entries in the Shorter Oxford turned out to be longer).
2. The Reversed Entry, 20 characters, a left-justified reverse spelling.
3. The Vowel String Count, 1 character, indicating whether the entry is hyphenated initially (code S), finally (P), or internally (H); or if it contains an internal space (B) or an apostrophe (A); and if none of these, how many "vowel strings" it has. A vowel string is taken as any instance of A, E, I, O, U, Y or combination of these without intervening consonant, final E not being counted. Although the authors are careful not to say so, it is apparent that in an overwhelming number of words, the number of vowel strings coincides with the number of graphic syllables. Exceptions are words like chaos, naive (two syllables, one vowel string), pickle, euchre (two syllables, one vowel string), embryo, employee (three syllables, two vowel strings), and, because no account is taken of accented -é in words of French origin, cloisonné (three syllables, two vowel strings).
4. Status, 1 character, indicating whether standard status is accorded the word by either the Shorter Oxford (X) or by Webster III (W), or both (B). If neither dictionary admits the word as standard, this position is left blank.
5. Shorter Oxford Part of Speech and Status, 10 characters, each position corresponding to a part of speech label (noun, adjective, verb, adverb, preposition, conjunction, pronoun, interjection, past, other). In each position a code indicates the status given the word as that part of speech (standard, archaic, dialectal, obsolete, etc.); thus handsome is coded standard as adjective and adverb, obsolete as verb.
6. Webster's Third Part of Speech and Status, 10 characters, as in the preceding.
7. Merged Part of Speech and Status, 10 characters, indicating consensus of the two dictionaries, or if they disagree, the status given by the Shorter Oxford.
8. Sequence Number, 7 characters, serving to indicate "dictionary order" of the items.

W. Nelson Francis is Professor of Linguistics at Brown University and the author of The Structure of American English (New York, 1958) and of The English Language: An Introduction (New York, 1963). He is also (with Henry Kučera) one of the compilers of the 1,000,000-word "Corpus of Present-Day Edited American English," some description of which is presented in their Computational Analysis of Present-Day American English (Providence, 1967).

Thus the entry (originally a punched card) for handsome in the master list reads as follows:

HANDSOME  EMOSDNAH  2B  .SOS......  .S.D......  .SOS......  214980
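The eight fields enumerated above add up to exactly 79 characters (20 + 20 + 1 + 1 + 10 + 10 + 10 + 7), so a master-list line can be treated as a fixed-width record. The following is a minimal sketch under stated assumptions, not the authors' own program: the function and field names are hypothetical, unused part-of-speech positions are assumed to be printed as dots, and the vowel-string rule simply drops one final E before counting runs of A, E, I, O, U, Y.

```python
import re

# Field widths as listed in the review: 20+20+1+1+10+10+10+7 = 79.
# The field names are hypothetical labels chosen for this sketch.
FIELDS = [
    ("entry", 20), ("reversed", 20), ("vowel_code", 1), ("status", 1),
    ("oxford_pos", 10), ("webster_pos", 10), ("merged_pos", 10),
    ("sequence", 7),
]
POS_LABELS = ["noun", "adjective", "verb", "adverb", "preposition",
              "conjunction", "pronoun", "interjection", "past", "other"]


def parse_master_line(line):
    """Slice one 79-character line into its eight named fields."""
    record, start = {}, 0
    for name, width in FIELDS:
        record[name] = line[start:start + width]
        start += width
    for name in ("entry", "reversed", "sequence"):
        record[name] = record[name].strip()  # drop padding blanks
    return record


def pos_status(field):
    """Map part-of-speech labels to status codes, skipping empty positions."""
    return {label: code for label, code in zip(POS_LABELS, field)
            if code not in ". "}


def vowel_string_count(word):
    """Count runs of A, E, I, O, U, Y, ignoring a final E."""
    letters = word.upper()
    if letters.endswith("E"):
        letters = letters[:-1]
    return len(re.findall(r"[AEIOUY]+", letters))


# The handsome entry, rebuilt as a single 79-character line.
sample_line = ("HANDSOME".ljust(20) + "EMOSDNAH".ljust(20) + "2" + "B"
               + ".SOS......" + ".S.D......" + ".SOS......"
               + "214980".rjust(7))
record = parse_master_line(sample_line)
print(record["entry"], pos_status(record["oxford_pos"]))
```

Run on the reviewer's own examples, the counting rule gives one vowel string for chaos, naive, pickle, and euchre and two for embryo, employee, and handsome. How the original program treated short words such as "be" is not stated, so that edge case is left open here.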
The entry is to be read as indicating that the word has two vowel strings; that it is accorded standard status by both dictionaries; that the Shorter Oxford lists it as standard for adjective and adverb and obsolete as verb; that Webster III lists it as standard for adjective and dialectal for adverb; and that the merged list shows agreement on the adjective status and takes the other two from the Shorter Oxford.

It is obvious that there is a great deal of information included here in readily usable form. In fact, I would suspect that many of those who acquire the book will hanker after a tape of the master list in order to do some sorting of their own, different from that supplied by the authors in their six lists. They might ask, for example, how many words on the list are given standard status by both dictionaries as both noun and verb, or how many nouns called substandard by one dictionary are called standard by the other, and so on. Unfortunately it is not stated anywhere in the five volumes that copies of the tape are available. If they are, they will be of immense value to those who have access to a good computer and money to pay for its use. But those who don't have both of these things will find that on the whole the authors have selected their six lists to afford most easily the maximum amount of information of the kinds that are most likely to be sought.

Volume I includes the "Random Word List," whose purpose is stated as "quite simply that of easing the time-consuming task of obtaining random samples from the dictionary, and, hopefully, of thereby setting aside the need for the time-honored process of testing all hypotheses concerning English words on the A section of the dictionary" (I.xi). This list contains the whole list of words in random order with new sequence numbers, and also the vowel string count, the status code, and the part-of-speech information from the two dictionaries. It lacks the reverse spellings and the merged part-of-speech and status codes.

The remaining five lists are based on lists shorter than the original master list. Volumes II, III, and V present various arrangements of the list of 66,439 "regular" words, those that have entries in only the first four columns (noun, adjective, verb, adverb) of the merged part-of-speech and status list. This list thus excludes all those words that are given even marginal status by either dictionary as preposition, conjunction, pronoun, interjection, past tense form, or "other." In effect it excludes words that appear on any closed class list. Volumes II and III contain the "Forward" and "Reverse" lists respectively. The list is first sorted on the vowel string column, thus bringing together all prefixes in one group, and the same with suffixes, hyphened words, "broken" words (i.e. fixed phrases consisting of two or more graphic words), words containing apostrophes, and words of one, two, three, and up to eight vowel strings. Each group is then alphabetized in direct order for the Forward list and in reverse order for the Reverse list. It is thus possible to find out readily, for example, how many words on the list begin with gh or end with i, though in each case one will have to look in eleven places (excluding prefixes and suffixes) to find them all. Volume V contains the "Reverse Part-of-Speech Word List," also sorted first on the vowel string column, but with each resulting group sorted next on the merged part-of-speech field and finally on the reverse spelling. This list, then, would enable one to determine readily, for example, how many two-syllable adjectives end in -y.

Volume IV contains the remaining two lists, called the "Double-Standard Word List." They present those words (34,307 in all) which are given standard status by both dictionaries, that is, those which have code B in the Status column. It is of considerable interest to note how small the list is: fewer than half the entries in the Shorter Oxford are considered standard by both dictionaries.
It would be of interest to know how large the overlap is, or how many of the original 73,582 entries in the Shorter Oxford are given standard status by that dictionary but not by Webster III and vice versa, but this information would have to be derived from the tape. The two lists in Volume IV present this short "double-standard" list sorted first on the vowel string count and then alphabetically in normal order and in reverse spelling respectively.

The value of these lists to anyone even marginally concerned with the English vocabulary, especially with its graphics, is obvious. They make readily available information that previously could be obtained only by laborious counting of a perhaps unrepresentative sample of the dictionary. Take for example a couple of statements I once made in a paper on the English graphic system (W. N. Francis, "Language, Speech, and Writing," mimeo, Center for Applied Linguistics, 1963). Both had to do with the claim, attributed to Bernard Shaw, that ghoti is an acceptable way of spelling [fiš]. The first was that "in 35 pages of [Webster's New World Dictionary] I could find only 5 instances of final (i), while there were 126 of final (y)." I don't remember how long it took me to find this out, but if I had had The English Word Speculum I could have found out in 15 minutes that of the 66,439 regular words in the Shorter Oxford Dictionary, only 220 end in (i) while 5708 end in (y) (the proportion turns out to be almost exactly the same, about 1 : 26). I could also readily have classified these by length (in terms of vowel strings), by part of speech, and by status (e.g. as standard, archaic, foreign, proper name, etc.). The other statement was an intuitive one which would take hours or days to verify: "Excluding a few place names ending in -burgh, [the digraph (gh)] is always postvocalic in final position." The Speculum fortunately verifies this hunch: the only exceptions are argh and bargh, classed as dialectal, and simurgh, classed as foreign by the Shorter Oxford and given alternate spellings without the -h by Webster III.

One or two shortcomings of the book need to be mentioned. Even in the short time I spent working with it, I felt the lack of a list presenting the whole master list sorted by reverse spelling only, without the preliminary sorting by vowel string count. Other possible lists would also have been useful, but for them one probably should obtain the tape (if indeed it is available) and do his own sorting. In arrangement, the lists are on the whole easy and convenient to use. The items are printed in columns of 100, two to the page, which facilitates counting, though the unwary user may not notice (as indeed I did not until after I made the counts reported above) that the last word on a page is repeated at the head of the first column of the following page, so that there are actually 199 rather than 200 entries per page. Counting of fractional columns would have been greatly simplified if the columns had been ruled or blocked by tens. Alternatively, the rows on each left-hand page could have been numbered, or the user could have been furnished with a numbered ruler. But I suppose that anyone using the lists very much will soon discover that there are 11 rows to the inch, so that he can measure rather than count. Certainly this reviewer expects to use these lists frequently enough for that to be a useful trick to remember.

In sum, students of English lexis and graphics owe a debt to Messrs. Dolby and Resnikoff for furnishing them with a valuable tool, and to the Lockheed Missiles and Space Company, which supported the production of the Speculum under its Independent Research Program.
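For a reader who does obtain the tape, the Forward and Reverse orderings and the ending counts discussed in this review can be sketched in a few lines. This is only an illustration under stated assumptions: the five sample words are made up for the sketch and are not taken from the Speculum, the function names are hypothetical, and the vowel-string codes are compared as plain characters because the review does not give the order in which the lists place the code groups.

```python
def forward_key(entry):
    """Sort on the vowel-string code first, then on normal spelling."""
    code, word = entry
    return (code, word)


def reverse_key(entry):
    """Sort on the vowel-string code first, then on reversed spelling."""
    code, word = entry
    return (code, word[::-1])


def count_ending(words, ending):
    """Count words with a given ending by matching reversed spellings."""
    reversed_ending = ending[::-1]
    return sum(1 for word in words if word[::-1].startswith(reversed_ending))


# Hypothetical sample: (vowel-string code, word).
sample = [("2", "HANDY"), ("1", "SKI"), ("2", "TAXI"),
          ("1", "SIGH"), ("2", "GHOSTLY")]
forward_list = sorted(sample, key=forward_key)
reverse_list = sorted(sample, key=reverse_key)
words = [word for _, word in sample]
print(count_ending(words, "I"), count_ending(words, "Y"))

# The two proportions quoted in the review, final (y) per final (i).
ratio_new_world = 126 / 5     # 35 pages of Webster's New World Dictionary
ratio_speculum = 5708 / 220   # the 66,439 regular words of the Speculum
```

Sorting on the reversed spelling brings all words with a common ending together, which is why a count of final (i) against final (y) takes minutes with the Reverse list. The two ratios come to about 25.2 and 25.9, in line with the reviewer's "about 1 : 26."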
Statistics and Corneille

CHARLES MULLER, Essai de statistique lexicale: "L'Illusion comique" de Pierre Corneille. Paris, Klincksieck, 1964, 204 p.; Etude de statistique lexicale: Le vocabulaire du théâtre de Corneille. Paris, Larousse, 1967, 380 p.

By Pierre R. Ducretet

Pierre R. Ducretet is Assistant Professor of French at the University of Toronto.

THESE TWO BOOKS by Professor Muller, of the Université de Strasbourg, are noteworthy for anyone interested in the application of statistics to the study of style and language in literary works.
The first of these, Essai de statistique lexicale, is the application of quantitative and statistical methods to the study of L'Illusion comique of P. Corneille. It is a methodological essay of considerable scope. The basic contention of the author, in undertaking the study, is that style can be defined as a quantitative difference with respect to a norm. He therefore seeks to establish an internal norm for the text and then applies statistical criteria to determine the deviations of the text from what can be considered as its norm. He is, of course, aware of the arbitrary stand he takes with regard to what constitutes a norm for a text, but he argues convincingly for his views. The study is based on a verbal index of L'Illusion comique of some 16,586 words, at first manually established and later transferred to punched cards and encoded for statistical ends.

The work is divided into three parts. Part I, 118 pages, deals exclusively with L'Illusion comique and ends with conclusions concerning the nature of the text and its vocabulary, and the use of quantitative methods as a basis for stylistic analyses. Part II, 54 pages, is an exhaustive description of the methodology used first in setting up the index and second in the use of statistical methods to determine the value of the quantitative information yielded by the index. Part III, 43 pages, contains the various frequency indexes and statistical tables established for L'Illusion comique, as well as a two-page bibliography of quantitative and statistical works, and a table of contents.

Professor Muller's second book, Etude de statistique lexicale, is an expansion of his Essai. The Etude deals with the totality of the theatrical vocabulary of Corneille, some 530,000 words. It is a systematic quantitative and statistical study of the vocabulary of Corneille's plays.
Though the method followed by the author is basically similar to the one he established for his Essai, he has expanded it considerably, particularly in its statistical aspect. The point of departure for the Etude is the verbal indexes of Corneille's plays established at the Centre d'Etude du Vocabulaire français at the University of Besançon. On the basis of these indexes, Professor Muller applied statistical criteria to determine to what extent the style of Corneille could be described with respect to the internal norm arrived at by the analyses of Corneille's total vocabulary as found in his plays. The major interest of the Etude resides in the discussion of the principles on which this type of study should be based, and in the solutions proposed for specific problems. Muller's analysis of these problems and his rebuttal of arguments against this type of stylistic study are worthy of serious consideration.

The work is divided into five parts. Part I, 34 pages, deals with methodology. Part II, 84 pages, discusses the quantitative lexical structure of Corneille's plays. Part III, 46 pages, examines the vocabulary characteristics of the various plays according to period and genres. Part IV, 93 pages, is an attempt to arrive at a synthesis of the information furnished by the statistical data, noting variations and progressions or regressions from one play to the next. Part V, 120 pages, consists of quantitative tables, indexes, and various quantitative and statistical information. A short bibliography, which to some extent completes that of the Essai, and a table of contents which includes a list of the various statistical tables scattered throughout the text are also included.

In both the Essai and the Etude, the author points out repeatedly that the statistical method as applied to language in general, and to literary works in particular, is only one possible approach among others. All the statistical method can do is to help clarify and guide the study of language and style by pointing out certain quantitative phenomena, thereby contributing to the definition of style and to the analysis of language. The author also points out all the problems involved in the semantic classification of words for establishing an index. In his view, when one departs from strictly morphological criteria, homonymy and polysemy become predominant features of the problem of index making. Another major problem, of course, is to determine what a word is, when one abandons the morphological criterion to enter into that of meaning. What is one to do with set phrases, prepositional phrases, etc.? The reasonable consideration of these and various other problems that statistical methods pose is of considerable interest for anyone planning an index or wishing to use existing quantitative or statistical information.

Another quality of these two works is that they do not attempt to shed new light on the style or other aspects of Corneille's plays when none is to be shed. The author limits himself to a descriptive analysis of the statistical information he has gathered. He examines how far his method can go and what significant information can be gathered about a text by statistical methods. When the method yields nothing significant, this is clearly stated.
Aside from the statistical part of the study, the frequency index of Corneille's theatrical vocabulary is an excellent point of departure for comparative studies of 17th century French theater vocabulary. What is regrettable, however, is that, except for very low frequency words, the frequency index does not furnish the exact location of each occurrence of a given word. Furthermore, there is no regrouping of words according to some pattern (e.g., verbs together, nouns together); the verb forms, for instance, are all under the infinitive heading, regardless of their tense and mood. In short, except for proper nouns, which are listed separately, there is little morphological or syntactical information readily available in the form of special indexes, tables, charts, graphs or dictionaries. The tables and graphs in the works deal with statistical information in general. This means that, because of the need to search for context in each instance, considerable time would have to be spent to investigate any one aspect of Corneille's vocabulary. But then, to supply such detail would have required so much more time and effort, because of the number of words that would have had to be exactly indexed and coded, that it could not be expected. Mass plays an important role in this sort of study, and the Etude is already 379 pages long.
Machine-Assisted Textual Criticism

DOM JACQUES FROGER, La critique des textes et son automatisation. Paris: Dunod, 1968 (Initiation aux nouveautés de la science 7). Pp. xxii-280.

By Vinton A. Dearing

READERS of Computers and the Humanities will be principally interested in Chapter V of Dom Froger's book, for it gives a wide-ranging and suggestive survey of computer applications in textual criticism.¹ I suspect that the reader who can at once see how he would write computer programs to do the work that the author outlines will have already written a good many of the algorithms, if not indeed of the programs themselves. On the other hand, the textual critic who knows nothing of machine computation will find much of interest, and even one far gone in textual criticism by computer will probably find new ideas.

Chapter V begins with a non-technical description of the computer and some of its auxiliary equipment, essentially as manufactured by Bull G.E., and proceeds next to a brief survey of some of the necessary preliminaries to textual criticism proper. Dom Froger believes that the bibliographical problem of locating texts and publications about them will not find an automated solution for many years. He sees optical scanning of printed texts and of clear contemporary handwriting as a nearer possibility, but concludes that automatic reading of the kind of manuscripts that textual critics analyze is far in the future. Similarly distant in his estimation is automatic analysis of the physical characteristics of texts, though the problem here is rather one of collecting, adequately describing, and classifying examples whose dates and provenance are known.

Collation of texts, the next matter to which Dom Froger addresses himself, has been performed by computer for some time. The author discusses in some detail, but not in terms of any particular computer or computer language, his way of doing the work. The program behind the description was written by a Mme.
Renaud for the Gamma Tambour ET Bull in 1960, and handles up to 44 texts or segments of texts of up to 3,500 words each, at a time. The texts are punched in columns 33-80 of cards, column 80 of one card being treated as immediately followed by column 33 of the next. There is an illustration of such a card facing p. 230, showing some numbers punched in columns 1-32 as well, but there is no explanation of these; presumably one is the text number and one the card number. The cards are verified, and then printouts are proofread in the usual way until no errors remain. The machine gives numbers to the words and spaces in the first or base text, space 1 preceding word 2 (the first word), so that all spaces have odd numbers and all words even, and these become the reference numbers for identifying the variations. It then compares the base text with the others in turn. When it finds a variant, it expresses it as an addition or an omission: a substitution is treated as an addition immediately followed by an omission, a transposition as an addition followed after an interval by an omission. The machine prints out the collations of the base text with each of the others, as well as a consolidated collation of all the texts, in a form that seems at first difficult to read but no doubt becomes easy enough with practice.

¹ The preface, by R. Marichal, gives a briefer survey of computer solutions to the problems of textual criticism that differs in some small details from Dom Froger's in Chapter V.

Vinton A. Dearing, Professor of English at the University of California at Los Angeles, is currently editing the works of John Dryden.

Dom Froger's example throughout the book is a speech by Argan, the principal character in Molière's Malade imaginaire, at the beginning of Act I, scene v, of which he has imagined a number of copies made by Molière's wife and her friends, her copy being his text 1 but the farthest removed from the original. The opening of text 1, with the words numbered as by the computer, reads as follows:

2   4      6   8     10    12    14   16       18        20  22
MA  FILLE  JE  VAIS  VOUS  DIRE  UNE  ETRANGE  NOUVELLE  OU  PEUTETRE
The variations are as follows (the numbers are those of the texts):

MA] 1; OR CA MA 2-6.
FILLE] 1, 3, 4, 6; CHERE 2, 5.
VAIS] 1-3, 5, 6; DOIS 4.
DIRE] 1, 3-6; ANNONCER 2.
ETRANGE NOUVELLE] 1, 4, 6; NOUVELLE 2, 3, 5.
OU] 1, 2, 4-6; A LAQUELLE 3.
Thus there are an addition, three substitutions, an omission, and another substitution. Additions are keyed to the spaces between the words; a space to which more than one word must be keyed is divided decimally, as 1.1, 1.2. The consolidated collation for the foregoing reads as follows (N = number, R = addition, O = omission):

N 1.1 R.OR 2 3 4 5 6
N 1.2 R.CA 2 3 4 5 6
MA
N 3 R.CHERE 2 5
N 4 O.FILLE 2 5
JE
N 7 R.DOIS 4
N 8 O.VAIS 4
VOUS
N 11 R.ANNONCER 2
N 12 O.DIRE 2
UNE
N 16 O.ETRANGE 2 3 5
NOUVELLE
N 19.1 R.A 3
N 19.2 R.LAQUELLE 3
N 20 O.OU 3
PEUTETRE
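The numbering scheme and the keying of additions can be sketched briefly. This is an illustrative sketch only, not the 1960 program: the function names are hypothetical, and the layout of a collation line is simply copied from the example above.

```python
def number_base_text(words):
    """Number the base text: space 1 precedes word 2, so spaces are odd, words even."""
    numbered = []
    for index, word in enumerate(words):
        numbered.append((2 * index + 1, None))  # the space before the word
        numbered.append((2 * index + 2, word))  # the word itself
    return numbered


def addition_keys(space_number, added_words):
    """Key additions to a space, dividing it decimally when several words share it."""
    if len(added_words) == 1:
        return [str(space_number)]
    return [f"{space_number}.{n}" for n in range(1, len(added_words) + 1)]


def collation_line(key, kind, word, texts):
    """Format one entry: N = number, R = addition, O = omission, then the texts."""
    return f"N {key} {kind}.{word} " + " ".join(str(t) for t in texts)


base = "MA FILLE JE VAIS VOUS DIRE UNE ETRANGE NOUVELLE OU PEUTETRE".split()
word_numbers = {word: number for number, word in number_base_text(base) if word}
print(word_numbers["PEUTETRE"])
print(collation_line(addition_keys(1, ["OR", "CA"])[0], "R", "OR", [2, 3, 4, 5, 6]))
```

Because every word has an even number and every gap an odd one, an addition can always be located without renumbering the base text.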
The comparison procedure described above is mechanically simple, especially when the texts are prose, but the result est énigmatique en bien des endroits. La présence de petites variations à l'intérieur des grandes est une source de complications. La combinaison des variantes orthographiques, confondues par la machine avec des substitutions, et des omissions ou additions, produit des effets paradoxaux. Une longue omission ou addition peut occasionner de faux raccordements sur des mots usuels (articles, conjonctions, prépositions, etc.), et se trouver indûment fractionnée. [That is, the result is puzzling in many places. The presence of small variations inside large ones is a source of complications. The combination of spelling variants, which the machine confuses with substitutions, with omissions or additions produces paradoxical effects. A long omission or addition may give rise to false matches on common words (articles, conjunctions, prepositions, etc.) and so be unduly broken up.]

The routines for recording and printing the results of the comparison are probably fairly complex. The comparison routine requires only small adjustments to make the machine do more of the work. As set forth in the book, the procedure when the two texts fail to agree is to test the next word in each: first the first word of the first text against the second word of the second text and then, if there is no agreement, the second word of the first text against the first word of the second. Agreement on the first test indicates an addition, on the second an omission. If there is no agreement, the search range is extended. No agreement on the first test would signal the second test; agreement on the second would indicate a transposition. No agreement on the second test would be a signal to compare the second word in each text; agreement in this test would indicate a substitution. The problem of splitting up long additions or omissions because of false matches of common words (a matter of concern only if the textual critic's method requires frequency counts, as Dom Froger's does) can be reduced by testing the next word or the next several words after the matching words.
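The look-ahead tests described above can be sketched as follows. This is a minimal sketch, not the program described in the book: the function names and the max_range parameter are hypothetical, and since the review does not say how the search range is extended, the sketch assumes the window is widened one word at a time until some pair of words agrees. Under that assumption a substitution falls out as an addition immediately followed by an omission.

```python
def find_resync(base, other, i, j, max_range):
    """Find the nearest pair of agreeing words after a mismatch at base[i], other[j]."""
    for total in range(1, max_range + 1):
        for k in range(total + 1):       # k = base words passed over (omissions)
            m = total - k                # m = other words passed over (additions)
            bi, oj = i + k, j + m
            if bi > len(base) or oj > len(other):
                continue
            if bi == len(base) and oj == len(other):
                return k, m              # both texts end together
            if bi < len(base) and oj < len(other) and base[bi] == other[oj]:
                return k, m
    return len(base) - i, len(other) - j  # give up: treat the rest as variant


def collate(base, other, max_range=10):
    """Return (key, kind, word) tuples; kind is 'R' (addition) or 'O' (omission)."""
    variants = []
    i = j = 0
    while i < len(base) or j < len(other):
        if i < len(base) and j < len(other) and base[i] == other[j]:
            i += 1
            j += 1
            continue
        k, m = find_resync(base, other, i, j, max_range)
        space = 2 * i + 1                # odd number of the space before base[i]
        added = other[j:j + m]
        for n, word in enumerate(added, start=1):
            key = f"{space}.{n}" if len(added) > 1 else str(space)
            variants.append((key, "R", word))
        for offset in range(k):
            variants.append((str(2 * (i + offset) + 2), "O", base[i + offset]))
        i += k
        j += m
    return variants


base = "MA FILLE JE VAIS VOUS DIRE UNE ETRANGE NOUVELLE OU PEUTETRE".split()
text2 = "OR CA MA CHERE JE VAIS VOUS ANNONCER UNE NOUVELLE OU PEUTETRE".split()
for key, kind, word in collate(base, text2):
    print(f"N {key} {kind}.{word}")
```

With total equal to 1 the two candidates tried are exactly the two tests of the book, in the same order. Resynchronizing on the first agreeing word is also what makes false matches on common words possible; requiring several following words to agree as well, as the review suggests, would be a small change inside find_resync.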
It is often helpful when coming to write up the critical apparatus in standard form to know the exact spellings of the texts, but if this is not a matter of concern, the identification of spelling variants can be reduced by providing a dictionary of alternate spellings of common words of a certain length, say up to four letters, and testing mismatches against it when they meet the length test. It is not clear whether Dom Froger's program moves all the texts or segments into the machine at once. With a program which does so, the computer can compare all the texts together at once and print out the results in convenient units (perhaps according to the lines in the base text) somewhat as follows:

UNIT  TEXT
1     1            MA  FILLE  JE  VAIS  VOUS  DIRE
1     2     OR CA      CHERE                  ANNONCER
1     3     OR CA
1     4     OR CA                 DOIS
1     5     OR CA      CHERE
1     6     OR CA
2     1     UNE  ETRANGE  NOUVELLE  OU          PEUTETRE
2     4
2     6
2     2          ***
2     5          ***
2     3          ***                A LAQUELLE
Transpositions of entire units would be shown by the unit numbers for each text, omissions of entire units by *** instead of unit numbers for the texts in question. Internal numbering of the words and spaces in each unit of the base text, an idea suggested by Dom Froger's program, should be particularly helpful.

To return to his program, the computer also prints out for each omission or addition a card with "1" punches in columns 33-80 for the texts having the variation, column 33 representing text 1, and so on. Once again, what is punched in columns 1-32 is not specified but is obviously some key to the variation to which the card belongs. These cards are then examined by the textual critic, who eliminates those for non-significant variations and all but one each of those for significant variations. (E.g., for a substitution, he eliminates the card for the "omission" and keeps that for the "addition.") He also creates cards as necessary to represent variations with more than two alternate readings as if they had only two, a peculiar requirement of Dom Froger's method of textual criticism that does not concern us here. The idea of having the computer make a machine-readable record of the variations was new to me, and interesting. Consoles and editing programs are now available so that the computer output could be edited before punching by an auxiliary machine from computer tape, or the punching could be entirely eliminated, the edited tape or disk or other storage medium being used for input to the next program.

A footnote on p. 219 and the presence of Greek letters in what is represented as the printout from the next program suggest to me that Dom Froger had not actually written the program when his book went to press,² but others had written essentially similar programs if he had not.
He describes a program that prepares, from the cards recording the variations, a series of tables from which the textual critic can construct a form of the family tree of the texts, in which the earliest ancestral text has not been located, or has been located only provisionally. The variant groups (2 3 4 5 6; 2 5; 4; 2; 2 3 5; 3 in our example) are sorted in ascending order by the number of texts in each³ and the sequence of the text numbers (e.g., 2; 3; 4; 2 5; 2 3 5; 2 3 4 5 6), and then each group is listed together with the next larger group or groups that have some of the same texts. If any smaller group has more than one larger group listed with it, the evidence suggests that some text has two or more independent ancestors, and the machine halts. If the textual critic cannot decide on a single ancestor for each text, he must adopt some other method of textual criticism. Otherwise, the machine proceeds finally to subtract the smaller groups from the larger. If this leaves a group with no texts, it is given a Greek letter (any number larger than the number of extant texts would do as well), which is listed as the provisional ancestor of each of the smaller groups subtracted from it; otherwise its remaining text or texts are listed as ancestor. Each group (the larger groups in the form they take after subtraction) is given a level-number, a list of the variations in which it occurs, and a count of these variations. The giving of a level-number is new to me and is helpful in plotting the location of the texts in the tree. With the mechanical plotters available today, it would be possible for the computer to draw the tree from the data in the tables.

² His "avant-propos" says that he is describing a program written in 1960 or 1961 by Philippe Poré, but he also says that he and Poré disagree as to how to do the work. The text obviously represents Dom Froger's ideas.

³ The smallest variant group in this sense is a text that stands by itself in a variation, such as 2, 3, and 4 in the example.

Dom Froger leaves the identification of the earliest ancestral text to the textual critic. A computer could, if supplied with information as to which groupings of the texts preserved authorial readings, determine the earliest ancestral text and redraw the tree if necessary. But Dom Froger considers only computer-aided studies that would help in the identification of authorial readings and of non-authorial passages where the texts happen to agree. He suggests that copyists' habits could be reduced to psychological laws if enough first proofs of printed books were examined, and the probabilities of chance agreement between copyists determined if enough sets of reprints having immediate common ancestors were examined.
He does not allow for the influence of the mechanics of various kinds of reproduction on the kinds of errors that occur in transmission. He also suggests that an author's characteristics of writing and the probabilities of their occurrence might be determined with the help of computer-produced dictionaries, concordances, and lists of stylistic traits. He gives no directions for obtaining these helps, and suggests in fact that they are hard to obtain instead of relatively commonplace. Those who are familiar with them may still find Dom Froger's comments interesting or useful: a dictionary program must include some way of coding the input to distinguish homographs and give other information that is more easily supplied by hand than written into the program (Dom Froger looks forward to having a scanner read the text, a computer print it out in columns, the critic annotate the text, and the scanner read the printout with its manuscript annotations); an author may have a large range of stylistic traits, and these may be equally characteristic of his cultural milieu; questions that can be answered easily without a computer should not be answered with one. Chapter V concludes as follows:
Ajoutons, pour finir, que le philologue qui se dispose à faire un travail à la machine électronique devrait s'assurer si le même travail n'est pas déjà fait ni en train de se faire; il aurait tout avantage aussi à s'informer des méthodes utilisées par ses collègues pour traiter des problèmes analogues au sien: il pourrait alors les employer lui aussi sans avoir à réinventer ce qui a été déjà trouvé, ou bien les porter à un point de plus grande perfection. [Let us add, in closing, that the philologist who is preparing to do a piece of work by electronic machine should make sure that the same work has not already been done and is not in progress; he would also do well to inform himself of the methods used by his colleagues to treat problems like his own: he could then use them himself without having to reinvent what has already been found, or else bring them to a higher point of perfection.]

This prompts me to remark that Dom Froger does not help the reader as much as he might in these ways. Though he does mention Computers and the Humanities in his bibliography (it is also mentioned in the first sentence of the preface, by R. Marichal), that is almost the only reference in English. One would like to know, also, why Dom Froger says that E. C. Berkeley's Giant Brains in its French translation and adaptation has "intéressants renseignements ... sur la technique de calcul de la machine" [interesting information on the machine's technique of calculation] "sans doute dus à" [doubtless due to] the French translator and adapter (italics mine). On the other hand, American readers may, like me, be no great researchers in European publications, and may find titles, organizations, and persons mentioned in the bibliography and the text that will widen their horizons.
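The grouping and subtraction steps that this review describes for the tree-building program can be sketched as follows. This is a sketch under stated assumptions rather than a reconstruction of Dom Froger's program: the function name is hypothetical, "next larger group" is taken to mean the smallest strictly larger group that shares at least one text, and the level-numbers and variation counts are left out.

```python
GREEK = "αβγδεζηθ"  # labels for provisional (lost) ancestors


def build_tables(variant_groups):
    """Sort the groups, link each to its next larger group, then subtract."""
    groups = sorted({frozenset(g) for g in variant_groups},
                    key=lambda g: (len(g), sorted(g)))
    parent = {}
    for group in groups:
        larger = [h for h in groups if len(h) > len(group) and group & h]
        if not larger:
            continue
        nearest = min(len(h) for h in larger)
        candidates = [h for h in larger if len(h) == nearest]
        if len(candidates) > 1:
            # The machine halts: some text seems to have several ancestors.
            raise ValueError(f"ambiguous ancestry for group {sorted(group)}")
        parent[group] = candidates[0]
    remainder = {}
    letters = iter(GREEK)
    for group in groups:
        children = [g for g in groups if parent.get(g) == group]
        if not children:
            remainder[group] = sorted(group)   # nothing to subtract
            continue
        left = set(group).difference(*children)
        remainder[group] = sorted(left) if left else [next(letters)]
    return groups, parent, remainder


example = [{2, 3, 4, 5, 6}, {2, 5}, {4}, {2}, {2, 3, 5}, {3}]
groups, parent, remainder = build_tables(example)
for group in groups:
    print(sorted(group), "->", remainder[group])
```

On the example groups the subtraction leaves text 5 from the group 2 5, nothing from 2 3 5 (hence a Greek letter), and text 6 from 2 3 4 5 6, so that 5 stands above 2, the provisional ancestor above 3 and 5, and 6 above 4 and the provisional ancestor.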
Oettinger Discovers America
ANTHONY G. OETTINGER with SEMA MARKS, Run, Computer, Run: The Mythology of Educational Innovation, Cambridge, Mass., Harvard University Press, 1969, xx + 303 pp., $5.95.
By Thomas Chinlund
THIS BOOK HAS BEEN SUBJECTED to a number of preliminary panels,
discussion groups and critical reviews. In these, representatives of the educational establishment and principal originators of computer-assisted instructional (CAI) systems have made their comments, and been answered by the authors. It is evident from the most extensive published version of these¹ that the central theme of the work has not been clear to some of the participants. Since the theme is in fact very clear, the failure of communication is undoubtedly due to the inevitably accusatory import of much of the exposition. Both the educational establishment as it is organized and CAI systems as they are used are shown as seriously defective, not to say perverse, in view of overtly accepted goals of both. The emotional reaction which could result when someone prominent in either effort reads this book may explain its failure to communicate in these cases.
¹ Discussion, Harvard Educational Review, 38:4 (Fall 1968), 718-55.
Thomas Chinlund is on the staff of the Columbia University Computer Center.
The last of the pre-publication discussions (at the meeting of the Association for Computational Linguistics, Boston, May 13th, 1969) was the occasion of my first encounter with the book. At this panel, Miss Marks summarized the results: (1) CAI is used in the school system primarily to imitate old educational methods. Teachers value discipline, quiet, and order over energy, noise, and originality. This is required by the educational system. CAI, as it is (mis)applied, serves to reinforce these values. (2) There is a threshold of frustration caused by downtime and obscurity above which users can no longer profit from a system. Most CAI systems in ordinary use are well above this threshold. Point (1) is directed at the inability of the school system to use the new systems to achieve the ends for which they were originally intended; point (2) is directed to the inability of manufacturers to develop production systems that can be appropriately used in the schools. Since the book is largely devoted to depicting the state of affairs indicated in this brief statement, it appears at first glance primarily intended to expose these two very defective systems. As such, the argument is open to the obvious questions: Are things really so bad? (Perhaps not.) Are there not exceptions? (Of course.)
Given the incisive title and Miss Marks's succinct summary I felt I knew all I wanted to know about CAI in the schools. But when I read the book, later, I was agreeably surprised.
Run, Computer, Run is an essay in a genre Oettinger developed some time ago which might be called anti-journalism. Anti-journalism is a form of essay which traces the degradation in the form, content, and application of ideas as they perfuse the educational, military, or industrial infrastructure from their sources in universities and research laboratories. (Readers who enjoy this genre, as I do, will be entertained also by an earlier example.²) The evidence is presented in the form of excerpts from committee reports, advertisements, newspaper articles, manuals of instruction, proposals for grants, government documents, and pontifications of the prominent.
These are contrasted with relevant material from original research papers, working laboratory systems, and carefully qualified, precise statements by innovators and competent development engineers. Items from the two collections are displayed and commented on with effective irony. The whole is embellished (in this essay) with quotations from Donald Barthelme, T. S. Eliot, Tom Lehrer, and the Andrews and Company Illustrated Catalogue of School Merchandise (Chicago, 1881). Mark Twain, Joseph Heller, and Sinclair Lewis also appear. The "Thickness and feel of the magnificent paper" for the title page of a content-free "seminar" announcement are evoked.
² A. G. Oettinger, "An Essay in Information Retrieval, or the Birth of a Myth," Information and Control, 8:1 (February 1965), 64-79.
The central theme of the book is, in an entertainingly vivid phrase, the "artificial dissemination" of some good ideas and systems for individual, graphic, interactive learning into the educational establishment. Read through for its theme, the essay is incisive, ironic, witty and good, dirty fun. An evening's divertissement.
The essay is much more than this, however. Though sugar-coated with wit, the theme is intended in full earnest. The situation differs by several orders of magnitude from other kinds of journalistic degradation, with which scientists are more familiar. The magnitude of the folly, waste, and deception is increased by the interaction of three institutional systems, any one of which can be (and is) the cause of run-of-the-mill instances of distortion. The first of these institutions is the funding apparatus for the support of technological research, development and application. The second is the educational establishment and the schools. The third is the corporate arrangements by which computer-based systems are produced, sold, manned, and maintained. The effect of each of these systems on the misunderstanding and misapplication of CAI is described and documented. And one major aim of the essay is to demonstrate why, in this context, the ordinarily difficult transition from laboratory prototype to working, deployed system is here impossible without radical change in the receiving environment.
To support this conclusion, and to contend with the wide audience for which the essay is intended, the authors devote much, indeed most, of their efforts to explaining aspects of the two worlds (computer-based systems and the schools) to those readers who are outside one or the other, or both.
The state of the schools is shown by example: some in near hopeless disarray and irrelevant to their constituencies ("Small City"); some in an all-too-normal condition of mediocrity and rigidity (a language "Laboratory" whose effect is to intensify the "lock-step" rather than to foster the individuality and experimentation for which it was intended); and some with an air of openness to new ideas and individual differences (two unusual high schools and an extraordinarily well-conducted college biology course). The educational establishment is shown in its critical distorting role in the process of premature dissemination.
At a "Hawaiian Rain Dance," sponsored by a well-known foundation, the mystical trinity of Involvement, Identity, and Innovation is invoked to bless the success of the communicants as they go forth to sell systems that have not been explained and that cannot be used in the schools as they are. This picture of the school system should surprise no one, and yet it reads like the discovery of another country, whose magnitude and extent many of us, apparently including the authors, did not previously apprehend. The picture is undoubtedly unfair. But if the instances cited are at all representative, it must be taken seriously. (Bel Kaufman's Up the Down Staircase is cited frequently.) In the other direction, the nature of computer-based systems is explained to educators and others lacking hands-on experience in both implementation and use. This explanation is done, first of all, by showing how the "systems concept" (an o.k. phrase in the educational establishment) really applies in the existing technology: blackboards are flexible, but adapted to serial communication (from teacher to class). Paper is parallel, suited to communication from many students to one teacher.
Reliability, for a teacher, "includes the ability to withstand insult or injury; for the film projector, resistance to . . . jarrings, joggings and occasional droppings." The tendency of chalk to break and ball point pens to skip illustrates unreliability in "the most modest of devices." The school library is as much an information retrieval system as any made of more recondite components. The observed attitude of school librarians under typical environmental pressure illustrates the way the schools (must) handle their systems. Individual reading, special group visits are discouraged. Quiet, order, and conformity are enforced. (Note the reference to environmental pressure: the authors do not intend to pillory individuals, even in large numbers. The structure of the system accounts for the behavior of its elements in these matters.) The failure of the schools to achieve Individually Prescribed Instruction (another o.k. mystical invocation) with the admirably flexible technology of books, blackboards, and paper is shown to result from attitudes (or system configurations) which produce similar failures with more intricate technologies. Thus, a printout from a conversational computing system illustrates the encoding of a rigid, multiple-choice sequence of programmed text, with pathetically fatuous comments and encouragements sprinkled along the various programmed branches. By contrast, systems in (unusual) high schools, colleges, and laboratories have been successfully used to accommodate a moderate variety in students' rates and purposes. Computer-based systems for displaying elementary mathematical curves and surfaces, and for exploring molecular structures are presented to illustrate some of the real promise of computer-assisted study (at widely varying levels of sophistication).
On the other hand, the embarrassing failure of some very good research groups to estimate the real production parameters and limitations of time-shared systems is evoked to illustrate a difficulty in actually employing computers with which many readers of this journal are familiar. This failure, which is inevitable in some degree wherever new systems are to be installed in unfamiliar circumstances, requires considerable sophistication on the part of those responsible for operating the system. The unusual limitations of the educational establishment, well documented in the book, exaggerate the effects of this failure. Our present utter inability to design automata which can cope with language as it is really produced by people is cited in a prefatory reprint in the book as one of a number of important essential limits on the possible features of computer systems for education. The authors' concluding recommendations seem weak and platitudinous at first sight: better people, more money, better ideas, competition (possibly achieved in part by allowing parents to "vote" for schools and teachers with tuition credits of some sort). But in making these very general recommendations the authors wish to emphasize that we know
very little of use about the educational process, that experimentation is therefore inevitable and had better be undertaken with full awareness, and that the amount of money and the quality of new people and ideas needed have been grossly underestimated. This book has the defects of its virtues. The complexity of the situation described, involving the interaction of the three institutional systems noted above, may vitiate the main point: that honesty in evaluating the capabilities of our systems is essential, and that premature dissemination of good ideas can lead in this case to a manic-depressive cycle that could even be of some danger. Another aspect which may tend to obscure the impact of the book is that it touches on some of the deepest and most desperate contradictions of our society: the principle of democracy and the fact of power, the dream of equality and the fact of diversity, the dignity of every person and the need for excellence, individual variety and the pressure of numbers. The realities of the educational establishment as described in this book seem to reflect our present state of compromise in coping with these contradictions. The brief conclusions of the authors suggest the improbable collective wisdom needed to do better. Norbert Wiener, in his last book (which deserves continuing scrutiny in spite of some errors of diagnosis and prescription) made the same point in a still more general context (this passage has just appeared for the second time in slightly over a year on the cover of Computing Reviews):
No, the future offers very little hope for those who expect that our new mechanical slaves will offer us a world in which we may rest from thinking. Help us they may, but at the cost of supreme demands upon our honesty and our intelligence.
The world of the future will be an ever more demanding struggle against the limitations of our intelligence, not a comfortable hammock in which we can lie down to be waited upon by our robot slaves (God and Golem, Inc., p. 69).