Computers and the Humanities 19 (1985) ©Paradigm Press, Inc.
An Investigation of Morton's Method to Distinguish Elizabethan Playwrights
M. W. A. Smith
Wilfrid Smith, a BSc with first class honours in electrical and electronic engineering from the Queen's University of Belfast and a PhD in Control Theory, has been a lecturer in mathematics and computing and is presently a senior lecturer in computing at the University of Ulster.

On Sunday, 14 March 1976, The Observer, one of London's quality newspapers, published since 1791, printed an article by Nigel Hawkes announcing that "a new technique ... provides for the first time a simple and apparently foolproof way of distinguishing the styles of different writers ... The new technique ... produces unambiguous results." The method was that developed by the Rev. Andrew Q. Morton (1978), originally to distinguish the authors of Greek texts but subsequently adapted for use when the language is English. The cheerful claim of virtual infallibility would certainly astound workers in the more straightforward mathematical sciences and their applications, who have learned, often through painful experience, the vicissitudes of both natural phenomena and dynamic artefacts.

A further shock was in store for readers of The Observer. On Sunday, 6 July 1980, its front page contained another article by Nigel Hawkes under the headline "Computer finds 'new' play by Shakespeare." Thomas Merriam, impressed by the original article, had since been applying Morton's method to plays which some scholars believe may not be entirely Shakespeare's work. He had also tested Sir Thomas More. The source of this play is a manuscript in which much of the handwriting has been identified as that of various Elizabethan playwrights. However, a few short passages are believed by some to be Shakespeare's work. If so, they would be the only examples extant of any part of any poem or play written in his own hand. Nevertheless, against overwhelming scholarly opinion, the
results of Merriam's investigation (1982) had demonstrated, apparently, that at least 90% of the play was by Shakespeare. Specialists in Elizabethan literature, by and large, maintained the silence of the unimpressed. Either their scholarship was of little value or Merriam and machine were wrong. Anyway, opinion alone does not arm combatants.

The authorship of the play Pericles is also problematical. The almost unanimous view of literary scholars is that only Acts III, IV and V are Shakespearean. In contrast, a study by Morton (1978) and another by Merriam (1982), using Morton's approach, appear to show that Pericles in its entirety is by Shakespeare. Although Pericles was excluded from the first Folio of Shakespeare's works, another play, Henry VIII, whose authorship has also been questioned, was included. In fact, for more than a century controversy has persisted over whether John Fletcher contributed to Henry VIII or not. Merriam (1979, 1980), again using Morton's method, tested its authorship and apparently confirmed that more than one playwright had been involved.

With particular reference to these recent studies, the purpose of this paper is to investigate the reliability of Morton's technique when it is applied to the particular case of distinguishing authors of Elizabethan drama.

Morton's method for identifying authors of works written in English consists of tests of position, collocation and pairs of words. Morton calls these word-pairs "proportional pairs" or "proportionate pairs." Counts of such features from a work of uncertain authorship are compared with similar counts from the suspected author's known writings.

A test of position consists of comparing the number of occurrences of a word both in a prescribed position and elsewhere in the doubtful work with the corresponding values obtained from samples of authentic writings by the suspected author. The prescribed positions are usually the first or last words of sentences (fws or lws). Typical of such tests are those numbered 1 through 13 in Table 2. Tests involving collocations are performed similarly, except that the occurrences of a prescribed word both when followed by (fb) or preceded by (pb) a second prescribed word, and when otherwise used, are compared in the two samples of text. Tests numbered 12 through 31 in Table 1 are examples typical of collocations. Tests of word-pairs consist of comparing the use of one prescribed word relative to another in each of the samples of text. Tests 1-11 of Table 1 consist of typical word-pairs.

The statistical measure of the homogeneity, or otherwise, of the texts under comparison is chi-square. While Morton (1978) does not do so, Merriam adds the values of chi-square from all the tests. He then accepts a total significant beyond the 5% level as evidence for more than one author. In such cases Merriam uses the probability corresponding to the total value of chi-square as the chance that the same author wrote both samples of text under comparison.

Morton's method is established in Chapters 10 and 11 of his book Literary Detection (1978). However, when Smith (1984a) scrutinized these two chapters he considered that sufficient evidence had not been presented to justify the claims advanced for the technique. Indeed, when that evidence was examined closely the method, as Morton himself applies it, appeared to lack powers of discrimination.

The immediate aim of this paper is to examine critically Merriam's experimentation on the authorship of Sir Thomas More and Henry VIII with the objective of determining whether or not his conclusions ought to be taken seriously.
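As a minimal sketch of the arithmetic behind each test, and not a reconstruction of Morton's or Merriam's own programs, the following Python fragment computes chi-square for a 2 × 2 contingency table and adds the values over a set of tests in the manner attributed to Merriam above. The function names and the counts in the example are hypothetical; the optional continuity correction is the standard one due to Yates.

```python
def chi_square_2x2(a, b, c, d, yates=False):
    """Chi-square for the 2 x 2 table [[a, b], [c, d]].

    Row 1: doubtful text  (feature count, alternative count)
    Row 2: control sample (feature count, alternative count)
    """
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("a row or column total is zero, so the test is not valid")
    diff = abs(a * d - b * c)
    if yates:
        # Yates' continuity correction, never allowed to go below zero.
        diff = max(0.0, diff - n / 2)
    return n * diff ** 2 / denom


def summed_chi_square(tables):
    """Add uncorrected chi-square values over several tests.

    Each 2 x 2 table contributes one degree of freedom to the total.
    """
    total = sum(chi_square_2x2(*t) for t in tables)
    return total, len(tables)


# Hypothetical counts, one tuple per test:
# (doubtful feature, doubtful other, control feature, control other)
tables = [(10, 20, 20, 10), (9, 51, 30, 170)]
total, df = summed_chi_square(tables)
print(round(total, 3), df)
```

For a word-pair test the two columns would hold the counts of the two words; for a collocation or position test they would hold the count of the word in the prescribed context and its count elsewhere.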
A wider and more important reason for undertaking the present study, however, is to investigate the accuracy of Morton's method itself, as a technique for resolving the many problems of authorship which pervade Elizabethan and Jacobean drama. The first section of this paper contains a comparison of Sir Thomas More with figures which have been compiled by Merriam to represent Shakespeare's habits. While a superficial assessment of the outcome would appear to support his accreditation of most of the play to Shakespeare, such an ascription is shown to become less tenable when
examined more closely. Next, a preliminary investigation of Morton's method is performed by comparing a number of plays with the same Shakespeare Control. The main outcome is that the method for distinguishing authors is unreliable.

Morton's method, as applied by Merriam to test Shakespearean works, consists of one of two groups of tests. For his investigation of Henry VIII Merriam used 20 tests, 13 of which rely on punctuation. By dividing Henry VIII into two parts in a number of different ways, Merriam hoped to determine if some scholars were correct in attributing part of the play to Fletcher and, if so, to refine their judgment. However, when The Winter's Tale is subjected to the same treatment it is shown to behave similarly, thus undermining Merriam's conclusions.

Whatever the merits of using tests dependent upon punctuation for comparisons of one portion of Henry VIII with another, such tests are inappropriate for comparisons between plays written in an era when authors' punctuation was not preserved. In the section which follows, a number of plays are compared with various combinations of parts of The Winter's Tale. The pattern of testing is similar but, instead, Merriam's second group, which consists of 31 tests of collocations and word-pairs only, is adopted. Thus the possibility is eliminated that any degradation of Morton's method is attributable to differences in the style of pointing introduced by contemporaneous scribes and compositors, and also by subsequent editors. Nevertheless, the outcome is that no individual test clearly differentiated between the authors. However, the results of comparisons over all 31 tests do perhaps indicate that a method could be developed from Morton's approach to provide a guide as to which of the possible playwrights should be investigated first in any in-depth studies subsequently undertaken to resolve questions of authorship of plays of this period.

The Ascription of Sir Thomas More to Shakespeare

To test the authorship of Sir Thomas More Merriam counted the occurrences of the features listed in Table 1 using the text contained in The Shakespeare Apocrypha edited by C. Tucker Brooke (Oxford, 1908). Comparing these counts with those of his Shakespeare Control (also given in Table 1) by means of 2 × 2 contingency tables, he obtained a value of chi-square of 36.732 for 32 degrees of
freedom (df). As this value is not significant at the 5% level, Merriam ascribed the play to Shakespeare.

Merriam anticipated that his conclusions could be rejected on grounds that Morton's stylometry might not be sufficiently sensitive to distinguish works of different authors. To refute this suggestion he also compared John a Kent and John a Cumber by Anthony Munday with the Shakespeare Control. The outcome was a value for chi-square of 91.276. As this is significant well beyond the 0.1% level, thereby apparently demonstrating that Shakespeare did not write that play, it would appear to confirm that the method does function correctly. But this test is of much more importance to Merriam's case: since the original version of Sir Thomas More is written in Munday's hand, as is the manuscript of John a Kent, such an outcome would seem to confirm that Munday had only copied, not composed, the first version of Sir Thomas More.

As a second demonstration that the method can distinguish authors, Merriam compared the anonymous Edward III once more with the Shakespeare Control. In this case the value of chi-square was computed as 77.291 which again is highly significant and would appear to indicate that, if Shakespeare had a hand in Edward III, his part was not a major contribution.

For a final demonstration of the sensitivity of Morton's stylometry, and simultaneously to show that Carol Chillington (1980) was incorrect in her deduction that Sir Thomas More was written by Munday, Dekker, Chettle, Heywood and Webster, Merriam recorded the occurrences of the 32 features in plays or portions of plays by each of these five playwrights. Then, for each author, he calculated the occurrence of every prescribed feature as a fraction of the total usage in each of the 32 tests. As an example, for the word-pairs test of the indefinite article in which A and AN occur 40 and 8 times, respectively, the ratio would be 8/48 = 0.167.
Similarly, for the test of, for instance, the collocation IN fb THE, where IN occurs 60 times of which 9 are followed by THE, the ratio would be 9/60 = 0.150. His next step was to obtain for each test an average fraction. The average values
were used to determine hypothetical occurrences of the prescribed features given the total number of actual occurrences in Sir Thomas More for all the tests. Finally, these figures, which correspond to an artificially produced play, were compared with the Shakespeare Control. The result was a value of chi-square of 55.492 which is significant at about the 0.5% level. As a consequence Merriam implied that his evidence eliminated the five authors, designated by Carol Chillington, as the main writers of Sir Thomas More, in favour of William Shakespeare.

Even assuming that Morton's method does function reliably, it is difficult to understand how Merriam's approach of equally weighting the characteristics of the five authors could produce a result which is convincing evidence for or against any standpoint. To be valid, the composite fraction for each of the 32 tests would have to be formed by weighting the five individual fractions, respectively, in the ratio of the total number of words postulated as having been written by each playwright. Such a procedure doubtless would produce a very different set of figures to compare with the Shakespeare Control.

Merriam then considered the opposite reason which might be used to reject his conclusion that Sir Thomas More should be accepted into the Shakespeare canon: that Morton's stylometry cannot demonstrate identity of authorship when texts are shown to be by the same writer. Although only Acts III, IV and V of Pericles are generally attributed to Shakespeare, Morton (1978), using his method, published data which showed that the entire play was by one playwright. Using his own counts of the features Merriam (1982) compared Pericles with his Shakespeare Control and obtained a value for chi-square of 37.403. Since this figure falls short of significance at the 5% level, it is taken to show that Morton's method is capable of demonstrating identity of authorship by confirming that all of Pericles is by Shakespeare.
(Since Merriam uses Morton's method, this argument is itself defective.) Satisfied that these four examples have demonstrated the reliability of Morton's method, Merriam feels that the corresponding value of chi-square of 36.732 establishes Shakespeare's authorship of Sir Thomas More.
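The averaging step in Merriam's composite comparison, and the word-weighted alternative proposed above, can be illustrated with a short Python sketch. Only the two worked ratios (8/48 and 9/60) come from the text; the per-author fractions and word totals below are invented for illustration, and the function names are hypothetical.

```python
def fraction(feature_count, total_count):
    """Occurrences of the prescribed feature as a fraction of total usage."""
    return feature_count / total_count


def equal_weight_average(fractions):
    """Composite fraction with every author weighted equally."""
    return sum(fractions) / len(fractions)


def word_weighted_average(fractions, words):
    """Composite fraction weighted by the words postulated for each author."""
    return sum(f * w for f, w in zip(fractions, words)) / sum(words)


def hypothetical_counts(avg_fraction, total_in_play):
    """Split the play's actual total for a test into feature / non-feature counts."""
    feature = avg_fraction * total_in_play
    return feature, total_in_play - feature


# The two worked examples from the text.
print(round(fraction(8, 48), 3))   # AN as a fraction of A + AN
print(round(fraction(9, 60), 3))   # IN fb THE as a fraction of all IN

# Invented figures for five authors on a single test (assumption, not data).
author_fractions = [0.10, 0.15, 0.20, 0.25, 0.30]
author_words = [4000, 1000, 1000, 1000, 1000]

equal = equal_weight_average(author_fractions)
weighted = word_weighted_average(author_fractions, author_words)
print(equal, weighted)
print(hypothetical_counts(equal, 48), hypothetical_counts(weighted, 48))
```

With these invented figures the two composites differ noticeably because one author is assumed to supply half of the words, which is the situation in which equal weighting would be least defensible.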
When Smith conducted a study of Pericles (1982b, 1983a) he showed that a statistical difference between Acts I and II and the rest of the play does exist. Moreover, when he examined (1984c) Morton's and Merriam's work on Pericles he discovered very many errors. After as many of these mistakes had been corrected as the availability of data permitted, Smith found that the results when calculated again were consistent, on the basis of Morton's and Merriam's own criterion of the 5% level of significance, with traditional literary opinion that Pericles was written by more than one playwright. (Details are given in Table 1.) As many of the errors in Merriam's paper also affect his study of Sir Thomas More, it was considered that a new investigation of the latter would be warranted.

In the description of his method Morton advises caution before making a comparison involving different literary forms. The figures for Merriam's Shakespeare Control for Tests 1 through 11 in Table 1 include counts from the non-dramatic works. However, since the effect of these counts is small and very much less than that due to the exclusion of elisions (discussed later), the contribution of the poems and sonnets has not been subtracted from the total for any of the comparisons described in this paper.

Independent counts of the features required for Merriam's tests were taken for Sir Thomas More. These are the values given under STM(1) in Table 1. To maintain consistency with the rules governing the acquisition of data for the Shakespeare Control, elisions are omitted. As one of Merriam's 32 tests is not independent of the others it is eliminated. When the new figures were tested against the Shakespeare Control, the outcome was a value of chi-square of 37.38 for 28 degrees of freedom. As this figure for chi-square falls short of significance at the 5% level, it is in broad agreement with Merriam's results.
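The significance judgments in this section can be checked with a short Python sketch. It uses the closed-form upper-tail probability of the chi-square distribution, which holds only for an even number of degrees of freedom (sufficient for the 28 and 32 df quoted above). The function names are hypothetical, and the bisection search for the critical value is a convenience of this sketch rather than anything taken from the sources.

```python
import math


def chi2_upper_tail_even_df(x, df):
    """P(chi-square >= x) for an even number of degrees of freedom.

    Closed form: exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!
    """
    if df <= 0 or df % 2:
        raise ValueError("this closed form needs a positive, even df")
    half = x / 2.0
    term = 1.0
    total = 1.0
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total


def critical_value(df, alpha=0.05, lo=0.0, hi=500.0):
    """Chi-square value whose upper-tail probability equals alpha (bisection)."""
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if chi2_upper_tail_even_df(mid, df) > alpha:
            lo = mid   # tail still too large, so move right
        else:
            hi = mid
    return (lo + hi) / 2.0


# Totals quoted in the text for Sir Thomas More against the Shakespeare Control.
p_merriam = chi2_upper_tail_even_df(36.732, 32)
p_recount = chi2_upper_tail_even_df(37.38, 28)
print(p_merriam > 0.05, p_recount > 0.05)
print(round(critical_value(28), 2))
```

Both totals give a probability above 0.05, in line with the statement that neither is significant at the 5% level.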
To terminate the investigation at this point, however, would be to provide only the most superficial of studies. Some of the handwriting of the additions in the manuscript has been identified. As these parts are most likely to be other than the authors' own work they have been removed from STM(1) to give STM(2) in Table 1. In particular, Heywood's lines (Hand B) have been deleted from Act II, Dekker's (Hand E) and Heywood's from Act III and Chettle's (Hand A) from Act IV. (See
Notes on Table 1.) If Sir Thomas More is predominantly by Shakespeare then this reduced version should appear as purer Shakespeare. In contrast, when STM(2) is tested against the Shakespeare Control, the value obtained for chi-square is increased to 41.98 for 28 degrees of freedom. As the figure for the 5% level of significance is 41.34, a rigid application of Morton's and Merriam's criterion for different authorship would indicate that Shakespeare was not responsible for most of Sir Thomas More. So, as in the case of Morton's and Merriam's investigation of Pericles, when the evidence is examined more closely, it can be seen to be insubstantial. Consequently, the grounds for advocating a revolutionary overthrow of a great weight of scholarship are inadequate.

The Reliability of Morton's Method: a Preliminary Investigation

Aspects of Morton's method itself and some results obtained from its application have recently been questioned. First, the conclusions reached by Smith (1982b, 1983a) concerning the authorship of Pericles are consistent with literary evidence but are the opposite of those advanced by Morton and Merriam. Second, both Morton's demonstration (1978) that Marlowe or Bacon could not have written Shakespeare's works and his and Merriam's treatment of Pericles were shown to be seriously deficient (Smith, 1984b,c). Third, it has been pointed out that Morton's presentation of his method (1978) relies on sparse evidence, which is also incomplete (Smith, 1984a). Finally, further severe doubts have been cast on Morton's method by the examination above of Merriam's claim that Shakespeare wrote Sir Thomas More. Consequently, most of this paper is devoted to a scrutiny of Morton's method in the context of Elizabethan drama.
Merriam reported (1982) that during preliminary testing of Sir Thomas More he had noted similarities with Shakespeare's The Winter's Tale, a play he had chosen as a basis for comparison during his earlier examination (1979, 1980) of Henry VIII. It is therefore included in the present investigation. As can be seen from Table 1, when The Winter's Tale is compared with the Shakespeare Control a value of chi-square of 62.17 for 31 degrees of freedom ensues. This value is approximately the 0.1% level of significance. If the rule which Merriam adopts were now applied it would correspond to a
chance of only one in a thousand that Shakespeare wrote the play. When four further plays (not by Shakespeare) were tested against the Shakespeare Control, even larger values of chi-square resulted, namely, 70.35 for 31 df, 136.40 for 27 df, 137.05 for 28 df and 169.89 for 30 df. The premise upon which the chi-square test is used is that every author exhibits his own multinomial distribution of the features compared and that the figures compiled for each test are random samples from such distributions. Morton's criterion is that a chance of 1 in less than 20 (i.e., not significant at the 5% level) that the sample is from a particular author's hypothetical distribution is assumed to cover his normal variation while a lesser probability would be taken to indicate different authorship. This tentative working hypothesis has never been accepted by the present writer. Consequently, in the context of Morton's method the significance levels under the null hypothesis, would not be taken to reflect the probabilities of making a Type I error. Moreover the probability of a Type II error is usually larger in practice than that of a Type I error. Thus, if Smith's view were accepted, then the total values of chi-square in Table 1, only by mere comparison of their magnitudes, could be taken to support Shakespeare's authorship of all or most of both Pericles and Sir Thomas More. Ironically, therefore, the approach of the writer of this paper would leave the question open, while a strict interpretation of Morton's and Merriam's procedures actually would rule out their own rather revolutionary standpoint. In any case, a very much more thorough and searching investigation than Morton's method stipulates would be essential, in order that any conclusions reached would be sufficiently reliable to overturn decades of traditional scholarship. A preliminary step for any serious investigation is the elimination of as many sources of ambiguity as possible. 
One such problem is the use of elisions. In Table 1 each of the counts (except those for Sir Thomas More) is followed in parenthesis by the figure (if different) which is obtained when elisions are expanded. It can be seen, for example from Tests 9, 21, 22 and 27 in Table 1, that they can contribute substantially to the totals. As there appears to be no general justification for assuming that an elision is other than a bona fide occurrence of a feature
(e.g., 'TIS THE includes the collocation IS fb THE) their omission from Morton's and Merriam's counts would appear unjustified. However, a difficulty that occurs when counts are made by computer, as by the program developed by Smith (1983b), is the importance assumed by the form of an elision in the machine-readable text. For instance, even in critical editions, an elision printed in different ways on the same page may be found, such as on't and on 't. In a machine text, the latter only would normally contribute to the total number of occurrences of ON, and in a count of IT, 'T would not be recognized in either form.

It was therefore decided that all subsequent testing would be based on texts of the plays in which elisions were expanded. A few ad hoc rules were nevertheless necessary. For instance, o'clock and expletives such as 'Sblood were left unaltered; also, when an occasional apostrophe indicated that a complete word had been omitted but was understood, that word was not inserted. In order not to miss an elision which possibly distinguishes an author, the writer of this paper prepares two versions of the text, one with elisions retained, the other with them expanded. Nevertheless, it must always be remembered that the occurrence of a contracted form, e.g., 'em, may not always reflect an author's preference. (Technical details, relating to Table 1 and its use, are brought together in Notes on Table 1.)

In the next section Merriam's approach for investigating the authorship of Henry VIII forms the basis for performing similar but more comprehensive tests on The Winter's Tale. Comparing the results with Merriam's work on Henry VIII not only permits an assessment of the reliability of his conclusions but also provides considerable data on the viability of Morton's method itself.
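The effect of expanding elisions before counting can be sketched in Python as follows. This is an illustration only: the expansion table and exception list are hypothetical (the paper does not give its rules in full), the text is assumed to use the plain ASCII apostrophe, and the program developed by Smith (1983b) is not reproduced here.

```python
import re

# Hypothetical expansion rules, keyed by lower-case token.
EXPANSIONS = {"'tis": "it is", "on't": "on it", "'t": "it", "'em": "them"}
# Forms left unaltered, following the ad hoc rules described in the text.
LEAVE_ALONE = {"o'clock", "'sblood"}


def tokenize(text):
    """Lower-case word tokens; apostrophes are kept inside tokens."""
    return re.findall(r"[a-z']+", text.lower())


def expand_elisions(tokens):
    """Replace each known elision by its full words, one token per word."""
    out = []
    for tok in tokens:
        if tok in LEAVE_ALONE:
            out.append(tok)
        else:
            out.extend(EXPANSIONS.get(tok, tok).split())
    return out


def count_collocation(words, first, second):
    """Return (times FIRST is followed by SECOND, times FIRST is used otherwise)."""
    followed = sum(1 for w, nxt in zip(words, words[1:])
                   if w == first and nxt == second)
    return followed, words.count(first) - followed


line = "'Tis the hour; think on't at nine o'clock."
raw = tokenize(line)
expanded = expand_elisions(raw)
print(count_collocation(raw, "is", "the"))
print(count_collocation(expanded, "is", "the"))
```

Keeping both the raw and the expanded token lists mirrors the practice, described above, of preparing two versions of each text.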

An Examination of Morton's Method by Means of Merriam's Investigation of Henry VIII

Some scholars maintain that Henry VIII was the work of Shakespeare alone while others feel that much of it was written by John Fletcher. James Spedding (1850) initiated the controversy by attributing about two-thirds of the play to Fletcher. Although later scholars, who believe that Henry VIII is not entirely the work of Shakespeare, disagree on how much of the play should be accredited to Fletcher, Merriam's starting point (1979) is Hickson's
independent division (1850), which was subsequently approved by Spedding. Since Merriam accredits this division of the play to Spedding alone, the same attribution will, for ease of reference, be used in this paper.

To determine if this distribution of Henry VIII is correct, Merriam selected the 20 tests shown in Table 2, by means of which he compared the portion deduced by Spedding to be Shakespearean with the remainder of the play. The outcome was that there was no statistically significant difference between the two parts, as measured by the 5% significance level of chi-square. Merriam therefore concluded that either Spedding was incorrect in his division, or the play was Shakespearean in its entirety.

For his initial test, one portion of the play, consisting of eight parts, was compared with the other portion containing seven. All parts except two correspond to scenes (one scene is divided). For subsequent testing Merriam retained the same fifteen parts and always compared a combination of eight different parts with the remaining seven.

Since the primary purpose of this paper is to investigate Morton's method, the objectivity of Merriam's procedures, the correctness of his statistical handling of the data and the quality of his deductions are not appraised in the argument which follows. Such a restriction can be imposed because, if Morton's approach is shown to fail, Merriam's work on Henry VIII is thereby invalidated.

As the main part of his study, Merriam (1979) performed about 200 different comparisons of eight parts of the play against seven, each involving the 20 tests of Table 2. He expected that if Henry VIII were written entirely by Shakespeare, the number of results showing statistical significance would be in accord with chance variation. His intention was stated in these terms: "If ...
we can show that a certain grouping of the scenes consistently maximises the chi squares obtained by contrasting Shakespearean and non-Shakespearean scenes, then we have discovered the correct division of the play." Of the 20 tests, Merriam found that "four produced at least one chi squared over 15, representing less than a chance of one in one thousand that the difference between the two groups was a random one . . . . Spedding was therefore certainly correct in proposing multiple authorship. His mistake lay in assigning the particular scenes he chose to the
two authors." Merriam also found a division of the play which gave rise to a value of chi-square of 66.35. Partially on literary grounds, he then divided another scene and recalculated the value of chi-square. Obtaining a figure of 72.90 for 20 degrees of freedom he stated: "On the basis of this exceptionally large chi squared, the chances are less than one in one thousand million" that Shakespeare wrote the entire play. As part of his study Merriam compared two acts of The Winter's Tale with the portion of Henry VIII he ascribed to Shakespeare. He obtained a value of chi-square of only 12.96 for 20 degrees of freedom which he interpreted as support for his division of Henry VIII.

Since The Winter's Tale is contemporaneous with Henry VIII (within a few years) and as its text and authorship are as reliable as any Shakespearean play, it is a very suitable control from which to determine if Merriam's statistical results for Henry VIII are likely to be due to a cause other than authorship. For this purpose The Winter's Tale was divided into 14 approximately equal parts for which Table 2 shows the counts of the 20 features. All possible combinations of seven parts were tested against the remaining seven. Thus, 1716 comparisons were performed, each consisting of 20 tests. For each test a 2 × 2 contingency table was formed and the value of chi-square both with and without Yates' correction was calculated. The former values are used for assessing individual tests while the latter are added to give a measure of the overall comparison of one part of the play with the remainder.

Table 3 shows, for each of the 1716 comparisons, a breakdown of the results for the individual tests and also a summary of the totals over the 20 tests. The totals indicate that about one-third of the comparisons reveal a difference between the two parts at the 5% level of significance.
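The exhaustive comparison can be sketched in Python as follows. The enumeration fixes one part on the first side so that each division of the 14 parts into two groups of seven is counted once, which gives the 1716 comparisons mentioned above. The helper names and the per-part counts in the usage example are hypothetical and are not taken from Table 2.

```python
from itertools import combinations


def half_splits(n_parts):
    """Yield every division of parts 0..n_parts-1 into two equal groups.

    Part 0 is always placed in the first group, so a division and its
    mirror image are not both produced.
    """
    if n_parts % 2:
        raise ValueError("an even number of parts is needed")
    others = range(1, n_parts)
    for rest in combinations(others, n_parts // 2 - 1):
        side_a = (0,) + rest
        side_b = tuple(p for p in others if p not in rest)
        yield side_a, side_b


def pooled(counts, parts):
    """Total count of one feature over a group of parts."""
    return sum(counts[p] for p in parts)


splits = list(half_splits(14))
print(len(splits))

# Invented per-part counts of a single feature (assumption, for illustration).
feature_counts = [3, 5, 2, 4, 6, 1, 3, 4, 2, 5, 3, 4, 2, 6]
side_a, side_b = splits[0]
print(pooled(feature_counts, side_a), pooled(feature_counts, side_b))
```

Each pair of pooled counts, together with the corresponding counts of the alternative feature, would supply one 2 × 2 table for one test within one comparison.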
Such a result implies that if Morton's criterion were adopted to distinguish authors, the chance of a wrong interpretation (Type I error, if the null hypothesis is assumed) would occur about once in every three cases. Moreover, when Yates' correction, with its associated reduction in the value of chi-square, is adopted, three of the 20 tests produced a value which exceeds 15, while the value of a fourth is 14.79. Such an outcome is very comparable with that reported by Merriam from examining Henry VIII. Furthermore, the last column in Table 3 shows the average of the six largest values of chi-square
for each test and for the comparisons as a whole. The proximity of the average to the maximum value confirms that the large values of chi-square are not exceptional occurrences. Also, the maximum overall value of chi-square of 60.92 for 18 degrees of freedom is broadly of the same magnitude as the corresponding figure of 66.35 for 20 degrees of freedom which Merriam obtained from Henry VIII. No obvious pattern emerges from the results of the individual tests, except that collectively they add to the general impression of pervading unreliability.

Excluding the possibility of problems concerning the authorship of The Winter's Tale, it is now evident that Merriam's results cannot be relied upon to refine, or add to, existing knowledge about the composition of Henry VIII.

Even though The Winter's Tale was edited by one scholar it could be argued that since 13 of the 20 tests depend on punctuation and since scribes, compositors or theatre personnel could have altered the text sufficiently to affect the remaining tests, the results do not necessarily detract from Morton's method. Such objections can be reduced by incorporating only tests which are independent of punctuation. When tests of position are excluded, comparisons between works edited by different scholars are meaningful. Therefore, in the next section, to examine the performance of Morton's method more rigorously, all possible combinations of seven parts of The Winter's Tale are tested against the plays (except Sir Thomas More) given in Table 1.

Comparisons of Portions of The Winter's Tale with Five Other Plays

After completing his study of Henry VIII, and recognizing the difficulties inherent in applying Morton's stylometry to works in which the author's original punctuation is not available, Merriam switched to tests which either do not depend at all, or depend only minimally, on that element of the text. (The present writer excludes any collocation which contains .:;!? between its two prescribed words.)
These are the tests defined in Table 1. They are used in this section to provide a more broadly based examination of Morton's method. The Winter's Tale is again divided into 14 parts as described in the Notes on Table 2, and all possible combinations of seven parts are compared with each of five other plays: Pericles, The White Devil, The Atheist's Tragedy, The Revenger's Tragedy, and Women Beware Women. The values of the
counts are again those given in Table 1, but in this case the variants in parentheses are adopted. These are the figures obtained when elisions in the scripts of the plays are expanded. Occurrences of the features in each part of The Winter's Tale for the 31 tests are listed in Table 4. Thus, by means of a 2 × 2 contingency table for each of the 31 tests, the 3432 possible combinations of seven parts of The Winter's Tale are compared in turn with each of the other five plays. As in the previous section, chi-square is evaluated for every contingency table, both with and without Yates' correction. The former values are used for determining the significance levels of the results for the individual tests, while the latter are added within each of the 3432 comparisons to give a total from which an overall significance level is calculated. In the context of testing Morton's method, the otherwise questionable use of Pericles is particularly appropriate because of the confident claims by both Morton and Merriam that it was all written by Shakespeare.

Table 5 summarizes the results of all the comparisons for each of the 31 tests. Morton, when describing how he had established his method for texts in English, stated unambiguously (1978, p. 132): "Only if the occurrence [of a collocation (or a word in a prescribed position)] was consistent within a sample [of text by the same author] and differed between samples [of text, each by different authors] was it accepted as being ... a test of authorship ..." From Table 5 none of the 31 tests distinguishes satisfactorily Shakespeare's work from that of the other authors. In fact, when the criteria adopted are that the number of occurrences should permit at least half of the comparisons to be valid and that 5% or less of the comparisons should not be significant at the 5% level, The Winter's Tale is distinguished only by Test 4 (i.e., the pair CAN/CANNOT) and only from Pericles.
This means that one test alone distinguishes only one play from The Winter's Tale, and that result is contrary to the ascription Morton and Merriam have claimed for Pericles! Turning now to the overall results at the end of Table 5, The Winter's Tale is distinguished by the same criterion from all the other plays except The White Devil and Pericles. Moreover, observing the percentages of comparisons in each of the categories of significance, there is perhaps a tenuous affinity between the two plays of which Shakespeare wrote all or part. It is therefore conceivable that a method sufficiently reliable to distinguish authors may be developed from Morton's approach.

That it failed so decisively during the preceding investigation is not surprising when we consider some of the individual tests in Table 5. In Test 1, a comparison of the occurrences of A and AN, the playwright's use of AN is determined by the word which follows. Since nouns and adjectives are generally deemed too dependent upon their context to distinguish authors, it follows that the use of AN is also unsuitable. Test 4, the word-pair CAN and CANNOT, is a curious choice in that it is only by convention that CANNOT is written as a single word. It would seem more logical to add the number of occurrences of CANNOT to the total for NOT in Test 6, which consists of the word-pair NO and NOT, and thereby to eliminate Test 4. Test 5, the word-pair DO and DID, compares tense and would be expected to exhibit a greater dependence on the plots of the dramas than on their authorship. In any case DO and DID are not conjugational past and present tense equivalents. This discrepancy alone would degrade any discriminatory qualities which might be attributed to them. In Test 12, AND fb TO occurs too infrequently to be of much value, as does AS fb A in Test 13, and DID pb I in Test 18. For reasons similar to those which cast doubt on the merits of Test 1, it would seem more appropriate to define the following word as A or AN rather than only A in Tests 13, 15, 23, 29 and 31. In practice, however, the inclusion of AN might have little effect. Insufficient thought seems to have been given to the verbs in Tests 16, 18, 20 and 28 in that one particular form alone is tested in each. The occurrence of that form, rather than other forms of the same verb associated in particular with a different person or tense, could reflect the subject matter of the plays as much as the authors' habits.
Test 25, ME fb TO, is another curiosity in that ME would normally be associated with the word(s) preceding it, while TO in general either would be a preposition governing a following noun or would form an infinitive.

Summary and Comment

It has been demonstrated that Morton's method as applied by Merriam cannot distinguish reliably between the work of Elizabethan and Jacobean playwrights. Consequently, there is insufficient evidence
-- to attribute most of Sir Thomas More to Shakespeare;
-- to ascribe Acts I and II of Pericles to Shakespeare;
-- to confirm that Henry VIII is not entirely Shakespeare's work, or to contribute to the distribution of its script between Shakespeare and Fletcher and/or any other playwright.
In short, Shakespearean scholarship has been shown to survive unscathed.

From a broader view, the weight and variety of evidence against Morton's method is now substantial. Of his work on Greek texts, Herdan (1965) wrote that "it can only serve to demonstrate how not to do it," while Smith (1984a, b, c) has found that Morton's own description and application of his method are utterly unconvincing. However, when Smith (1982a, b; 1983a) expanded the method by using large numbers of tests, similar in form to those advocated by Morton himself, and applied it to two cases for which the evidence of authorship derived through traditional literary approaches was already strong, almost all groups of tests confirmed the scholarly position. In view of their number, such an outcome would be unlikely to occur if the groups of tests had no powers of discrimination. There were, however, two exceptional groups, both based on personal pronouns (Smith, 1982b), but when they were investigated more widely, they were found to be lacking in powers of discrimination. Thus an enlargement of Morton's method could be valuable as a prelude to an in-depth study by suggesting the most or least likely candidates from a number of possible authors. If this approach were established, then, in order to eliminate other factors which could obscure authorship, it might be found beneficial when analyzing plays to distinguish between rapid dialogue and long reflective passages, verse and prose, and the like.
Finally, the reader is warned against comparing the figures presented in this paper for The Revenger's Tragedy with those for The Atheist's Tragedy and then announcing that a long outstanding problem of authorship has at last been settled!
No° 1
Tes t A
Con trol 14821
STM(1)X 2 335
STM(2)X2 312
0.59 AN ALL
1572 4037
41
888
BETTER
634
I~ 17
487
22
CAN
1315
41
4 828
12
DID
1794
24
5
Ii
3998
59
53
81
59 0.29
195
ON
3181
63
B5191) 56
4.9~ ~
32
77
2.76 ~O
0.35 28
O. 36 16
4.98 29
39(40) ~O 36 II 6.25 0.26 8.6~ 10.20 1.02 [39(142) i8(59) 40(43) 69(81) ]O4(110)
127
74
0.10 [55
60
15 0.04
51
1.24
34 30
22
0.00 ii(12)
194(196) 0. i~
38
0.30
1.39
O.621
7
43(44)
[83
16
35
89(92)
10
5.89
0.8
59
0.98
0.00 17
30
0.00
4090
23
9.50 13
23
DO
9080
53
76 22 13
0.86 24
0.~
39
2.1C 19
0.23
3,89
NO NOT
31 23
13
36
127(130)
Women X2 636
0.84
5O
2.09
19 13
20
0.o8
6
103(104)
RevenBer~
1351
0.3
53
O.0]
1.62
4.95 CANNOT
82
14 16 2.07
BEST
45
0.90
Atheistx2 405
0.06
26
87
Devil x2 547
0.85
39
I01
Winter X2 404
0.75
1.67 ANY
Periclesx2 301
11
55(60) 0.98
3.95
79
!21
1oo(106) O.00
0,32
4.54
0.50 305
69(90) 2.6~
76
0,08
t77
191
20(37) 30.11
118 0.15 249
38(59) 67(101) 19.58 6.40
UPON
1850
30
27
48(50)
58(59)
55(56)
1 47(49)
56(59)
62(70)
THEY
2618
73
67
60(62)
74(76)
116(125)
[ 44(51)
40(46)
76(87)
THEM
2075
70
THEIR
2315
66
8 O.41 9
0.26 64
1.29
No,
Test
Control
UP
1125
1.28 59
• STM(1)X2 25
28
ii
1850
30
WITH
7908
178
[66 0.13
383
7
3643
~80
fb TO
35
5
AS
762
122
WITHOUT 12
AND
~42 *
13
AT
31 317
2
36
ii
Ii
~03
185
fb A
35
9
BEEN
89
22
0.00
15
pb HAVE BY
25
5
457
82
fb THE DID
60
17
227
24
pb I
42
3
37
629
4 0.34
0.13
36(137) 1.15
250(253) 0.04
9(iO)
9
73 17
O.00 17(23)
43(44) *
38 0.25
6
120
iii
0.02
8
~43
117(118) O.18
4 ~28 0.30
0.0 9(14) 36
O.00 5
Table 1. Comparison of two versions of Sir Thomas More and other plays with Merriam's Shakespeare Control.
0.04 7(8)
71
0.0l
30
21 O.69!
5
85
0.28 6
28
0(2)
39(40) 1.40
13 199 1.40
*
12
O.01 8 [17(120) 0.18 0.01
5
17
13(15)
11
2 49
i0
5
204
*
172
0.03
O.O9 5
104
65(67) 0.08
24(25) 4.65
565 *
1
2 50(52) 2.21
9.10 16
* O
*
166
143(148) 1.52
II
2.08
13
125 2.49
19
23
0.09 9
95 4.44
478
4.51 62(70)
145
24.23
486(488) 3.82 i0
57
o.oe 56(59)
23
7
38
0.59 10
156(158)
0.02
6(7)
9
26
206(208)
9
60
7
47(49~
2.12
0.O1
5
2
228
Women X2
o.oo
55(56)
11 0.75
32.45 i 54
0.75
0.34
O. 19
0.00
*
18
29
9
70
20
2.69
17
41(43)
16.O4 10(53)
66
0.00
0.52
6 0.06
16
1.35
7
72
845
BE
~
12
131 *
0.41 fb THE
Atheistx2
199(200)
8
2
74
14
186
[19 *
fb A
Devil X2
* 4
1.56
Winter X2
0.00
7
O.Ii 7(45) 66 28.35
87
58(59)
t33 0.03
7
1.42
0.00 ~8(50)
27
0.02 11(29) 23.21
45
34
1.87
2.5~ 71
O.12 53
25
1.03
0.95 61(67)
S'£'M(2)X2 Periclesx2
i0 UPON
0.02 44(45)
69(79) 0.25 7(10) II
O.7~ 4
* 4
No.
Test
Control
FOR
982
. STH(I)x2 208
19
STM(2)X2 184
5.55 fb THE HAVE
20
pb I IN
50
20
806
138
206
44 280
258
37
1193
198
21 IS
2.49
LIKE
90
18
240
41
fb A MAY
66
I0
250
52
33
8
ME
1001
193
25
OF
26
50 2173
0.09
9
14(15)
7
361
)01(304)
341 0.43
0.03
0.74
72
15
15
IO
468
63
60
55(60)
fb THE
58
i0
Test
Control
27
0.33
No.
SHALL
515
. STM(1)×2 87
28
;8
65(66)
fb BE SUCH
57
13
182
44
13
TO
53
14
2613
~81
119
24
fb BE WITH
1026
L78
~53 O.00 20 L66
fb A
57
T o t a l chi-square
D e g r e e s o f f r e e d o m (dr)
15
1.27 9(22)
112(113) 0.29 15 37
133(362)
0.14 9(22) 44 12
41
0.89
65 0.30
0.72 22
49
45(47)
O. 70 8
0.00
2.02
7
218(219)
2
209
157
0.47
2.00 14
406(415) 7.14
15(34)
9
41
148(515)
0.19 12(35)
0.02
2.O1
2,13
O. 73
5(6)
iI
584(623) 0.99
14
251(264) 4.00
420(428) 2.70
2
7
20(37)
38(59)
67(lOl)
5(lO)
2(4)
2(3)
0.00
Devil ×2 142(144) 0.56 12 36
1.14 7
143(278)
O.66
9(15)
I Winter ×2
72(113)
4.31
Atheist×2 70(74) 6.8( 16 21
91(92) 7.34 20
93
33
4O
5.5~
0.09 12
12
Women X2
O.ii 12
2.73 15
0.05 13
~89(493) 0.38 26
617(623) 14.18 52
591(610) O.OO 26
~68(470) 518(522) $32(651) 22.39 2.7] 1.22 47 33 36
L33
199(2OO) 0.42 14
206(208) 6.68 22
156(158) L45 2.22 14 15
2.47
1.06 4
15
30.03 17(41)
0.00
7.63 17
14
1.74
31
30 0.30
0.09
3O
8
40 0.30
fb A
0.01
1.54
12.58 26(40)
69(90)
12(13)
Pericles×2
5.02 21(44)
3
0.06
STM(2)XZ
0.74
470(495) 0.76
lO0(106)
8(9)
6.22 35(67)
5
20
0.53 IO
16.34 74(76) 332(381)
1205
19
0.01 12
181(201)
3.69
2.83
13
2.16 22(30) 297(338)
10
222
163
170
242(261)
0.73 117(133)
44
9
ON
fb THIS
29
8
8
187(193)
8.28 21(25)
5.35 28(33) 198(237)
O.O1
O.17
Women X2
0.00 54(55) 311(360)
32
61(62)
48
195(197)
0.22 15(i7)
3.55 7
O.O9
O.O0 fb TO
15
246(254)
98
0.24
9 0.04
fh BE
51
0.15
50
24
2.45 11(18)
64
39
]Atheistx2
183(378)
245(390) 0.46
8(22)
0.05
4.18 27(47)
0.37 17
23
3.42 143(267)
186 O.371
2.71 74 233(256)
24(28)
35
22 fb THE
28 204(211)
Devil X2
214(217)
237(241) 1.19
1.56 40 268
3.22 19
135(136)
128
2.39 fb THE
0.25 II
18
Winter X2 225(230)
173(181) 5.38
2.11
1503
Pericles×2
[43(148) 4.25
9.20 18
37.38
41.98
47.80
62.17
70.35
136.4£
137.O5
169.89
28
28
31
31
31
27
28
30
Table 1 (cont.)
Notes on Table 1

The values of chi-square for the individual tests incorporate Yates' correction. The total values of chi-square are the sums of values of chi-square from the individual tests calculated without Yates' correction.

* indicates that an expected value less than 5.00 was encountered. Such a test is therefore eliminated.

For plays involved in further tests (for which results are given in later tables) the values in parentheses are the revised counts when elisions are expanded.

Test 8 compares the rate of occurrence of THEY with THEM + THEIR. Test 9 compares the rate of occurrence of THEM with THEY + THEIR.

STM(1) is the entire published text of Sir Thomas More in William Shakespeare: The Complete Works. Ed. C.J. Sissons. (Odhams Press Ltd., Long Acre, London, 1954). Stage directions are omitted. Use of A (with or without an apostrophe) other than as the indefinite article is excluded from the total in Test 1 and does not affect Tests 13, 15, 23, 29 and 31. Use of AN where W.W. Greg (The Book of Sir Thomas More. The Malone Society Reprints. Oxford University Press, 1911) gives AND is altered to AND in the text entered into the computer.

STM(2) is obtained from STM(1) by eliminating lines which appear to be later additions and written in hands which have been identified. Thus, the lines below have been removed:
Act II, scene i, ll. 1-15, 18, 19, 32-35, 45, 46, 51-54, 66-70, 77, 78. (Hand B)†
Act II, scene iv, ll. 44, 73, 74, 84-90. (Hand B)†
Act III, scene i, ll. 259-292. (Hand E)†
scene iii, ll. 293-346. (Hand B)†
Act IV, scene iv, ll. 57-125. (Hand A)†
† Greg's identification (1911)
The removal of these lines reduces the total number of words in the play from about 20,040 to about 18,560.

Test 6. While "No, no." has been tolerated when appearing once within a play, the count of NO is amended in STM(1) and STM(2) by subtracting 4 and in Atheist by subtracting 5 to obtain the figures shown. This is to regularize an obvious source of distortion, the repeated use of "No."

The figures for the Shakespeare Control are those published by Merriam (1982). In Tests 1-11, when Pericles and The Winter's Tale are tested against the Control, the occurrences in these plays are removed, respectively, from the Control. Figures from Pericles are not incorporated in the Control for Tests 12-31. However, it was not possible to remove the contribution from The Winter's Tale from the Control for Tests 12-31. Therefore the values calculated for chi-square when The Winter's Tale is compared with the Control may be very slightly too small.

For the first eleven tests the figures are taken from Marvin Spevack's concordances of Shakespeare's complete works. The counts therefore include the poetic works. Since comparisons between works in different literary forms are probably invalid, it would have been good practice to subtract the figures for the non-dramatic material from Spevack's totals. However, since the poems and sonnets constitute a small part of Shakespeare's total output, any difference resulting from such a course of action would not only be negligible but would be swamped by the effect of the uncertainties due to Merriam's omission of words contracted in elisions.

For the remaining tests Merriam obtained his figures for the Control by counting the occurrences of the features in Hamlet, King Lear, Julius Caesar and Titus Andronicus. To these he added values obtained by taking one page in every 41 pages over 26 plays in the Alexander edition of The Complete Works. Excluded were Henry VIII, Henry VI Part One and Pericles, but Hamlet was retained, although the figures already included the values from that play taken complete.

The text used for Pericles (c1608) is taken from the new Arden edition (Methuen, London, 1963). Winter is The Winter's Tale (c1610), taken from the new Arden edition (Methuen, London, 1963). Devil denotes the play The White Devil (c1612) by John Webster. The text used is that given in the series The Revel Plays (Manchester University Press, 1960). Atheist, Revenger and Women denote, respectively, The Atheist's Tragedy (c1609) by Cyril Tourneur, The Revenger's Tragedy (c1607) (which is an anonymous play), and Women Beware Women (c1623) by Thomas Middleton. These three plays were taken from the volume Jacobean Tragedies edited by A.H. Gomme (Oxford University Press, 1969). In these three plays reflexive pronouns and some other words, printed as two separate words, have been reproduced in the computer text as single words in accordance with present-day convention. In all cases only the words actually spoken on stage are contained in the computer text.
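The chi-square procedure stated in the Notes on Table 1 can be sketched as follows. This is a minimal illustration in Python, not the program used for the paper: the function names are hypothetical, the counts in the usage lines are invented, and the approximation of the significance level is valid only for a single 2 x 2 table (one degree of freedom). Clamping the corrected deviation at zero is a common convention and is an assumption here.

```python
import math

def chi_square_2x2(a, b, c, d, yates=True):
    """Chi-square for the 2 x 2 table [[a, b], [c, d]].

    Returns None when any expected value is less than 5, mirroring the
    rule in the Notes that such a test is eliminated.
    """
    n = a + b + c + d
    if n == 0:
        return None
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    # Expected values in the same order as the observed cells a, b, c, d.
    expected = [r * k / n for r in row_totals for k in col_totals]
    if min(expected) < 5.0:
        return None
    chi2 = 0.0
    for observed, exp in zip((a, b, c, d), expected):
        diff = abs(observed - exp)
        if yates:
            # Yates' correction: reduce each absolute deviation by 0.5
            # (assumption: never below zero).
            diff = max(diff - 0.5, 0.0)
        chi2 += diff * diff / exp
    return chi2

def p_value_1df(chi2):
    """Upper-tail probability of chi-square with one degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))

def total_chi_square(tables):
    """Sum the uncorrected values over the valid tests.

    Returns (total, degrees_of_freedom), one degree of freedom per valid test.
    """
    values = [chi_square_2x2(*t, yates=False) for t in tables]
    valid = [v for v in values if v is not None]
    return sum(valid), len(valid)

# Invented counts, for illustration only.
print(chi_square_2x2(10, 20, 30, 40))               # with Yates' correction
print(chi_square_2x2(10, 20, 30, 40, yates=False))  # without
print(chi_square_2x2(1, 2, 300, 400))               # None: expected value < 5
```

For Tests 8 and 9 the second cell of each row would be the sum of two counts (for example THEM + THEIR), since those tests compare one pronoun with the other two combined.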
For each test the first row gives the count of the first word (or word-pair member) in each part, and the second row gives the count of the stated feature or second word.

              Part
No. Test         1   2   3   4   5   6   7   8   9  10  11  12  13  14
  1 A           33  34  30  21  27  32  39  29  39  21  38  23  24  14
    fws          1   1   1   7   1   7  10   0   0   1   3   0   1   0
  2 AND         51  47  35  43  50  55  52  51  48  42  41  46  41  27
    fws          9   5   6  10   5  11   7   7  10   9   7  10   7   5
  3 AS          23  26  16  17  16   8   7  16  14  21   9  23   6  26
    fws          0   1   1   1   1   2   1   0   0   1   0   0   0   2
  4 BUT         19  13  15  11  18  16  13  15  19  24  16  15  14  16
    fws          8   7   5   6   4   6   6   4   9   4   6   4   9  10
  5 FOR         13  15  24  15  25  22  14  14  17  20  14  12  10  15
    fws          2   6   5   7   7   4   2   4   2   5   2   2   5   8
  6 IF           5  16  10   7   6  14   8   6   7  13   8   4   8   8
    fws          2   8   4   2   4  10   6   2   4   5   4   2   3   6
  7 IN          21  24  21  14  22  15  18  17  13  18  19  16  23  15
    fws          1   2   1   2   2   2   1   0   0   0   2   0   0   0
  8 IT          38  37  31  50  35  46  17  22  35  19  31  17  28  35
    fws          5   3   1  10   1   3   2   1   6   4   1   3   0   4
  9 IT          38  37  31  50  35  46  17  22  35  19  31  17  28  35
    lws          7   8   5   7   9  12   3   3   9   5   9   2   6   4
 10 NO          12   7  11   7  12   6  11  10   6  10   9  10   7   9
    fws          4   1   4   3   1   0   4   5   3   2   1   3   2   5
 11 OF          23  32  22  34  50  35  54  41  35  35  25  27  58  24
    fws          1   0   0   0   0   0   2   1   0   0   0   0   0   0
 12 THAT        26  34  23  31  22  12  27  26  18  26  19  26  23  26
    fws          1   1   1   1   4   1   3   0   1   2   1   2   1   1
 13 THE         46  49  45  53  69 105  69  64  42  48  75  54  84  40
    fws          3   2   4   5  10  11   8   2   2   2   6   7   8   5
 14 A           33  34  30  21  27  32  39  29  39  21  38  23  24  14
    AN           3   2   4   2   3   1   3   5   0   1   6   3   5   7
 15 ALL          5  10   8   6  15   3   4   9   9   5   4  12  10   4
    ANY          4   1   0   2   1   5   1   3   4   2   2   1   3   2
 16 NO          12   7  11   7  12   6  11  10   6  10   9  10   7   9
    NOT         24  26  20  27  24  18  12  17  26  27  22  24  17  21
 17 BY           9  15   8   8   6  13   4   7   9  11   5   9  10  11
    fb THE       1   3   0   0   1   3   1   2   4   2   2   0   2   2
 18 IN          21  24  21  14  22  15  18  17  13  18  19  16  23  15
    fb THE       3   1   5   0   4   6   7   3   1   5   2   1   7   2
 19 IT          38  37  31  50  35  46  17  22  35  19  31  17  28  35
    fb IS        9  11   5  13   6   5   5   4   6   2   3   6   2  11
 20 TO          49  51  36  42  60  49  42  43  33  49  51  43  37  38
    fb THE       4   3   4   6   8  11   3   1   4   2  10   4   4   0

Table 2. The occurrences of features in 14 parts of The Winter's Tale for each of 20 tests.
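The positional features counted in Table 2 can be sketched in Python as below. This is a hedged illustration rather than the program used for the paper: it assumes that fws and lws denote the first and last word of a sentence and that fb and pb denote "followed by" and "preceded by", it uses the sentence terminators listed in the Notes on Table 2, and the tokenizer, the function names and the `starts_sentence` flag are all hypothetical. The flag reflects the situation described in the Notes, where a part of the play may begin in the middle of a sentence.

```python
import re

# Sentence terminators as listed in the Notes on Table 2.
TERMINATORS = set(".:;!?")

def tokenize(text):
    """Split text into upper-case words and sentence terminators."""
    return re.findall(r"[A-Za-z']+|[.:;!?]", text.upper())

def count_features(text, word, partner=None, starts_sentence=True):
    """Count occurrences of `word` and of its positional features.

    total: all occurrences of the word
    fws:   occurrences as first word of a sentence
    lws:   occurrences as last word of a sentence
    fb:    occurrences immediately followed by `partner`
    pb:    occurrences immediately preceded by `partner`
    """
    tokens = tokenize(text)
    word = word.upper()
    partner = partner.upper() if partner else None
    counts = {"total": 0, "fws": 0, "lws": 0, "fb": 0, "pb": 0}
    for i, tok in enumerate(tokens):
        if tok != word:
            continue
        counts["total"] += 1
        prev = tokens[i - 1] if i > 0 else None
        nxt = tokens[i + 1] if i + 1 < len(tokens) else None
        # The first token of a sample opens a sentence only if the sample
        # itself starts at a sentence boundary.
        if (prev is None and starts_sentence) or prev in TERMINATORS:
            counts["fws"] += 1
        if nxt in TERMINATORS:
            counts["lws"] += 1
        if partner is not None and nxt == partner:
            counts["fb"] += 1
        if partner is not None and prev == partner:
            counts["pb"] += 1
    return counts

sample = "It is cold. It was the best of it; in the end it is done!"
print(count_features(sample, "IT", partner="IS"))
print(count_features(sample, "IT", partner="IS", starts_sentence=False))
```

Passing `starts_sentence=False` for a part that begins in mid-sentence avoids counting its opening word as fws.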
Notes on Table 2

The text adopted for The Winter's Tale is the new Arden edition (Methuen, London, 1963). The parts of the play are:

Part  Section of play
  1   I ii 1-232
  2   I ii 233-465
  3   II i
  4   II ii, III i, III iii
  5   II iii
  6   III ii
  7   IV i, ii, iii
  8   IV iv 1-210
  9   IV iv 211-421
 10   IV iv 422-633
 11   IV iv 634-843
 12   V i
 13   V ii
 14   I i, V iii

The symbols .:;!? are accepted as terminators of sentences.

NB: One count had to be corrected manually due to OF at the beginning of I ii 233 being treated by the computer program as fws.

Notes on Table 3

The ranges of significance include the upper value but exclude the lower. A test is accepted only when all expected values are not less than five. For the individual tests the values of chi-square and hence the significance levels are calculated using Yates' correction. The total values of chi-square are the sums of the values of chi-square from the individual tests calculated without Yates' correction. These total values are used to obtain the "total" significance levels of the comparisons.
No. Test       Total no. of  Not significant  Significant between (%)                      Max χ²  Size of 6th
               comparisons   at 5% level      2.5-5.0  1.0-2.5  0.5-1.0  0.1-1.0  0-0.1            largest χ²
  1 A fws      1716          1268             82       36       16       66       248      22.61   21.65
  2 AND fws    1716          1707             9        0        0        0        0        4.41    4.18
  3 AS fws     40            40               0        0        0        0        0        2.61    1.78
  4 BUT fws    1716          1535             72       56       26       22       5        13.20   11.49
  5 FOR fws    1716          1606             41       45       11       12       1        11.33   9.49
  6 IF fws     1716          1675             32       8        1        0        0        6.74    6.15
  7 IN fws     1716          1691             16       6        3        0        0        7.31    6.27
  8 IT fws     1716          1512             84       59       25       28       8        15.68   12.83
  9 IT lws     1716          1693             18       4        1        0        0        6.68    5.61
 10 NO fws     1716          1654             37       16       6        3        0        9.04    7.88
 11 OF fws     1716          1711             5        0        0        0        0        4.66    4.16
 12 THAT fws   0
 13 THE fws    1716          1674             28       11       3        0        0        7.63    6.69
 14 A/AN       1716          1549             77       56       20       13       1        11.22   10.05
 15 ALL/ANY    1716          1599             54       29       21       13       0        9.49    9.00
 16 NO/NOT     1716          1702             12       2        0        0        0        5.11    4.95
 17 BY fb THE  1716          1695             20       1        0        0        0        5.13    4.75
 18 IN fb THE  1716          1512             88       50       30       26       10       14.79   13.13
 19 IT fb IS   1716          1571             56       50       16       18       5        13.47   11.87
 20 TO fb THE  1716          1449             95       82       31       43       16       15.38   13.74

Overall total for the 20 tests: 1716 comparisons; 1154 not significant at 5% level (including 22 95%); significant between 2.5-5.0: 123; 1.0-2.5: 143; 0.5-1.0: 87; 0.1-1.0: 124; 0-0.1: 85; max χ² 60.92 (18 df); size of 6th largest χ² 56.71 (18 df).

Table 3. Summary of all possible comparisons of seven parts of The Winter's Tale with the remaining seven parts.
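The numbers of comparisons quoted in Tables 3 and 5 follow from simple counting: there are 3432 ways of choosing seven of the 14 parts, and half as many (1716) distinct divisions of the play into two halves of seven parts. The Python sketch below enumerates both and pools the counts for one 2 x 2 table. The function names are hypothetical; the A and AN counts are those listed for Test 14 in Table 2.

```python
from itertools import combinations

# Counts of A and AN in the 14 parts (Test 14, Table 2).
A_COUNTS = [33, 34, 30, 21, 27, 32, 39, 29, 39, 21, 38, 23, 24, 14]
AN_COUNTS = [3, 2, 4, 2, 3, 1, 3, 5, 0, 1, 6, 3, 5, 7]
PARTS = range(14)  # parts 1-14, indexed from 0

def seven_part_samples():
    """All selections of seven parts, as compared with another play."""
    return list(combinations(PARTS, 7))

def complementary_splits():
    """Each division of the play into two halves of seven parts, once.

    Requiring the first part to be in the first half avoids counting a
    split and its mirror image separately.
    """
    splits = []
    for first in combinations(PARTS, 7):
        if 0 in first:
            second = tuple(p for p in PARTS if p not in first)
            splits.append((first, second))
    return splits

def pooled_table(first, second, x_counts, y_counts):
    """Pool the part counts into the four cells of a 2 x 2 table."""
    a = sum(x_counts[p] for p in first)
    b = sum(y_counts[p] for p in first)
    c = sum(x_counts[p] for p in second)
    d = sum(y_counts[p] for p in second)
    return a, b, c, d

print(len(seven_part_samples()))    # 3432
print(len(complementary_splits()))  # 1716
first, second = complementary_splits()[0]
print(pooled_table(first, second, A_COUNTS, AN_COUNTS))  # (216, 18, 188, 27)
```

Each pooled table would then be passed to a chi-square routine, and the outcomes tallied by significance range to give a row of the summary table.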
For each test the first row gives the count of the first word in each part, and the second row gives the count of occurrences followed by (fb) the stated word.

              Part
No. Test         1   2   3   4   5   6   7   8   9  10  11  12  13  14
 21 IN          21  24  21  14  22  15  18  17  13  18  19  16  23  15
    fb THE       3   1   5   0   4   6   7   3   1   5   2   1   7   2
 22 IS          32  42  35  27  31  30  23  20  28  17  34  29  11  31
    fb THE       0   2   0   2   1   1   3   2   1   1   3   2   0   0
 23 LIKE         7   3   2   4   3   7   0   4   1   1   6   4   4   5
    fb A         2   0   0   0   0   1   0   3   0   0   0   0   1   0
 24 MAY          7   4   3   4   0   7   7   0   1  11   8   2   3   5
    fb BE        0   1   2   0   0   0   2   0   1   1   2   0   0   0
 25 ME          18  23  15  21  19  11  29  10  14   9  14  18  11  10
    fb TO        2   1   1   0   7   2   0   0   0   0   1   3   0   2
 26 OF          23  32  22  34  50  35  54  41  35  35  25  27  58  24
    fb THIS      1   1   3   1   0   1   2   0   2   3   1   2   2   1
 27 ON           6   4  11  14   4   9   7  11   6   7   8   9   2   8
    fb THE       1   0   0   0   0   1   4   3   0   0   2   1   0   1
 28 SHALL        6   3  15   8  13   7   5   4   6  16  14   8   1   7
    fb BE        1   0   0   1   4   0   1   0   0   2   3   2   0   1
 29 SUCH         1   2   1   3   2   3   3   2   2   2   1   6   6   3
    fb A         1   1   0   0   1   0   2   0   1   0   0   0   1   0
 30 TO          49  51  36  42  60  49  42  43  33  49  51  43  37  38
    fb BE        6   4   1   2   3   5   2   2   3   4   7   3   7   3
 31 WITH        23  16  19  13  15  10  15  17   6  11  15  18  11  11
    fb A         0   3   2   0   1   0   0   2   0   1   2   2   1   0

Table 4. The occurrences of features in 14 parts of The Winter's Tale (as defined in the Notes on Table 2) for tests which are not dependent upon punctuation.
Percentage of comparisons No, Test
Winter VS
I A/AN
2 ALL/ANY
3 BETTER/ BEST
4 CAN/ CANNOT
5 DID/DO
6 NO/NOT
7 ON/UPON
8 THEY/ THEM+ THEIR
9 THEM THEY+ THEIR
I0 UP/UPON
iI WITH/ WITHOUT
.!
Total n~ of comp- Not significant arisons at 5% level
Atheist 3432 3432 Devil pe---~les 3432 Revenge_._._r 3432 Uomen 3432 3432 Atheist 3432 iDevil !Pericles 3432 Revenger 3432 3432 Wom~n 3432 Atheist 3432 Devil Pericles 3432 Reven~er 3432 3432 Atheist 3432 Devil 1432 ~_/_icles 3432 Revenger 3432 t,bmen 3432 A theist 3432 D evil 3432 P eric ]e~ 34~2 Kevengel 3432 Women 3432 Atheist 3432 Devil 3432 pe ~ l e ~ 3432 3432 3432 Atheist 3432 Devil 3432 Pericles 3432 Revenger 3432 3432 Atheist 3432 D~ev~-Vil-- 3432 p'ericles 3432 Reven~er 3432 3432 iwomen Atheist 3432 Devil 3432 Pericles 3432 Reve----~ 3432 Women 3432 Atheist 3432 3432 Periclesi3432 eve~ 3432 Women 3432 Atheist Devl--~ Pericles Revenger ~m~,
3432 2795 2786 3429 3432
Significant between
(%)
2.5-5.0
1.0-2.5
0.5-1.0
0. I-I.O
99.45 98.57 97.58 100.00 96.18 100.00 63.52 98.83 98.89 74.91 100.00 99.91 100.00 98.83 100.00 30.71 99.36 3.67 73.69 20.22 94.61 98.14 94.41 23.46 42.74 100.00 98.57 100.00 100.00 100.00 11.01 99.65 89.77 56.96 97.81 100.00 100.00 100.00 74.01 100.00 23.08 54.66 87.47 90.38 79.40 100.00 99.18 100.00 100.00 97.29
0.47 1.19 1.92 0 2.71 0 12.12 1.08 1.02 9.99 0 0.09 0 0.84 0 15.62 0.55 4.20 11.60 12.70 3.58 1.43 3.55 16.78 16.96 0 1.40 0 0 0 13.64 0.26 6.09 17.25 1.60 0 0 0 16.05 0 11.33 12.24 7.11 5.65 9.59 0 0.73 0 0 2.19
0.09 0.23 0.50 0 0.96 0 12.18 0.09 0.09 7.43 0 0 0 0.32 0 19.41 0.09 9.70 8.62 15.85 1.52 0.41 1.69 21.04 19.03 0 0.03 0 0 0 23.02 0.09 3.21 14.48 0.50 0 0 0 8.36 0 15.97 13.00 4.05 3.21 6.67 0 0.09 0 0 0.52
0 0 0 0 0.15 0 4.66 0 0 3.85 0 0 0 0 0 11.54 0 10.52 3.76 11.95 0.29 0.03 0.32 13.46 9.04 0 0 0 0 0 16.20 0 0.67 5.97 0.09 0 0 0 1.49 0 11.31 7.26 1.19 0.70 2.68 0 0 0 0 0
0 0 0 0 0 0 6.12 0 0 2.97 0 0 0 0 0 16.05 0 25.44 2.13 22.87 0 0 0.03 18.15 10.02 0 0 0 0 0 25.50 0 0.26 4.72 0 0 0 0 0.09 0 21.68 9.59 0.17 0.06 1.63 0 0 0 0 0
6.67 0 46.47 0.20 16.40 0 0 0 7.11 2.19 0 0 0 0 0 10.64 0 0 0.61 0 0 0 0 0 0 16.64 3.24 0 0 0.03 0 0 0 0 0
77.04 100.00 100.00 100.00 99.83
17.64 0 0
5.64 0 0
0.26 0 0
0 0 0
0 0 0
0 0
0 0
0 0
0 0
0 0.17
Table 5 (cont.)
0-0.1
0 1.40 0 0 0.84
Percentage of comparisons No. Test
Winter VS
Total n o of comp- Not Significan= arisons at 5% level
Significant between 2.5-5.0
12 AND fb TO
Atheist 0 Devil 3098 _pe--~les _ 1513 Rev~ O ~om~n O O 13 AS Atheist fb A D-~I 2099 Pericles 317 O Revenger Women 7 14 AT A~theist 2310 fb THE Devil 190 Pe~les 143 Revenger 132 46 W omPn 380 15 BE Atheist 1645 fb A Devil Pericles 3262 26 W ~ 377 Atheist 0 16 B E E N 3418 pb HAVE Devil Pericles 3002 R~venser 207 W °men 2913 17 BY Atheist 3432 fb THE ~ 3432 Pericles 3432 Revenger 3432 Women 3432 18 DID Atheist 365 pb I Devil 1505 Pericles 39 l~evenger 20 Wom~ 0 19 FOR Atheist 3432 fb THE Devil 3432 :P-~es 3431 Kevenger 3432 Women 3318 20 HAVE Atheist 3432 pb I ~evil 3432 Pericles 3432 el~venger L432 ] ~ 3432 AEheist 3432 21 IN fb THE Devil 3432 pericles 3432 Revenger 3432 3432 Atheist 3432 22 IS 3432 fb THE Devil .pe--~les 3432 Reve~ 3432 oW~ 3432
1.0-2.5
100.00
100.00
93.57 100.00 100.00 100.00 99.47 95.80 94.70 97.83 100.00 100.00 100.00 100.00 100.00
5.34
1.10
0 0.53 4.20 5.30 2.17
0.5-1.0
(%)
0.1-1.0
0-0.1
0 0
0 0
0 0
0 0
0 0
0 0
0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0
83.56 100.00 100.00 100.00 100.00 98.83 100.00 100.00 99.97 100.00 100.00 100.00 100.00
13.19 0 0 0 0 1.17 0 0 0.03 0 0 0 0
3.10 0 0 0 0 0 0 0 0 0 0 0 0
0.15 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0
90.53 99.10 98.95 99.59 83.00 100.00 95.37 62.97 84.73 88.23 99.56 99.21 87.21 62.50 44.67 91.52 99.80 88.05 51.63 98.08
5.27 0.76 0.90 0.35 7.96 0 3.76 15.68 10.64 7.87 0.32 0.61 6.56 11.28 11.86 5.77 0.20 7.87 19.93 1.52
2.97 0.15 0.15 0.06 5.76 0 0.87 15.03 4.17 3.44 0.12 0.17 4.28 11.25 13.67 2.36 0 3.38 17.37 0.41
0.87 0 0 0 2.05 0 0 4.84 0.47 0.47 0 0 1.28 5.80 8.60 0.32 0 0.52 6.56 0
0.35 0 0 0 1.18 0 0 1.49 0 0 0 0 0.67 6.85 12.82 0.03 0 0.17 4.25 0
0 0 0 0 0.06 0 0 0 0 0 0 0 0 2.33 8.39 0 0 0 0.26 0
Table 5 (cont.)
Percentage of comparisons No. Test
Winter VS
23 LIKE fb A
Atheist Devil P ericl~s Revenger Wom~n A~heist Devil Pericles Revenger ~omen Atheist ~evil Pericles Keven~er ~omen A thelst Devil ~er1-~es Revenger women Atheist Devil ericles Revenger Women A.theist Devil P - - ~ le s Reven~ Women A theist D evil Pericles ~evenger Wome~ Atheist
Total no. of .comp- Not significant arlsons at 5% level
2540 3366 2084 1419 3331 24 MAY 2125 fb BE 2749 1561 984 3 25 ME 3423 fb TO 1758 3432 3429 1946 26 OF 3432 fb ~ I S 1382 3432 2675 3300 27 ON 3336 fb THE 3430 3328 571 104 28 SHALL 3432 fb BE ~251 3163 3432 3278 3429 29 SUCH ~216 fb A 3432 3357 ~034 3432 30 TO 3432 fb BE Pericles 3432 !Revenger 3432 4omen 3432 3432 31 WITH 3432 fb A ericles L325 IKevenger 3432 ~omen 3432 3432 3VERALLTOTAL Atheist D evil )432 FOR THE 31 Pericles )432 TESTS 3432 W omen )432
97.99 64.35 99.57 100.00 63.82 100.00 99.78 100.00 100.00 33.33 93.78 5.12 98.69 98.75 15.62 88.49 0 100.00 16.82 68.42 76.20 97.17 100.00 96.24 0 95.60 95.38 100.00 93.85 100.00 37.65 99.19 30.54 70.93 99.12 96.82 25.29 74.42 99.59 75.50 99.91 98.19 88.53 99.27 94.61 1.60 5.74 5.65 0.64 0.09
Significant between (%) 2.5-5.0
1.0-2.5
0.5-1.0
0.1-1.0
1.26 15.09 0.43 0 14.80 0 0.11 0 0 66.67 3.94 5.80 0.87 1.05 9.82 6.88 0.72 0 29.72 15.88 10.94 2.19 0 3.76 0 3.35 4.04 0 4.37 0 19.71 0.27 15.12 13.97 0.29 2.68 15.41 13.26 0.41 12.65 0.09 1.40 10.19 0.64 3.82 2.77 5.51 8.97 2.27 0.67
0.75 12.86 0 0 14.05 0 0.11 0 0 0 1.81 14.96 0.38 0.20 18.19 3.93 14.91 0 32.07 10.88 9.62 0.61 0 0 3.84 0.99 0.58 0 1.66 0 19.66 0.54 22.52 10.19 0.59 0.50 20.95 9.32 0 8.42 0 0.38 1.28 0.09 1.40 5.42 10.55 17.07 7.58 2.65
0 5.41 0 0 5.01 0 0 0 0 0 0.44 13.88 0.06 0 13.82 0.67 12.01 0 13.42 3.39 3.12 0.03 0 0 15.38 0.06 0 0 0.12 0 12.39 0 11.51 3.81 0 0 13.17 2.33 0 2.53 0 0.03 0 0 0.15 6.59 9.29 15.18 7.02 4.05
0 2.29 0 0 2.31 0 0 0 0 0 0.03 32.14 0 0 26.31 0.03 46.74 0 7.93 1.42 0.12 0 0 0 58.65 0 0 0 0 0 9.68 0 16.29 1.10 0 0 19.00 0.67 0 0.90 0 0 0 0 0.03 19.52 23.02 29.46 21.88 10.93
0-0.1
0 0 0 0 0 0 0 0 0 0 0 28.10 0 0 16.24 0 25.62 0 0.04 0 0 0 0 0 22.12 0 0 0 0 0 0.90 0 4.02 0 0 0 6.18 0 0 0 0 0 0 0 0 64.10 45.89 23.66 60.61 81.61
Table 5. Summary of all possible combinations of seven parts of The Winter's Tale with five other plays.
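The criterion applied to Table 5 in the text (at least half of the comparisons valid, and 5% or less of them not significant at the 5% level) can be written as a short check. The Python sketch below is illustrative only: the function and parameter names are hypothetical, and the figures in the usage lines are as read from Table 5.

```python
def distinguishes(valid, not_significant_pct, possible=3432,
                  max_not_significant_pct=5.0):
    """Apply the two-part criterion described in the text.

    valid:               number of valid comparisons for the test
    not_significant_pct: percentage of valid comparisons that are not
                         significant at the 5% level
    possible:            number of possible comparisons (3432 in Table 5)
    """
    enough_valid = valid >= possible / 2
    consistent = not_significant_pct <= max_not_significant_pct
    return enough_valid and consistent

# Test 4 (CAN/CANNOT): Winter against Pericles and against Atheist.
print(distinguishes(3432, 3.67))   # True
print(distinguishes(3432, 30.71))  # False
# Test 26 (OF fb THIS) against Devil: every valid comparison is
# significant, but fewer than half of the comparisons are valid.
print(distinguishes(1382, 0.0))    # False
# Overall totals against Atheist and against Devil.
print(distinguishes(3432, 1.60))   # True
print(distinguishes(3432, 5.74))   # False
```

The third case shows why the validity condition matters: a test with few occurrences can look decisive on the small subset of comparisons that survive the expected-value rule.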
Notes on Table 5

The Notes on Table 3 apply to Table 5. The total number of words in each of the plays, when elisions are expanded, is: Atheist 20595, Devil 24655, Pericles 17965, Revenger 20223, Winter 24928, Women 25916.

REFERENCES
Chillington, Carol A. (1980). Playwrights at Work: Henslowe's, not Shakespeare's Book of Sir Thomas More. English Literary Renaissance, 10, 439-479.
Greg, W.W. (1911). The Book of "Sir Thomas More," The Malone Society Reprints. (Oxford University Press).
Herdan, G. (1965). [A contribution to a] Discussion on the Paper by Mr. Morton. J. Royal Statistical Soc. A., 128, 229-231.
Hickson, S. (1850). Who wrote Shakespeare's Henry VIII, Notes and Queries 24 Aug. pp. 198; 12 Oct. pp. 306-7; 16 Nov. pp. 401; 18 Jan. 1851, pp. 33-4.
Merriam, T. (1979). What Shakespeare wrote in Henry VIII (Pt. I). The Bard, 2, 81-94.
------ (1980). What Shakespeare wrote in Henry VIII (Pt. II). The Bard, 2, 111-118.
------ (1982). The Authorship of Sir Thomas More. ALLC Bulletin, 10, 1-7.
Morton, A.Q. (1978). Literary Detection (Scribner, New York).
Smith, M.W.A. (1982a). A Stylistic Analysis of Hero and Leander. The Bard, 3, 105-132.
------ (1982b). The Authorship of Pericles: an Initial Investigation. The Bard, 3, 143-176.
------ (1983a). The Authorship of Pericles: Collocations Investigated Again. The Bard, 4, 15-21.
------ (1983b). Stylometry: the Detection of Literary Authorship. BCS Computer Bulletin, Ser II, 35, pp. 8, 9, 11.
------ (1984a). An Investigation of the Basis of Morton's Method for the Determination of Authorship. Accepted by Style, to appear in 1985.
------ (1984b). Critical Reflections on the Determination of Authorship by Statistics. Part 1. Shakespeare, Bacon and Marlowe. The Shakespeare Newsletter, 34:1 (no. 184), pp. 4, 5.
------ (1984c). Critical Reflections on the Determination of Authorship by Statistics. Part 2. Morton, Merriam and Pericles. The Shakespeare Newsletter, 34:3 (no. 186), pp. 28, 33.
Spedding, J. (1850). Who wrote Shakespeare's Henry VIII? Gentleman's Magazine, 177:155-24 and 381-2.