Computers and the Humanities 16 (1982) 107-117 North-Holland Publishing Company
AUTOMATED IDENTIFICATION OF MELODIC VARIANTS IN FOLK MUSIC
MARTIN DILLON and MICHAEL HUNTER
Introduction
The idea of automatically identifying musically related melodies is not new. As early as 1949, Bertrand H. Bronson suggested the use of machine-readable encodings of the initial phrase of songs (the incipit) as an aid in the study of variation in music. Since that time, the increasing acceptance of DARMS (Erickson, 1975) as a system for encoding music opens the tantalizing prospect of a general procedure for automatically identifying musically related melodies. A typical application, and the motive for the research reported here, arises in the repertory of the American shape-note hymnody, a body of music created by and for the singing school tradition, a thriving musical and social feature of rural America in the 18th and 19th centuries (Chase, 1966). Shape-note tunes are the earliest transcription into music notation of widely ranging melodies adapted from oral tradition and later compiled in book form. The repertory raises a number of interesting questions concerning the nature of musical variation, including its precise definition, and thus represents a fine environment for investigating the feasibility of automatic matching of melodic variants. Before explaining a method developed for automatic variant matching in the tunebook repertory, let us first consider two other general approaches to the same problem: the thematic index, especially in the form advocated by Harry B. Lincoln (1968), and the measure of association.
Martin Dillon is an Associate Professor at the School of Library Science, University of North Carolina, Chapel Hill, NC, and Michael Hunter is with Hobart College, Middlesex, NY.
Thematic index
This means of locating a melody when its first few measures are known may, for example, provide additional information about the melody such as its composer or source. In this respect, the thematic index is like a dictionary for words, and like a dictionary, its success depends on the degree to which a melody can be found easily in the index. Thus, entries in such an index must be ordered so that one may search for an individual melody systematically and find it quickly. Different orderings are possible, depending on how a melody is represented. One obvious approach is to represent the notes by their traditional alphabetic equivalents, in which a melody in the key of C for example, would be represented by the letters C, D, E, F, G, A, and B. A collection of melodies can then be alphabetized following rules similar to those used in ordering words in a dictionary. (Some accommodation must be made for chromatic alterations, i.e., for notes falling between those represented by the letters.) If an index is to be used effectively for research into melodic variation, however, this simple strategy is not workable. In these cases, the same melody represented in two different keys, say C and G, would be widely separated. To overcome this deficiency, the most obvious strategy transposes each melody into some standard key, usually C, as for example in Barlow and Morgenstern's Dictionary of Musical Themes. One can carry this strategy further, as does Lincoln in the index
M. Dillon, M. Hunter / Automated identification of melodic variants
mentioned above. Concerned more with the broad contours of a melody than with its precise pitches, Lincoln represents the intervals of a melody rather than its individual notes. Thus, a sequence of notes such as 'C E G C' is converted to the intervals between the notes, that is, '+3 +3 -5'. An ordering of melodies so represented facilitates variant matching by drawing together in the index all melodies that begin with the same interval, regardless of key. Benjamin Suchoff (1968) pursues this approach in the study of variants in the folk song repertory. A thematic index for a collection of melodies, however derived, is limited in its effectiveness as a tool for identifying variants. When variation occurs at the beginning of two melodies, they will be separated in the index no matter how alike they are thereafter.
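The interval representation described above can be sketched in a few lines. This is only an illustration, not Lincoln's actual procedure: the octave numbers, the helper names, and the convention of writing a repeated note as 1 are assumptions made here, and accidentals are ignored.

```python
# Letter names in diatonic order; accidentals are ignored in this sketch.
LETTERS = "CDEFGAB"

def diatonic_position(letter, octave):
    """Number of diatonic steps above C0 (hypothetical helper)."""
    return octave * 7 + LETTERS.index(letter)

def interval_sequence(notes):
    """Convert (letter, octave) pairs into signed diatonic intervals.

    Two scale steps upward is written +3 (a third), following the
    inclusive counting of traditional interval names. A repeated note
    is written 1 (an assumption of this sketch).
    """
    positions = [diatonic_position(letter, octave) for letter, octave in notes]
    intervals = []
    for previous, current in zip(positions, positions[1:]):
        steps = current - previous
        if steps == 0:
            intervals.append("1")
        else:
            sign = "+" if steps > 0 else "-"
            intervals.append(f"{sign}{abs(steps) + 1}")
    return intervals

# The same contour written in C and in G (octave numbers are assumed).
in_c = [("C", 4), ("E", 4), ("G", 4), ("C", 4)]
in_g = [("G", 4), ("B", 4), ("D", 5), ("G", 4)]
print(" ".join(interval_sequence(in_c)))
print(interval_sequence(in_c) == interval_sequence(in_g))
```

Because only the distances between successive notes are kept, the two transpositions produce the same sequence, which is what lets an interval-ordered index bring them together.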
Measure of association
A second general approach to automatic variant matching based on a measure of association between two melodies can overcome this drawback. Wolfram Steinbeck (1976) suggests that the essential properties of a melody, such as its meter, its first and last notes, highest and lowest notes, and others, can be abstracted from a melody and represented in a form suitable to the calculation of such a measure. Individual properties can be weighted to reflect their importance to musical variation, as disclosed by empirical studies. Many formulae exist for reducing the shared property list of two entities to a single value reflecting their similarity, and one of these could then be applied to an appropriately encoded property list of two melodies. (See Nicholas Jardine and Robin Sibson (1971) for a discussion of this and related matters; it is likely that more than one formula would give equivalent results.) The example presented by Steinbeck uses the single property of like notes, after both melodies have been transposed to a standard form. The melodies are aligned, typically starting from the first note; a simple measure might then be the percentage of identical notes in the two melodies.
In contrast to an index, differences at the beginning of the melodies count no more than differences elsewhere. Stressed notes, usually of more import in melodies and in variant melodies, could be weighted to contribute more heavily to the measure; a match on stressed notes could be given a weight of 2, let us say, with a value of 1 counted for a match of unstressed notes. The degree of association between any two melodies would range between 0 and 100%; the higher the score, the more likely that two melodies are variants. (There is more subtlety to the technique as described by Steinbeck, but this should give the basic idea. See Deborah and Philip H. Scherrer (1971) for a somewhat different route to the same goal.) Measures of association have much to recommend them in the automatic identification of melodic variants, but they are unlikely to provide a complete solution. Such techniques lead to results that are interesting but often impractical. Calculating measures of association for thousands of items, for example, is feasible but quite expensive, even using today's computers. A more serious obstacle to their general use, in this and similar applications, is the difficulty in making sense of the results. Cluster analysis, a sophisticated technique for grouping items related through a measure of association, as suggested by Steinbeck, has rarely been used with success on large data bases of any description. In general, the more subtle and numerous the properties on which a measure of association is based, the more difficult their manipulation, comprehension, or in the case of weak or unacceptable results, their improvement. Steinbeck, for example, after posing the problem of determining an appropriate set of musical properties and their effective weighting, a necessary prerequisite in such attempts, does not solve it.
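The simple weighted measure sketched in this section (identical notes counted position by position, with stressed matches weighted 2 and unstressed matches 1) might be computed as follows. This is not Steinbeck's full technique; the function name, the data layout, and the treatment of melodies of unequal length are assumptions of this sketch.

```python
def association(melody_a, melody_b, stressed_weight=2, unstressed_weight=1):
    """Weighted percentage of identical notes in two aligned melodies.

    Each melody is a list of (pitch, is_stressed) pairs, already
    transposed to a common standard form and aligned from the first
    note. Assumptions of this sketch: a position present in only one
    melody counts as a mismatch, and a position is treated as stressed
    if either melody marks it stressed.
    """
    length = max(len(melody_a), len(melody_b))
    if length == 0:
        return 0.0
    matched = total = 0
    for i in range(length):
        a = melody_a[i] if i < len(melody_a) else None
        b = melody_b[i] if i < len(melody_b) else None
        stressed = bool((a and a[1]) or (b and b[1]))
        weight = stressed_weight if stressed else unstressed_weight
        total += weight
        if a is not None and b is not None and a[0] == b[0]:
            matched += weight
    return 100.0 * matched / total

# Two hypothetical four-note melodies that agree on both stressed notes.
tune_a = [("C", True), ("D", False), ("E", True), ("F", False)]
tune_b = [("C", True), ("E", False), ("E", True), ("G", False)]
print(round(association(tune_a, tune_b), 1))
```

Every pair of melodies in a file needs its own call, which illustrates why the cost grows quickly for collections of thousands of tunes.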
Automated identification of melodic variants
Thus, each of the two major techniques for automatic variant matching - the thematic index and the measure of association - is deficient for a different reason. The thematic index is too rigid and fails to group variants when the variation occurs
in the beginning measures of a melody. Measures of association depart so far from standard musical expression that their comprehension is difficult, and they offer no straightforward means for a general implementation. The method outlined below for variant matching attempts to remedy these flaws. As in the thematic index, melodic properties are represented in recognizable form; as with the measure of association, problems of position that reduce the effectiveness of the index are largely overcome. To assure ease of implementation and generality, two additional constraints were imposed on its design. First, only those musical properties are used which can be derived automatically from DARMS encoded music. Second, the grouping operation itself is designed to exploit the facilities normally available in bibliographic processing systems.
This latter point must be elaborated on. In a bibliographic processing system, the file normally comprises records similar to the entries in a library catalog, from which a patron can retrieve items about topics of interest. DIALOG, a retrieval service made available by Lockheed, and MEDLINE by the National Library of Medicine, are typical. Both can be accessed by patrons through national telecommunications networks. To use such systems, patrons designate to the system a topic in a Boolean language designed for retrieving items from files. A simple example will give the basic idea. A patron might be interested in items dealing with the topic "Canons or Fugues by Bach". To formulate this request as a Boolean expression, one joins its components with the logical connectors AND and OR: Bach AND (Fugue OR Canon). In response to this command, the system compares the topic description of each item in its file with the Boolean expression and retrieves those which conform to it. In the example, any item containing a treatment of Bach, combined with either the fugue or canon, would be retrieved.
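The Boolean request in the example can be illustrated with a toy file of records. The records, their identifiers, and the representation of a topic description as a set of index terms are all hypothetical; real services such as DIALOG or MEDLINE have their own query languages, which are not reproduced here.

```python
# A hypothetical file: each record carries an identifier and a set of
# index terms standing in for its topic description.
records = [
    {"id": "item-1", "terms": {"Bach", "Fugue"}},
    {"id": "item-2", "terms": {"Bach", "Canon", "Variation"}},
    {"id": "item-3", "terms": {"Bach", "Prelude"}},
    {"id": "item-4", "terms": {"Buxtehude", "Fugue"}},
]

def conforms(terms):
    """Evaluate the expression: Bach AND (Fugue OR Canon)."""
    return "Bach" in terms and ("Fugue" in terms or "Canon" in terms)

# Compare the expression with every record and keep those that conform.
retrieved = [record["id"] for record in records if conforms(record["terms"])]
print(retrieved)
```

Items 3 and 4 are rejected for different reasons: the first lacks both form terms, the second lacks the composer term.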
In brief, we propose an encoding scheme that represents the properties of a tune in a form that allows them to be treated like the components of the topic in the example above. Given a collection of tunes, and some tune for which one seeks melodic variants, one constructs a set definition that combines the properties essential to a tune for it to be a
variant. The set definition is then automatically matched against the collection, and the tunes that match the definition are retrieved by the system as variants of the original. The advantages to this approach are many. (1) The language used to express melodic variation - the Boolean query - is straightforward and easy to understand. (2) The software for carrying out retrieval operations is available generally. (3) The retrieval mechanism of a bibliographic system is only one function among many suitable for carrying out research on a musical repertory: variant matching can be one of many lines of scholarly inquiry for which an appropriately structured data base could be used.
The research on melodic variation in the tunebook repertory is limited; Bronson (1949) and Bayard (1961), however, have established typical forms of variation in Anglo-American folk song. The elements contributing most to melodic variation are rhythm and meter, pitch, and phrasing. The following discussion will explore the degree to which these elements are useful in identifying melodic variants by the method described here.
Rhythm and meter
Rhythm is generally the most flexible element in Anglo-American vocal folk music. In adapting tunes to fit both traditional and newly-written texts, a composer often alters the rhythm in a variety of ways, including repetition or elision of individual notes for free variation of the rhythm of a complete phrase. This practice reduces the usefulness of rhythm and musical meter as characteristics common to variant melodies. As a consequence, perhaps, Bronson does not include rhythm as an element of potential importance in relating tune variants. Bayard, while discussing rhythm as a source of variation in melodies, does not treat rhythm as a common element in variants of the same tune.
Pitch
Melodic adaptation and permutations produce changes in pitch between a melody and its variants,
though these changes occur to a lesser degree than with rhythm. Bayard has concluded that, among Anglo-American folk melodies, variation tends to occur at the very beginning of a phrase, at the very end of a phrase, or at the midpoint cadence if one is present (this last point is disputed by Bronson). Although such generalizations can never be conclusive, they suggest a need for identifying, as variants, melodies with identical mid-phrase measures but with variations in initial, midcadence or final measures. Bronson and Bayard advocate rhythmically stressed pitches as a means for validly relating variant tunes differing in pitch. Rhythmically stressed pitches can occur at the beginning and, in certain instances, at the midpoint of a measure. A stressed pitch, as the term is employed here, expands on the definition of the term used by the National Tune Index (Keller, 1978): specifically, stressed pitches occur (1) on the first beat of a measure, regardless of meter signatures, (2) at mid-measure in C, 4/4, 4/8 and 6/8 measures. (Pitches held over a stressed beat, i.e., syncopes, are not considered stressed.) The melodies in Fig. 1(a) illustrate variant melodies grouped through stressed pitches. The identification of variants based on patterns of stressed pitches allows for commonly encountered variations in related melodies, such as repeated pitches, ornamental or 'filler' pitches, and other variation occurring on unstressed beats. However, differences in the transcription of these melodies can create problems. Melodies in the same basic meter type (duple or triple) are often notated differently, or recast from one basic type to the other. The melodies in Fig. 1(b) illustrate this problem. Variant identification on the basis of measure-based stressed pitches alone would not collocate these variants.
Fig. 1. Stressed pitches in variant melodies. (a) "Supplication", p. 5, HH, and "Supplication", p. 45, SAH; stressed pitches of each: E A A G B E etc. (b) "Bridgewater", p. 276, OSAH, and "Bridgewater", p. 92, CAS.
1 See "Key to anthology abbreviations" at p. 116 for the sources of the tunes.
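The definition of a stressed pitch given above (first beat of every measure, plus mid-measure in C, 4/4, 4/8 and 6/8, with syncopes excluded) can be expressed as a short routine. The data layout, the function names, and the assumption that every measure is complete (no pickup measures) are choices made for this sketch only.

```python
from fractions import Fraction

# Meters in which the mid-measure position also counts as stressed.
MID_MEASURE_METERS = {"C", "4/4", "4/8", "6/8"}

def measure_length(meter):
    """Length of one measure as a fraction of a whole note ('C' is 4/4)."""
    if meter == "C":
        meter = "4/4"
    numerator, denominator = meter.split("/")
    return Fraction(int(numerator), int(denominator))

def stressed_pitches(measures, meter):
    """Return the stressed pitches of each measure.

    `measures` is a list of measures; each measure is a list of
    (pitch, duration) pairs, with duration a fraction of a whole note.
    A pitch is stressed when it begins on the first beat or, in the
    listed meters, exactly at mid-measure. A note held across the
    midpoint begins earlier, so a syncope is never counted.
    """
    half = measure_length(meter) / 2
    result = []
    for measure in measures:
        onset = Fraction(0)
        stressed = []
        for pitch, duration in measure:
            if onset == 0 or (meter in MID_MEASURE_METERS and onset == half):
                stressed.append(pitch)
            onset += Fraction(duration)
        result.append(stressed)
    return result

Q = Fraction(1, 4)  # quarter note
H = Fraction(1, 2)  # half note
melody = [
    [("E", Fraction(1))],                      # one sustained pitch
    [("A", Q), ("A", Q), ("B", Q), ("C", Q)],  # first and third beats stressed
    [("G", Q), ("A", H), ("B", Q)],            # A is held over the midpoint
]
print(stressed_pitches(melody, "4/4"))
```

The third measure shows the syncope rule at work: only G is reported, because no note begins at mid-measure.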
The musical phrase
Bayard has found strong evidence of the migration of entire phrases from one melody to another, especially those adapted from dance or fiddle tunes. The phrases do not necessarily recur in the same position relative to the rest of a melody, but may be found at any point. The major difficulty involved in identifying variants that contain identical phrases or phrases with identical stressed pitches is one of musical judgment. Where do phrases begin and end? Although most tunebook melodies have relatively distinct musical phrases, many ambiguities do exist. For example, the phrase structure of the text may conflict with the integrity of the musical phrases. Frequently a phrase ending is elided by the beginning of the next phrase. In many nonrepeating (through-composed) melodies, phrase divisions are quite obscure. Although many of these ambiguities can be resolved by establishing arbitrary guidelines, the division of a melody into phrases is still a matter of musical judgment and personal interpretation. For our purposes, textual phrase divisions take precedence over musical phrase divisions.
In the tunebook repertory, pitch and musical phrase are especially useful in identifying variants. Translating melodies into a machine-readable form should maximize variant matching based on these properties and yet adequately distinguish individual melodies from one another. The following
capabilities are included in the system:
(1) Matching melodic material regardless of its key or meter.
(2) Matching melodies or phrases with identical or similar patterns of stressed pitches.
(3) Matching discrete phrases regardless of where they may appear in melodies.
(4) Matching variants through combinations of any or all of these capabilities.
Basic representation scheme
Two melodies identical in every respect except the key in which they are written clearly must be considered variants, and the first problem of any encoding scheme for matching variants is to eliminate such differences. An obvious solution is to represent the key in which a melody is written separately from its sequence of pitches, using a standard form for the pitches. Manually produced indexes such as Barlow and Morgenstern's Dictionary of Musical Themes, as well as machine-readable systems such as that of the National Tune Index, accomplish this end by transposing all melodies into a common key, usually C major, or A or C minor. These systems do not, however, lend themselves to automated matching of modal melodies or to melodies varying in the 6th and 7th degrees of the minor scale. The version of the tune 'New Jordan' given in Fig. 2 illustrates this shortcoming. The transposition likely to be found in a manually prepared index is:
E A A B C B A G E G G (SAH version)
E A A Bb C Bb A G E G G (UH version)
The National Tune Index DOREMI version is:
3 6 6 7 1 7 6 5 3 5 5 (SAH version)
7 3 3 4 5 4 3 2 7 2 2 (UH version)
(DOREMI is a digital translation where: 1 = tonic of the major key, 6 = tonic of the minor key, 2 = final of dorian mode, 3 = final of phrygian mode, etc.) These two melodies are identical except for mode, and encoding them in either of these schemes complicates the task of recognizing them as variants. Both modal and tonal tunebook melodies are largely diatonic, consisting of pitches that can be represented by the solfege syllables usually associated with the seven steps of a major scale. The basic representation in this project assigns digits to these syllables as in Fig. 3.
Fig. 2. Melodies differing in mode. (a) "New Jordan", p. 442, OSH; (b) "New Jordan", p. 37, UH. The key signature of three sharps given in (b) may be a typographical error in the source.
Fig. 3. The basic representation of tunes. (a) The tones of the diatonic scale: 1 2 3 4 5 6 7 (1), a digital transcription of the sol-fa syllables; (b) tones above and below the octave (major scale): L7 1 2 3 4 5 6 7 U1; (c) the harmonic minor (minor scale): L7 1 2 3 4 5 6 7 U1; (d) the representation of modal scales (Aeolian, Phrygian and Dorian modes), each L7 1 2 3 4 5 6 7 U1.
Thus, the octave in which most of the melody occurs is represented by the digits 1 to 7, as in Fig. 3(a). Notes below this octave are represented by a capital L before a digit; similarly, a capital U before a digit indicates a note in the octave above,
as in Fig. 3(b). Minor melodies often exhibit variations in the pitches represented by 6 and 7. Of the three possible forms of the minor scale, the one chosen here as standard is the harmonic minor as in Fig. 3(c). The representations for the three most important modal scales are as in Fig. 3(d). In this notation both versions of 'New Jordan' would be encoded as follows:
L5 1 1 2 3 2 1 L7 L5 L7 L7
Pitches sometimes occur outside this spectrum (chromatic alterations). For a minor loss in accuracy - one might argue that F sharp is not the exact equivalent of G flat, especially in music that is primarily sung - standardizing chromatic alterations eases the task of defining a set of variants. Thus, in this system, all such alterations are notated as a flat (F) of the appropriate pitch. (See Example 2 below in the discussion of matching operations, where a C sharp in the key of E minor is coded as F7.)
This encoding scheme eliminates differences among melodies that depend only on key or mode and provides a basis for a set of five partial representations of a melody, each designed to capture one aspect of a melody in a form usable for variant matching operations. Of the five partial representations, two are required for matching melodies based on their pitches - one incorporating measure divisions, the other omitting them. Two are required for matching based on stressed pitches alone, similarly divided into those with measures and those without. Finally, one representation is used for matching based on the phrases of a melody. Each partial representation is described below with a comment on its functions in matching variant melodies. Examples are based on the melody for 'New Jordan' in Fig. 2.
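A minimal sketch of the basic digit encoding follows. It assumes that notes arrive as letter names with octave numbers and that the tonic or modal final at the bottom of the main octave is known; the function name and the octave numbers in the example are hypothetical, and chromatic alterations (the F prefix) are deliberately left out.

```python
LETTERS = "CDEFGAB"

def encode(notes, final):
    """Encode (letter, octave) pairs relative to a tonic or modal final.

    `final` is the (letter, octave) of the tonic or final lying at the
    bottom of the octave in which most of the melody occurs. Only letter
    names are used, so accidentals belonging to the key or mode are
    ignored; chromatic alterations are not handled in this sketch.
    """
    base = final[1] * 7 + LETTERS.index(final[0])
    symbols = []
    for letter, octave in notes:
        steps = octave * 7 + LETTERS.index(letter) - base
        # Floor division places notes below the final in register -1.
        register, degree = divmod(steps, 7)
        if register not in (-1, 0, 1):
            raise ValueError("note lies beyond the octaves covered by L and U")
        prefix = {-1: "L", 0: "", 1: "U"}[register]
        symbols.append(f"{prefix}{degree + 1}")
    return symbols

# 'New Jordan' as transposed in the text (E A A B C B A G E G G) with
# final A; the octave numbers are assumed for illustration.
new_jordan = [("E", 4), ("A", 4), ("A", 4), ("B", 4), ("C", 5), ("B", 4),
              ("A", 4), ("G", 4), ("E", 4), ("G", 4), ("G", 4)]
print(" ".join(encode(new_jordan, ("A", 4))))
```

Since the B and B flat of the two versions share a letter name, both reduce to the digit 2 here, which is the effect the scheme aims for when it sets differences of mode aside.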
The representation of pitches
In order to avoid matching melodies whose pitches are identical but which differ metrically, or to use selected measures for matching two melodies, indications of measure divisions in one representation of the melody are necessary. The device used here is the inclusion of the number of a measure at its beginning, preceded and followed by hyphens. Thus, the start of the first measure is indicated by '-1-', the second by '-2-', and so on. The representation of 'New Jordan' in this form would be:
(1) -1- L5 -2- 1 1 2 3 -3- 2 1 L7 L5 L7 L7
To match melodies, identical or variant, cast in various meters or cast differently in the same meter, a representation of pitches without measure divisions is required. In this form, the encoding for 'New Jordan' would be:
(2) L5 1 1 2 (and continuing as in (1) without measure divisions)
Encodings (1) and (2) differ only with respect to the inclusion or absence of measure designations and allow the full melody or portions of a melody to be specified, either using measures or not, depending on the nature of the variant melodies one wishes to match.
The representation of stressed pitches
Matching operations based on metrically stressed pitches have the same requirements as those using the complete pitch representation, and they are designed in the same way. Stressed pitches measured in (3) and stressed pitches unmeasured in (4) below illustrate these representations for 'New Jordan':
(3) -1- L5 -2- 1 2 -3- 2 L7
(4) L5 1 2 2 L7
In 'New Jordan' the metrically stressed pitches occur on the first and third beats of each measure, in accordance with the interpretation followed in this system. The opening measure, for example, contains a single sustained pitch and is represented here by a single stressed pitch (first beat).
The representation of phrases
Matching musical phrases requires that the entire melody be represented, with phrases designated in some way within the melody. The usefulness of measure designations in representing phrases is uncertain. For a repertory as homogeneous as that of the tunebooks, it is unlikely that
two phrases containing the same pitches would be unrelated; but without tests on a large data file, the consequence of ignoring measure divisions within phrases is also uncertain. Provisionally, the phrases of a melody are represented as a series of unmeasured pitches, separated by '/'. The representation for 'New Jordan' is as follows:
(5) /L5 1 1 2 3 2 1 L7 L5 L7/L7
Summary of encodings
In the examples below, we refer to these five encodings with the following mnemonics:
(1) PTM = pitches with measures;
(2) PTU = pitches unmeasured;
(3) STM = stressed pitches with measures;
(4) STU = stressed pitches unmeasured;
(5) PTP = pitches with phrases.
The matching operations themselves, described below, may make use of any of these elements, either singly or in combination, to define a set of acceptable variations. Each tune in the collection is compared with this definition; a successful match implies that the tune is likely to be a variant of the original.
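The five fields can all be derived from one measured, stress-marked melody, as the following sketch suggests. The input layout, the function name, and the use of spaces between symbols are assumptions made here; the stress flags and the phrase break in the sample data were chosen so that the output reproduces the 'New Jordan' encodings printed in the text.

```python
def encodings(measures, phrase_ends):
    """Build the PTM, PTU, STM, STU and PTP fields for one melody.

    `measures` is a list of measures, each a list of
    (symbol, is_stressed) pairs; `phrase_ends` holds the 0-based note
    indices at which a phrase ends. Symbols are separated by spaces.
    """
    ptm, ptu, stm, stu = [], [], [], []
    for number, measure in enumerate(measures, start=1):
        ptm.append(f"-{number}-")
        stm.append(f"-{number}-")
        for symbol, is_stressed in measure:
            ptm.append(symbol)
            ptu.append(symbol)
            if is_stressed:
                stm.append(symbol)
                stu.append(symbol)
    # Split the unmeasured pitches into phrases for the PTP field.
    phrases, current = [], []
    for index, symbol in enumerate(ptu):
        current.append(symbol)
        if index in phrase_ends:
            phrases.append(" ".join(current))
            current = []
    if current:
        phrases.append(" ".join(current))
    return {
        "PTM": " ".join(ptm),
        "PTU": " ".join(ptu),
        "STM": " ".join(stm),
        "STU": " ".join(stu),
        "PTP": "/" + "/".join(phrases),
    }

# Stress flags and the phrase break are assumed for illustration.
new_jordan = [
    [("L5", True)],
    [("1", True), ("1", False), ("2", True), ("3", False)],
    [("2", True), ("1", False), ("L7", True), ("L5", False),
     ("L7", False), ("L7", False)],
]
fields = encodings(new_jordan, phrase_ends={9})
for name, value in fields.items():
    print(name, value)
```

Keeping every field as a plain string of symbols is what allows each one to be searched like the topic description of a catalog record.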
Matching operations
The basis of the matching operation is described above. Here we describe briefly how the logic of Boolean expressions may be used in conjunction with the five encodings to define a set of variations. Let us suppose that we wish to identify variants of 'Kedron'. (See Example 2, below.) We wish to match any melody that is identical to this source, but we also wish to match melodies differing from it only in half-step differences in the sixth and seventh scale degrees. To define such a set, we first specify the exact match required for the measured pitch representations of any tune in the PTM field as follows:
(1) -1- 3 2 1 1 -2- 5 5 -3- 4 3 2 -4- 3 2 1 1 -5- U1 U2 U1
We specify as well the allowable variations in
measure 6, also in the PTM field, as follows:
(2) -6- 7 6 5
or
(3) -6- 7 F7
The set of acceptable variants will be interpreted as any melody whose PTM representation contains (1) above, and in addition, either (2) or (3). Matching is carried out by comparing this definition to the PTM encoding of each melody in the file. For a match to be successful, a melody must be identical to the model through the first five measures, and contain either (2) or (3) in measure six. Any melody so matched would be listed along with its source. Though the degree of subtlety these definitions allow in describing melodic variations is limited (the limitations will be discussed briefly in the conclusion), the potential for great flexibility and gradations of refinement should prove useful in searching large data bases for melodic variations. In the next section we show how this flexibility is achieved.
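The 'Kedron' set definition, (1) and ((2) or (3)), can be tried out on a toy file. The sketch below treats each encoding as a sequence of space-separated symbols and tests for contiguous containment; that matching rule, the function names, and the four sample encodings are assumptions for illustration and do not describe any particular retrieval system.

```python
def tokens(encoding):
    """Split an encoding into symbols so that spacing does not matter."""
    return encoding.split()

def contains(field, pattern):
    """True if the symbols of `pattern` occur contiguously in `field`."""
    f, p = tokens(field), tokens(pattern)
    return any(f[i:i + len(p)] == p for i in range(len(f) - len(p) + 1))

# The three parts of the definition, all applied to the PTM field.
PART_1 = "-1- 3 2 1 1 -2- 5 5 -3- 4 3 2 -4- 3 2 1 1 -5- U1 U2 U1"
PART_2 = "-6- 7 6 5"
PART_3 = "-6- 7 F7"

def is_variant(ptm):
    """Evaluate (1) and ((2) or (3)) against one PTM encoding."""
    return contains(ptm, PART_1) and (contains(ptm, PART_2) or contains(ptm, PART_3))

# A hypothetical file of PTM encodings keyed by illustrative labels.
collection = {
    "tune-a": PART_1 + " -6- 7 6 5",
    "tune-b": PART_1 + " -6- 7 F7 5",
    "tune-c": PART_1 + " -6- 5 4 3",
    "tune-d": "-1- 1 -2- 5 5 -3- 3 3 -4- 1 2 3 -5- 2 2",
}
variants = [label for label, ptm in collection.items() if is_variant(ptm)]
print(variants)
```

Widening or narrowing the set is then a matter of editing the parts and the Boolean combination, with no change to the matching routine itself.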
Examples
The following examples have been selected for their usefulness in defining important sets of variations and for demonstrating the range of possibilities available:
(1) Exact match using pitches with measures (PTM field).
(2) Variations in pitch using pitches with measures (PTM field).
(3) Exact match using unmeasured pitches (PTU field).
(4) Exact match using unmeasured stressed pitches (STU field).
(5) Variation in the first measure, using pitches with measures and stressed pitches with measures (PTM and STM fields).
(6) Variation throughout, using pitches with measures and stressed pitches with measures (PTM and STM fields).
(7) Matching melodies by key phrase using pitches with phrases (PTP field).
Each of the examples first presents a melody as the basis for a definition of a variant set. It is given in the form of its original source, and in its encoded form for the fields used for defining a set of variants. Next, the set definition is given as derived from the relevant portion(s) of the model melody. Where more than one aspect of the melody must be combined to represent a variant set accurately, each is numbered and the combination is expressed using those numbers. For example, '1 and (2 or 3)' as in Example 2 is to be interpreted to mean that a variant must exactly match item (1), and must also contain either of items (2) or (3). The third part of each example gives a possible result of the matching operation, listing the melody and its musical notation, a source if there is one, and the encoded portions of the melody responsible for its having matched the set definition. An explanatory comment follows the example.

Example 1. Exact match using pitches with measures.
Model: ("Mear", from SAH, p. 49)
PTM field: -1- 1 -2- 5 5 -3- 3 3 -4- 1 2 3 -5- 2 2
Definition of variant set:
-1- 1 -2- 5 5 -3- 3 3 -4- 1 2 3 -5- 2 2 (PTM)
Sample result: ("Mear", from OSAH, p. 49)
PTM field: -1- 1 -2- 5 5 -3- 3 3 -4- 1 2 3 -5- 2 2
Comment: Matched melodies would include versions set in any major or minor key or any mode, as well as those varying rhythmically.

Example 2. Variations in pitch using pitches with measures.
Model: ("Kedron", from OSAH, p. 43)
PTM field: -1- 3 2 1 1 -2- 5 5 -3- 4 3 2 -4- 3 2 1 1 -5- U1 U2 U1 -6- 7 6 5
Definition of variant set:
(1) -1- 3 2 1 1 -2- 5 5 -3- 4 3 2 -4- 3 2 1 1 -5- U1 U2 U1 (PTM)
(2) -6- 7 6 5 (PTM)
(3) -6- 7 F7 5 (PTM)
(1 and (2 or 3))
Sample result:
PTM field: -1- 3 2 1 1 -2- 5 5 -3- 4 3 2 -4- 3 2 1 1 -5- U1 U2 U1 -6- 7 F7 5
Comment: This example shows how possible variations in the 6th and 7th scale degrees encountered in minor and modal melodies can be handled in this system.

Example 3. Exact match using unmeasured pitches.
Model: ("Russia", from OSH, p. 107)
PTU field: 1 3 2 1 L7 1 3 2 2 5 5 4 3 2 2 1
Definition of variant set:
1 3 2 1 L7 1 3 2 2 5 5 4 3 2 2 1 (PTU)
Sample result: ("Russia", from DUL, p. 25)
PTU field: 1 3 2 1 L7 1 3 2 2 5 5 4 3 2 2 1
Comment: Matched melodies would include versions set in any major or minor key or mode, thus accounting for the commonly occurring variations in the 7th scale degree, as well as those which vary metrically.

Example 4. Exact match using unmeasured stressed pitches.
Model: ("Supplication", from HH, p. 5)
STU field: L5 1 1 L7 2 L5 (etc.)
Definition of variant set:
L5 1 1 L7 2 L5 (STU)
Sample results: ("Supplication", from KNH, p. 12); ("Supplication", from SAH, p. 45)
STU of both: L5 1 1 L7 2 L5 (etc.)
Comment: Variations included in the results of this operation incorporate added pitches (passing tones), metrical recasting and transposition from mode (Aeolian) to key (a minor).

Example 5. Variation at the beginning of a melody.
Model: ("Columbus", from SOCH, p. 109)
PTM field: -1- 1 -2- 3 3 3 2 1 -3- 3 3 2 1 L6 -4- 1 2 3 2 1 L6 -5- L6
STM field: -1- 1 -2- 3 3 -3- (etc.)
Definition of variant set:
(1) -2- 3 3 3 2 1 -3- 3 3 2 1 L6 -4- 1 2 3 2 1 L6 -5- L6 (PTM)
(2) -1- 1 (STM)
(1) and (2)
Sample result:
PTM field: -1- 1 2 -2- 3 3 3 2 1 -3- 3 3 2 1 L6 -4- 1 2 3 2 3 L6 -5- L6
STM field: -1- 1 -2- (etc.)
Comment: The common practice of adding pitches to an upbeat figure is allowed for in this operation.

Example 6. Variation at the beginning, middle and end.
Model: ("Bridgewater", from OSAH, p. 276)
PTM field: -1- 1 3 1 -2- 2 2 -3- 1 L7 1 -4- 3 2 1 -5- 4 3 -6- 2 1 3 2
STM field: -1- 1 3 -2- 2 2 -3- 1 1 -4- 2 -5- 4 3 -6- 2 2
Definition of variant set:
(1) -2- 2 2 (PTM)
(2) -4- 3 2 1 -5- 4 3 (PTM)
(3) -1- 1 3 (STM)
(4) -3- 1 1 (STM)
(5) -6- 2 2 (STM)
(1 and 2) and (3 and 4 and 5)
Sample results:
PTM field: -1- 1 2 3 1 -2- 2 2 -3- 1 U6 1 -4- 3 2 1 -5- 4 3 -6- 2 1 2
STM field: -1- 1 3 -2- 2 2 -3- 1 1 -4- 2 -5- 4 3 -6- 2 2
Comments: The intent here is to restrict variants to exact sequences of pitches in measures 2, 4 and 5, but to allow variations of pitch to occur in measures 1, 3 and 6 so long as they are stressed the same as the source melody.

Example 7. Matching melodies by key phrase.
Model: ("Good Shepherd", from HH, p. 207)
PTP field: ... /5 6 7 U1 7 6 5 6 3/
Definition of variant set:
/5 6 7 U1 7 6 5 6 3/ (PTP)
Sample result: ("Return Angel", from OSAH, p. 335)
PTP: ... /5 6 7 U1 7 6 5 6 5 3/ ...
Comment: This operation illustrates the matching of discrete phrases regardless of their position in melodies. In this variant definition the pitch pattern was truncated to allow for possible variation at the phrase end. The use of stressed pitches (STU) would yield a wider range of variation in results.

Conclusion
The approach recommended here must remain provisional until it can be tested on a large file of suitably encoded music. Without speculating about its effectiveness, it is useful to consider how an evaluation might proceed. The matching operation is modeled on general retrieval systems, and two evaluative criteria have been found useful for such systems. The first is a measure of the precision of a retrieval. In this context, a retrieval results in those tunes that satisfy a set definition based on an original tune, and the precision would be the proportion of tunes retrieved by the system that are true variants of the original. The second is a measure of the recall of a retrieval. For a set definition, the recall would be the proportion of true variants retrieved of all those that ought to have been retrieved. Precision measures the accuracy of a response; recall measures the completeness of a response. The system must perform satisfactorily in both areas to be truly effective. Ideally, retrieval for a given set definition would include all and only those tunes that were variants of the original. The boundary between a tune and its true variants, however, and tunes that are similar in one or another respect but are not variants, is so obscure that an expert consensus could rarely be reached on
what constitutes the full set of variants for any tune. To the degree that matching operations pinpoint such differences of judgment and shed light on them, our understanding of what constitutes the language of musical variation will be extended and refined.
A major impetus for developing this system is the existence of a machine-readable file of over 6 000 tunes from the American religious tunebook repertory. The file includes information on the composer, the tune name and the first line of text for each tune. Obtaining suitable encodings of these tunes for automatic matching of variants is a two-step operation. To avoid repetitive data input, and to have available in the file an exact representation of the tune, the first step encodes each tune in DARMS, which maintains differences in key, mode, rhythm and meter, and thus cannot be used directly in matching operations. The second step therefore converts the DARMS representation to the five melodic properties detailed above. An experimental system for carrying out this conversion is now being tested for application to the full file. If successful, the method could be applied to other DARMS-encoded files as well. Experience in variant matching with such files might well disclose weaknesses and suggest modifications in the encodings described above. It may be, for example, that an additional field incorporating interval sequences such as were used by Lincoln would improve variant matching. Such a possibility is one among many that we hope to explore.
Critics, cultural historians and aestheticians have all sought to define the particular form of creativity that manifests itself in artistic alteration and elaboration. Ideally, a system for furthering these investigations in the realm of melodic variation should not only be capable of identifying sets of melodies that are variants - an ambitious goal in itself - but should also aid in illuminating the musical properties of such a set.
In the system described here, an attempt has been made to achieve both of these goals. Its success in automatically recognizing variants can be determined by experimentation with a suitable data base. We also expect that such experimentation will shed some light on the question of what constitutes melodic variation.
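As an illustration of the two evaluative criteria discussed above, precision and recall for a single retrieval might be computed as in the following minimal sketch. The function name precision_recall and the tune identifiers are hypothetical and are not part of the system described here. The sketch also assumes that an agreed set of true variants is available for the original tune, which, as noted, is difficult to establish in practice, and it adopts the convention (an assumption) that a measure is 0.0 when its denominator is empty.

```python
def precision_recall(retrieved, true_variants):
    """Return (precision, recall) for one retrieval.

    retrieved     -- identifiers of tunes returned for a set definition
    true_variants -- identifiers of tunes judged to be true variants
    Both arguments are hypothetical inputs used only for illustration.
    """
    retrieved = set(retrieved)
    true_variants = set(true_variants)

    # Tunes that were both retrieved and judged to be true variants.
    hits = retrieved & true_variants

    # Precision: proportion of retrieved tunes that are true variants.
    # Assumed convention: 0.0 when nothing was retrieved.
    precision = len(hits) / len(retrieved) if retrieved else 0.0

    # Recall: proportion of true variants that were retrieved.
    # Assumed convention: 0.0 when there are no true variants.
    recall = len(hits) / len(true_variants) if true_variants else 0.0

    return precision, recall


# Invented example: four tunes retrieved, three true variants,
# two of which appear among the retrieved tunes.
retrieved = {"tune_a", "tune_b", "tune_c", "tune_d"}
true_variants = {"tune_a", "tune_b", "tune_e"}

precision, recall = precision_recall(retrieved, true_variants)
print("precision:", precision)
print("recall:", recall)
```

In this invented case two of the four retrieved tunes are true variants, and two of the three true variants were retrieved, so a looser set definition would be expected to raise recall at some cost in precision.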
Acknowledgements
The authors wish to acknowledge the many contributions to the research reported here of Professor Daniel Patterson of the University of North Carolina and Margaret Lospinuso, Librarian, of the Music Library, University of North Carolina. Ms. Lospinuso, who directs the ongoing project for automating the shape-note repertory, was instrumental in designing and implementing the system for converting DARMS-encoded music to the forms required here for retrieval and has been especially helpful.
References
Barlow, Harold and Sam Morgenstern (1975). A Dictionary of Musical Themes (Crown Publishers, New York).
Bayard, Samuel P. (1961). Prolegomena to a study of the principal melodic families of folk songs, in: MacEdward Leach and Tristram P. Coffin, Eds., The Critics and the Ballad (SIU Press, Carbondale, IL).
Bronson, Bertrand H. (1949). Mechanical help in the study of folk songs, Journal of American Folklore 62, 81-86.
Chase, Gilbert (1966). America's Music from the Pilgrims to the Present, 2nd revised edition (McGraw-Hill, New York).
Erickson, Raymond F. (1975). The DARMS project: A status report, Computers and the Humanities 9, 291-298.
Jardine, Nicholas and Robin Sibson (1971). Mathematical Taxonomy (John Wiley, London).
Keller, Kate Van Winkle and Carolyn Rabson (1971). The National Tune Index Encoding Manual, Typewritten (Potsdam, New York).
Lincoln, Harry B. (1968). Development of Computerized Techniques in Music Research with Emphasis on the Thematic Index (U.S. Office of Education, Bureau of Research, Washington, DC).
Scherrer, Deborah K. and Philip H. Scherrer (1971). An experiment in the computer measurement of melodic variation in folksong, Journal of American Folklore 84, 230-241.
Steinbeck, Wolfram (1976). The use of the computer in the analysis of German folksongs, Computers and the Humanities 10, 287-296.
Suchoff, Benjamin (1968). Computerized folk song research and the problem of variants, Computers and the Humanities 2, 155-158.
Key to anthology abbreviations
CAS - Carmina Sacra Enlarged, by Lowell Mason (F.J. Huntington and Mason, New York, 1852).
DUL - The Dulcimer, by I.B. Woodbury (F.J. Huntington and Mason, New York, c.1850).
HH - The Hesperian Harp, by William Houser (T.K. and P.G. Collins, Philadelphia, PA, 1848).
HS - The Harmonia Sacra, by Edward L. White and J.E. Gould (Ditson, Boston, MA, 1851).
KNH - The Kentucky Harmony, by Ananias Davisson (Harrisonburg, VA, 1820).
OSAH - Original Sacred Harp, compiled by the Committee on Revision of the Sacred Harp of the United Sacred Harp Musical Association (Ruralist Press, Atlanta, 1929, c.1911).
SAH - The Sacred Harp, by B.F. White, E.J. King (and D.P. White) 4th ed. (C.P. Byrd, Atlanta).
SOCH - The Social Harp, by John G. McCurry (S.C. Collins, Philadelphia, PA, 1859).
UH - The Union Harmony, selected by George Hendrickson (Funk, Mountain Valley, VA, 1848).