Psychonomic Bulletin & Review 2008, 15 (6), 1035-1053 doi:10.3758/PBR.15.6.1035
THEORETICAL AND REVIEW ARTICLES
Semantic processing in “associative” false memory
C. J. BRAINERD, Y. YANG, AND V. F. REYNA
Cornell University, Ithaca, New York
M. L. HOWE
Lancaster University, Lancaster, England
AND B. A. MILLS
Cornell University, Ithaca, New York
We studied the semantic properties of a class of illusions, of which the Deese/Roediger–McDermott (DRM) paradigm is the most prominent example, in which subjects falsely remember words that are associates of studied words. We analyzed DRM materials for 16 dimensions of semantic content and assessed the ability of these dimensions to predict interlist variability in false memory. For the more general class of illusions, we analyzed pairs of presented and unpresented words that varied in associative strength for the presence of these same 16 semantic properties. DRM materials proved to be exceptionally rich in meaning, as indexed by these semantic properties. Variability in false recall, false recognition, and backward associative strength loaded on a single semantic factor (familiarity/meaningfulness), whereas variability in true recall loaded on a quite different factor (imagery/concreteness). For word association generally, 15 semantic properties varied reliably with forward or backward association between words. Implications for semantic versus associative processing in this class of illusions, for dual-process theories, and for semantic properties of word associations are discussed.
Some years ago, Deese (1959) made a theoretical proposal that has become a stalking horse in the continuing debate between associative and semantic approaches to memory. His proposal was an attempt to explain why normed values of word association are positively correlated with errors on a laboratory memory illusion. The illusion involved 36 cue words (e.g., dark) that Deese had sampled from the Russell and Jenkins (1954) norms and had used to generate a pool of lists, each of which consisted of the first 12 forward associates of one of the cue words (e.g., light, night, room, black, etc.). When subjects performed free recall after studying one of these lists, the intrusion rate for unpresented cue words, called critical distractors or critical lures nowadays, was high: 24% on average. There was also wide variability around this mean, from above 40% for needle, rough, and sleep to below 5% for butterfly, mutton, and whistle. In line with then-current associationist ideas, Deese (1959) thought that such variability was caused by what later researchers called “backward” association, the tendency of list words (light, night, room, black, etc.) to produce the critical distractor (dark) on tests of word association.1 Here, it is important to remind ourselves that in that era, word association was conceptualized as a verbal analogue to operant conditioning (e.g., Skinner, 1957), that is, as a primitive and decidedly noncognitive form of verbal learning. Deese’s exact proposal was that intrusions are caused by preexperimental backward associations triggered by memory tests: Outputting list words during recall (e.g., night) automatically triggers associations to words that were not on the list (e.g., dark), which are then output. Again, note the analogy to operant conditioning. Deese’s evidence for this idea was that across lists, the correlation between the intrusion rate and mean backward associative strength (MBAS) accounted for 76% of the variance. He went on to conjecture that another memory illusion of that era, intrusions in repeated recall of narratives (Bartlett, 1932), was due to the tendency of recall to trigger backward associations to unpresented words.
Copyright 2008 Psychonomic Society, Inc.
BRAINERD, YANG, REYNA, HOWE, AND MILLS
There are four problems with concluding that the intrusion–MBAS correlation demonstrates that this illusion, now known as the Deese/Roediger–McDermott (DRM) illusion, is due to test-induced backward associations to critical distractors. One problem was acknowledged by Deese (1959) himself; the others, which are more serious, have emerged in recent years.
The problem that Deese acknowledged is that other, unmeasured variables may account for significant slices of the variance in intrusion rates. He argued that this possibility could be safely ignored, because the intrusion–MBAS correlation accounted for more than three quarters of the variance. This point is well taken, but contemporary studies (e.g., Roediger, Watson, McDermott, & Gallo, 2001) have shown that the correlation accounts for slightly more than half of the variance, leaving considerable room for other factors to contribute. To modern researchers, a hoary principle that comes to mind here is that associatively related words are semantically related, though most semantically related words are not associatively related (e.g., Anisfeld & Knapp, 1968; Grossman & Eagle, 1970; Thompson-Schill, Kurtz, & Gabrieli, 1998). An example will clarify this principle: Everyone knows that the word salt shares meaning with basil, butter, cinnamon, garlic, herbs, oregano, pepper, and seasoning, but in only one instance could that understanding have been triggered by the fact that the word is an associate of salt. Because DRM materials are constructed from norms of word association, some of the semantic variables prominent in memory research (concreteness, familiarity, meaningfulness, pleasantness, synonymy, and the like) may simultaneously explain variability in the illusion and variability in MBAS. We explore that possibility in the sequel.
The second problem with concluding that the intrusion–MBAS correlation shows that the DRM illusion is caused by test-induced backward associations is more serious.
This notion cannot explain why critical distractors also display high false alarm rates on recognition tests. Some years after Deese’s (1959) article appeared, Roediger and McDermott (1995) found that recognition tests yield high false alarm rates for critical distractors (roughly twice their intrusion rates). More generally, it has been known since a classic paper by Underwood (1965) that false alarm rates are elevated when distractors are associates of studied targets (see also Anisfeld & Knapp, 1968). Deese’s account is not suited to explaining such data because the administration of recognition probes is random. Hence, although an intrusion is typically preceded by the recall of list words that could trigger backward associations, a critical distractor probe is rarely preceded by a target probe that could trigger a backward association to that distractor. The third problem, which is more serious still, is that recall and recognition data disconfirm obvious predictions of the hypothesis that errors are caused by test-induced backward associations. Concerning recall data, under this hypothesis the intrusion rate ought to be positively correlated with the rate of true recall, because outputting more list words necessarily triggers more backward associations to the critical distractor (Roediger et al., 2001).
However, a consistent finding in the literature is that the intrusion rate is not positively correlated with the level of true recall, and indeed, negative correlations predominate (see Gallo, 2006). Concerning recognition data, under this hypothesis the false alarm rate ought to be higher when critical distractors are preceded by targets to which they are backward associates than when they are preceded by targets to which they are not backward associates. However, DRM false alarm rates are not increased by this manipulation (Gunter, Ivanko, & Bodner, 2005). This brings us to the fourth problem. The second and third difficulties can be accommodated if Deese’s (1959) hypothesis is reformulated to include associative priming at study—that is, if intrusions and false alarms are assumed to be caused by backward associations from targets that automatically primed critical distractors as lists were being presented. Considering the difficulties with Deese’s proposal, it is not surprising that the reformulated hypothesis, first suggested by Underwood (1965), figures centrally in recent associative accounts of the DRM illusion, especially Roediger et al.’s (2001) activation/monitoring theory, and Howe’s (2006, 2008a) associative-activation theory. However, there is an empirical impediment: The DRM illusion is exceptionally long-lived (for a review, see Gallo, 2006), whereas associative priming is transitory. On the former point, it is well established that elevations in intrusion and false alarm rates persist for weeks after lists are studied (e.g., Seamon, Luo, Kopecky, et al., 2002; Toglia, Neuschatz, & Goodwin, 1999). On the latter point, a familiar finding of lexical decision experiments is that priming of unpresented associates dissipates rapidly and is no longer detectable a few seconds after a target word is presented (Dannenbring & Briand, 1982; Joordens & Besner, 1992; Masson, 1991). 
It has been proposed (e.g., Tse & Neely, 2005) that studying DRM lists produces more stable priming, owing to the fact that there is massed presentation of associates; some investigators have found that such lists produce critical distractor priming that lasts for more than 1 min (e.g., Tse & Neely, 2005). However, other investigators (e.g., McKone, 2004; Zeelenberg & Pecher, 2002) have found that studying DRM lists produces no priming of critical distractors. A recent study by Cotel, Gallo, and Seamon (2008) found that automatic priming of unpresented associates could be detected after a single DRM list had been studied, but it could no longer be detected when tests were administered a few minutes later, after several lists had been studied (a standard design feature of experiments in which the illusion’s long-term stability has been measured). The fourth problem, then, is simply this: Even if studying DRM lists produces automatic priming that lasts for several minutes, the illusion persists for weeks. Toglia et al. (1999), for instance, reported that the illusion was as strong after 3 weeks as it was immediately following list presentation. There is no theoretical conception known to us that predicts that studying a few words that are associatively related to some other word yields priming that does not dissipate for weeks.2 This has led many investigators (e.g., Brainerd & Reyna, 2005; Payne, Elie, Blackwell, & Neuschatz, 1996) to posit that stable gist memories of semantic content must
underlie the DRM illusion and other tasks in which unpresented associates are falsely remembered. In addition to eliminating the fourth problem, this hypothesis eliminates the first problem, because semantic variables would simultaneously predict variability in the DRM illusion and in MBAS; the second problem, because the same gist memories of semantic content would be the basis for false alarms and intrusions; and the third problem, because true recall would be based on memories of different content than false memory. However, this hypothesis demands a fuller understanding of the semantic properties of DRM materials, which is another objective of the present article.
To sum up the discussion so far: The idea that intrusions are due to backward associations that are fired as targets are recalled (Deese, 1959) conflicts with the finding that true and false recall are not positively correlated, and even if they were, this would not explain why critical distractor false alarm rates are also high. Both problems can be handled by the more recent proposal that errors are caused by automatic priming of critical distractors as lists are studied, but this notion conflicts with the fact that priming is transitory, whereas intrusion and false alarm rates remain elevated for weeks after lists are studied. All of these problems would be resolved if semantic factors were responsible for the DRM illusion. However, that hypothesis requires (1) articulation of the semantic properties of this paradigm, and (2) articulation of the relations between those properties and false memory, true memory, and MBAS. Those were key aims of the investigation that we describe in the remainder of this article.
OVERVIEW OF THE INVESTIGATION
The present investigation grew out of an important study by Roediger et al. (2001).
That study addressed the first of the four problems discussed above (that variables other than MBAS may predict variability in the DRM illusion) by exploring other potential predictors of variability in false alarm and intrusion rates among 55 DRM lists. Specifically, the predictive power of critical distractors’ length, critical distractors’ frequency, critical distractors’ concreteness, mean forward associative strength (MFAS), mean list connectivity, and mean list true recall probability were examined. (MFAS is the average probability that list words are given as forward associates of critical distractors on tests of word association, and mean connectivity is the average number of other list words that are given as forward associates of each list word on such tests.) All of these variables were entered as predictors of false alarm and intrusion rates in multiple regressions. There were three principal results. First, a new predictor of errors was identified: True recall accounted for 21% of the variance in false recognition and 12% of the variance in false recall. In both instances, it was a negative predictor, which of course runs counter to the notion that test-induced associative priming causes errors.3 Second, MBAS was a positive predictor of both types of errors, accounting for 13% of the variance in false recognition and 36% in false recall. Third, none of the other variables accounted for unique variance.
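The three associative measures defined in this paragraph (MBAS, MFAS, and connectivity) can be illustrated with a minimal sketch. The norm values below are invented for illustration and are not taken from any published norms; the names `NORMS`, `assoc`, `mbas`, `mfas`, and `connectivity` are hypothetical labels. The only assumption is that word-association norms supply, for each cue, the probability of each response.

```python
from statistics import mean

# Hypothetical toy norms (invented values, not from any published norms):
# NORMS[cue][response] = probability that `response` is given to `cue`.
NORMS = {
    "dark": {"light": 0.60, "night": 0.20, "black": 0.05},
    "light": {"dark": 0.30, "bulb": 0.25},
    "night": {"dark": 0.20, "day": 0.40, "light": 0.05},
    "black": {"white": 0.55, "dark": 0.10, "night": 0.02},
}


def assoc(cue, response):
    """Association probability from cue to response (0.0 if unlisted)."""
    return NORMS.get(cue, {}).get(response, 0.0)


def mbas(critical, targets):
    """Mean backward associative strength: list word -> critical distractor."""
    return mean(assoc(t, critical) for t in targets)


def mfas(critical, targets):
    """Mean forward associative strength: critical distractor -> list word."""
    return mean(assoc(critical, t) for t in targets)


def connectivity(targets):
    """Mean number of other list words given as associates of each list word."""
    return mean(
        sum(1 for other in targets if other != t and assoc(t, other) > 0)
        for t in targets
    )


targets = ["light", "night", "black"]
print(round(mbas("dark", targets), 3))
print(round(mfas("dark", targets), 3))
print(round(connectivity(targets), 3))
```

In this toy list, the backward measure averages the probabilities with which each list word elicits the critical distractor, whereas the forward measure averages the probabilities with which the critical distractor elicits each list word, so the two can diverge for the same list.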
Although the first result runs counter to Deese’s (1959) original hypothesis, suggesting as it does that associative priming suppresses the illusion, the result brings the DRM task in line with dual-process conceptions of false memory (Brainerd & Reyna, 2005). In such models, the retrieval processes that foment true and false memory are different, with processes that access memories of actual target presentations being especially important in true memory. On its face, true recall is an index of this verbatim ability. The second result is also probative because it demonstrates that MBAS can account for list variability in false recognition as well as false recall. Roediger et al. (2001) went on to argue that the correlation between MBAS and errors supports the hypothesis that the DRM illusion is due to associative activation on study or test trials. Conceptually, the argument is that backward associations foment false alarms and intrusions because the processes that activate words when subjects are instructed to freely associate to word cues occur automatically without instruction as they study DRM lists and respond to memory tests (see also Gallo, 2006). This interpretation of the correlation between MBAS and false recall/recognition still conflicts with the fact that these errors are extremely stable whereas automatic priming is transitory, and it fails to explain why true recall is not a positive predictor of false recall/recognition. Roediger et al.’s (2001) third result, that other factors do not predict list differences in errors, does not bear directly on the question of whether semantic properties of DRM materials are responsible for these errors. With the exception of critical distractors’ concreteness ratings, none of the new predictor variables was a dimension of meaning content. We filled this gap by conducting a semantic scoring of critical distractors and list words, using familiar objective dimensions of meaning content. 
Intuitively, generating lists by selecting forward associates of critical distractors ought to result in materials that are rich in meaning content and that therefore stimulate intense semantic processing. However, objective metrics are needed to give quantitative expression to that hunch. We scored the 55 lists in Roediger et al.’s pool for specific dimensions of meaning, which enabled us to construct semantic profiles of DRM lists. Before presenting the results, we briefly recount those dimensions and two other features of the investigation.
Semantic Variables
We expanded the roster of predictors to include three groups of variables that are expressly semantic. One consisted of the seven dimensions of Toglia and Battig’s (1978) semantic word norms (familiarity, meaningfulness, concreteness, imagery, categorizability, number of attributes, pleasantness).4 For each DRM list, scores for these variables were included for the critical distractor and for each of the 15 targets. A second group of semantic variables measured the emotional content of DRM materials, specifically the three dimensions of Bradley and Lang’s (1999) emotion word norms (arousal, dominance, valence). For each DRM list, scores for all three emotional variables were included for the critical distractor and for each of the 15 targets. The third group of semantic
Table 1
Summary of the Scoring Rubric for Wu and Barsalou’s (2008) Four Conceptual Relations
Relation                    Rubric
Entity properties           External component; external surface feature; internal surface feature; entity behavior; quantity; systemic feature; larger whole; spatial relation
Introspective properties    Affect/emotion; evaluation; contingency; representational state; quantity; negation
Situation properties        Function; action; participant; location; origin; time; manner; associated entity
Taxonomic properties        Superordinate; subordinate; individual; coordinate
variables came from a model of six conceptual relations between pairs of words (such as the critical distractors and targets of DRM lists) developed by Wu and Barsalou (2008): synonymy, antonymy, taxonomy, entity relations, introspective relations, and situational relations. For each conceptual relation (cf. Table 1), there was a single score for each list, because each measures the number of times that the relation is present in the 15 possible pairings of the critical distractor with the target words.
These semantic variables were chosen for a mix of empirical and theoretical reasons. Concerning the synonym/antonym relations, these variables were included on the basis of many prior experiments with non-DRM lists in which semantic false-recognition effects have been detected for distractors that are synonyms or antonyms of targets (e.g., Anisfeld & Knapp, 1968; Brainerd, Reyna, & Mojardin, 1999; Fillenbaum, 1969; Grossman & Eagle, 1970; Underwood, 1965). The fact that such effects have been observed raises the possibility that synonymy or antonymy may predict variability in the DRM illusion. The other conceptual relations in the Wu and Barsalou (2008) model were included because when they are combined with synonymy and antonymy, the result is a reasonably comprehensive roster of semantic relations between critical distractors and targets. On the practical side, the work of scoring the 55 Roediger et al. (2001) lists for these relations and for establishing interrater agreement has already been completed by Cann, McRae, and Katz (2006), although we conducted a separate scoring in order to determine whether interlaboratory agreement could be achieved.
The Toglia and Battig (1978) dimensions were included because of their historical importance in the larger memory literature.
The Toglia and Battig norms bring together meaning dimensions that are central to classical theories of memory (e.g., Craik & Lockhart, 1972; Paivio, 1971), and for that reason, they are one of the most widely used tools in memory research (e.g., Starns & Hicks, 2005; Wenger & Townsend, 2006). Variables such as concreteness, imagery, and pleasantness have been extensively studied in connection with true memory, so we thought it likely that some of them might predict variability in DRM true recall. For theoretical reasons (Brainerd & Reyna, 2005), we thought it particularly likely that familiarity or meaningfulness might predict variability in intrusions and false alarms.
The third group of semantic variables—emotional arousal, dominance, and valence—can be measured for critical distractors and targets by finding the words’ values on these dimensions in the affective norms for English words (ANEW; Bradley & Lang, 1999). These variables, too, were included chiefly because of their prominence in memory research. There is a burgeoning literature that deals with how the emotional content of events influences memory (for a recent review, see Kensinger, 2004). A common finding has been that negatively valenced words are recalled better than are positive or neutral words (Rivers, Reyna, & Mills, 2008), suggesting that valence might explain some of the interlist variance in DRM true recall. Recently, research of this ilk has been extended to false memory, and the ANEW norms have been used to devise word lists that separate the influences of arousal, dominance, and valence (e.g., Budson et al., 2006; Howe, 2007). Other findings show that DRM false recognition and false recall are affected by subjects’ emotional states (Corson & Verrier, 2007; Storbeck & Clore, 2005), which suggests that these variables might explain some of the interlist variance in the DRM illusion. Once scores for all of the aforementioned variables had been obtained for each of the lists in Roediger et al. (2001), the analysis unfolded in two phases. First, we constructed an objective semantic profile of the lists—a summary of how they stack up on the semantic variables. The aim was to determine whether the density of semantic content is consistent with the extremely long-lived nature of the illusion. Second, we conducted an exploratory factor analysis of lists’ scores on the aforementioned variables, together with their respective levels of true recall, false recall, and false recognition. We replaced the multiple regressions that Roediger et al. 
conducted with factor analyses to eliminate problems of multicollinearity that would arise with such a large number of variables. A single factor analysis allowed us to determine the structure of the entire set of variables and to answer several key questions simultaneously, such as the factors that DRM false memory loads on and whether they are semantic, whether false recall and false recognition load on the same or different factors, and the factors that MBAS loads on and whether they are semantic.
Additional Data Sources for Recognition
A second feature of this investigation is the analysis of additional DRM data. A potential limitation of Roediger et al.’s (2001) data set, which these authors commented upon, is that the recognition results may have been contaminated by the fact that the recognition tests were administered after subjects had recalled the lists. Roediger et al. suggested that this confounding was not problematic for their analysis, and they cited data subsequently published by Gallo and Roediger (2002). Gallo and Roediger compared recognition performance on 28 DRM lists and found that the correlation between the false alarm rates for critical distractors, with and without prior recall, was .90. Despite this finding, prior recall may produce process-level effects that carry over to recognition, such as strengthening verbatim or gist memory for DRM lists,
creating additional verbatim memories of recalled items, and shifting subjects’ decision criteria. It is conceivable that these effects could alter the relations observed between predictor variables and interlist differences in false recognition of critical distractors.
To deal with this potential problem, we conducted a follow-up analysis, in which the false alarm rates in the Roediger et al. (2001) data set were replaced by recognition data from other sources. The replacement data came from two studies. The first generated recognition data for 36 of the 55 Roediger et al. lists, for which the false alarm rates used by Roediger et al. were originally reported by Stadler, Roediger, and McDermott (1999). In our study, 190 subjects (undergraduates) listened to subsets of 18 of the 36 Stadler et al. lists, then responded to recognition tests, so that false alarm rates were not contaminated by prior recall. The procedure involved two steps: (1) Subjects listened to 9 of the 18 lists and responded to a recognition test like that described by Stadler et al.; and (2) subjects listened to the remaining 9 lists and responded to a second test. The second study used the same procedure to generate recognition data for the remaining 19 Roediger et al. lists. A total of 93 subjects (undergraduates) listened to 10 of these lists, responded to a recognition test like that described by Stadler et al., listened to the remaining 9 lists, and responded to a second test. These additional data were used to determine whether interlist variability in false recognition is seriously distorted by prior recall tests.
Semantic Analysis of Word Pairs From the Nelson, McEvoy, and Schreiber (1999) Norms
A third feature of this investigation is the analysis of an additional data set derived from the Nelson, McEvoy, and Schreiber (1999) norms of word association.
The question of associative versus semantic processing in the DRM illusion is a special case of the more fundamental question: What are the specific semantic consequences of word associations? The general principle that associative relations necessarily mean semantic relations has often been enunciated (e.g., Anisfeld & Knapp, 1968; Grossman & Eagle, 1970; Mandler, 1962), and demonstrations of false-memory effects for items semantically but not associatively related are commonplace (for a review, see Brainerd & Reyna, 2005). However, what are the specific semantic consequences of word associations? DRM lists do not provide an optimum answer, because they are constructed via a highly constrained procedure. To answer this question, we constructed a stratified sample of 400 cue–target word pairs, using the Nelson et al. (1999) norms. We sampled pairs as follows: (1) 100 pairs with very strong cue-to-target associations (association probabilities in the .57 to .97 range); (2) 100 pairs with strong cue-to-target associations (association probabilities in the .27 to .56 range); (3) 100 pairs with moderate cue-to-target associations (association probabilities in the .07 to .26 range); and (4) 100 pairs with weak cue-to-target associations (association probabilities in the .01 to .06 range). Next, we scored the 400 pairs for the variables used in our study of DRM lists, and we also recorded the backward (target-to-cue) association probability for each
pair. With this more representative sample of words, we identified the specific semantic dimensions that covary with strength of word associations.
METHOD
Conceptually, there were two types of variables in this study, criterion variables and predictor variables, the general aims being to use the latter to construct a semantic profile of DRM materials, to determine the factorial structure of the complete variable set, and to identify the factors that true and false memory load on. We describe the criterion variables first, followed by the predictor variables.
Criterion Variables
The criterion variables in some analyses were false recognition of critical distractors, false recall of critical distractors, and true recall of list words, with scores for these variables being obtained for the pool of 55 DRM lists from Appendix B of Roediger et al. (2001). The criterion variables for other analyses were true recognition of list words and false recognition of critical distractors, with scores for these variables being obtained from the recognition-only data sets described earlier.
Predictor Variables
There were four groups of predictor variables: (1) the dimensions of the Toglia and Battig (1978) semantic word norms, (2) the dimensions of the ANEW emotion word norms (Bradley & Lang, 1999), (3) the six interword conceptual relations of the Wu and Barsalou (2008) model, and (4) the nonsemantic predictors of Roediger et al. (2001).
Toglia–Battig dimensions.
In Toglia and Battig’s (1978) norms, subjects rated 2,854 words on seven semantic dimensions, from 1 (lowest) to 7 (highest): (1) categorizability (the ease with which a word can be placed in a taxonomic category); (2) concreteness (the extent to which a word refers to objects that can be perceived with the senses); (3) familiarity (how common a word is, in subjects’ experience); (4) imagery (how readily a word arouses sensory images); (5) meaningfulness (the ease with which a word can be associated with other words); (6) number of features (how many attributes or properties characterize a word); and (7) pleasantness (the extent to which a word provokes positive feelings). For the 55 DRM lists, values of these dimensions for each critical distractor and target were obtained from the Toglia and Battig norms. The values of the critical distractor on the seven dimensions and the mean values of the corresponding targets on the seven dimensions were used as predictor variables. Thus, 14 variables were generated in all, 7 for critical distractors and 7 for targets.
ANEW dimensions. In the ANEW norms (Bradley & Lang, 1999), subjects rated 1,000 words on a 1 (lowest) to 9 (highest) scale with respect to emotional arousal, dominance, and valence. The specific dimensions on which subjects were instructed to rate words were (1) feelings of calmness (low end) to feelings of excitement (high end), (2) feelings of being completely controlled or submissive (low end) to feelings of being in control or dominant (high end), and (3) feelings of sadness (low end) to feelings of happiness (high end). For the DRM lists, values of arousal, dominance, and valence for critical distractors and targets were obtained from these norms.
For each list, the tabled value of the critical distractor on each of the three dimensions and the mean value for the corresponding targets were used as predictor variables; that is, six predictor variables were generated from the ANEW norms, three for critical distractors and three for targets. Although all 55 critical distractors appear in the ANEW norms, some of the targets do not. For lists in which some targets were missing from the ANEW norms, the mean values for arousal, dominance, and valence were based on the remaining targets that did appear in the norms.
Wu–Barsalou dimensions. We scored the 15 critical distractor–target pairs of each of the 55 lists for whether the two words were synonyms (0, no; 1, yes) or antonyms (0, no; 1, yes). Scorers used available archives of synonyms and antonyms, and interrater agreement was perfect (i.e., 1.0). For the other four conceptual relations of the Wu and Barsalou (2008) taxonomy, Cann et al. (2006) previously scored these pairs with respect to entity, introspective, situational, and taxonomic relations. They developed a scoring rubric (see their Appendix A), which we summarize in Table 1. For each of the 55 lists, we scored all 15 pairs for whether the words were taxonomically related (0, no; 1, yes), whether there was an entity relation between them (0, no; 1, yes), whether there was an introspective relation between them (0, no; 1, yes), and whether there was a situational relation between them (0, no; 1, yes). In the Cann et al. study, two raters scored the 825 DRM distractor–target pairs, with the level of interrater agreement being 98.5%. To measure interlaboratory agreement, we independently scored these 825 pairs, using Cann et al.’s rubric, then correlated the two scorings for each of the four relations, across the 55 lists. The average correlation was .97. Because the results of the two scorings were in such close agreement, we simply used the values for the Wu–Barsalou conceptual relations reported by Cann et al. in the analyses discussed below (see note 5).
Roediger et al. (2001) variables. The original set of predictors consisted of MBAS, length of critical distractors, frequency of critical distractors, mean forward associative strength (MFAS), mean list connectivity, and mean list true recall probability. (Critical distractor concreteness, discussed above, was also included.) MBAS, connectivity, and MFAS values were obtained from Appendix B of Roediger et al. (2001), as were the critical distractor frequency and length values. For each of the 55 lists, we added values for mean frequency and mean length of targets.

Table 2
Descriptive Statistics for the New Semantic Predictor Variables

Variables                   Minimum   Maximum      M       SD
Critical Distractors
  Categorizability            3.13      6.47     5.20    0.86
  Concreteness                2.66      6.36     5.12    1.04
  Familiarity                 4.50      6.85     6.32    0.39
  Imagery                     3.27      6.38     5.40    0.77
  Meaningfulness              3.34      6.07     4.84    0.57
  Number of attributes        2.79      5.83     4.17    0.75
  Pleasantness                2.24      6.22     4.31    1.08
  Arousal                     2.80      7.63     4.96    0.91
  Dominance                   3.79      7.38     5.32    0.64
  Valence                     2.13      8.13     5.64    1.33
Targets
  Categorizability            3.78      5.88     5.04    0.52
  Concreteness                3.33      5.95     5.04    0.59
  Familiarity                 5.73      6.48     6.23    0.15
  Imagery                     4.10      5.93     5.26    0.42
  Meaningfulness              4.15      5.41     4.66    0.28
  Number of attributes        3.22      4.94     4.00    0.36
  Pleasantness                2.76      5.23     4.28    0.52
  Arousal                     3.37      7.14     4.86    0.74
  Dominance                   3.87      6.28     5.11    0.51
  Valence                     2.61      7.32     5.49    1.24
Distractor–Target Relations
  Antonymy                    0         3.00     0.44    0.76
  Synonymy                    0         7.00     1.02    1.53
  Entity relations            0        13.00     2.65    3.03
  Introspective relations     0        10.00     1.65    2.35
  Situational relations       0        12.00     5.22    3.17
  Taxonomic relations         0        12.00     3.64    3.15
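The construction of the norm-based predictors described above can be sketched in a few lines of Python. This is only an illustration: the function name, the dictionary layout, and the toy ratings are assumptions introduced here and are not taken from the Toglia–Battig or ANEW norms. For one list and one dimension, the sketch returns the critical distractor's own rating together with the mean rating of the targets, averaging over only those targets that appear in the norms.

```python
from statistics import mean


def list_predictors(distractor, targets, norms):
    """Return (distractor rating, mean target rating) for one DRM list.

    `norms` maps a word to its rating on a single dimension. Targets that
    are missing from the norms are skipped, so the target mean is based on
    the remaining targets only.
    """
    rated = [norms[word] for word in targets if word in norms]
    if not rated:
        raise ValueError("no target of this list appears in the norms")
    return norms[distractor], mean(rated)


# Toy ratings on a 1-9 scale; hypothetical values, not taken from any norms.
toy_norms = {"sleep": 7.0, "bed": 6.0, "rest": 8.0, "tired": 4.0}

# "snooze" is absent from the toy norms, so it is left out of the target mean.
distractor_value, target_mean = list_predictors(
    "sleep", ["bed", "rest", "tired", "snooze"], toy_norms
)
```

Applying the same function once per dimension and per list would yield one distractor predictor and one target predictor for each dimension.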
RESULTS
The results are reported in three stages. First, we use descriptive statistics for the new predictor variables to construct an overall semantic profile of DRM lists. Next, two factor analyses are reported, one for the original Roediger et al. (2001) data, and one for the recognition-only data. Last, we report descriptive and inferential statistics for the relation between MBAS and specific forms of semantic processing. Here, we examine findings for the 55 DRM lists of Roediger et al. and for the sample of word pairs from the Nelson et al. (1999) norms.
Semantic Profile of DRM Lists
Descriptive statistics for the new semantic variables are reported in Table 2; the statistics for the Roediger et al. (2001) variables are reported in Table 3. According to Table 2, using word associations to construct lists has profound semantic consequences: DRM lists are rich in meaning content at all levels (distractor–target semantic relations, intertarget semantic relations, and semantic properties of individual words). These lists are dense with semantic relations between critical distractors and targets. Mean values for the six Wu–Barsalou conceptual relations are reported at the bottom of Table 2. These values are the average numbers of distractor–target pairs, out of 15, that display each of these relations. One way to express the level of distractor–target semantic relatedness is simply to add these means, thereby generating an index of total relatedness without regard to specific relations. The sum of the means is the average number of relations per DRM list. The resulting number, 14.62 out of a possible high score of 15, means that virtually all of the 825 DRM distractor–target pairs display one of the six semantic relations. The modal tendency, exhibited by 39 of the 55 lists, was for all 15 distractor–target pairs to exhibit one of these six semantic relations. Of the remaining 16 lists, 13 had 14 pairs that exhibited one of the relations, 2 had 13 such pairs, and 1 had 11 such pairs.
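The arithmetic behind the total-relatedness index can be checked directly. The short sketch below (the variable names are ours) sums the six relation means from Table 2 and, separately, recomputes the average number of related pairs per list from the distribution of lists just described.

```python
# Mean number of distractor-target pairs (out of 15) showing each relation (Table 2).
relation_means = {
    "antonymy": 0.44,
    "synonymy": 1.02,
    "entity": 2.65,
    "introspective": 1.65,
    "situational": 5.22,
    "taxonomic": 3.64,
}
total_relatedness = sum(relation_means.values())

# Distribution reported in the text: {related pairs per list: number of lists}.
lists_by_related_pairs = {15: 39, 14: 13, 13: 2, 11: 1}
n_lists = sum(lists_by_related_pairs.values())
mean_related_pairs = (
    sum(pairs * count for pairs, count in lists_by_related_pairs.items()) / n_lists
)

print(round(total_relatedness, 2), n_lists, round(mean_related_pairs, 2))
```

Both routes give a value that rounds to 14.62 related pairs per list, and the list counts add up to the 55 lists in the pool.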
The most common relations were situational (5.22 pairs per list), taxonomic (3.64 pairs per list), and entity (2.65 pairs per list). Further evidence of DRM semantic relatedness is apparent from the mean values of the Toglia and Battig (1978) dimensions (Table 2) and from the mean connectivity value (Table 3). Concerning the former, remember that two of the Toglia–Battig dimensions, categorizability and meaningfulness, involve relations between words. For critical distractors, the pool of referent words on which categorizability and meaningfulness are based includes DRM targets, thereby providing further measures of distractor–target relatedness. For DRM targets, the pool of referent words on which categorizability and meaningfulness ratings are based includes other DRM targets and critical distractors, thereby providing measures of intertarget relatedness as well. Thus, categorizability and meaningfulness for distractors are indexes of distractor–target relatedness, whereas the corresponding values for targets are indexes of intertarget and distractor–target relatedness. The mean categorizability and meaningfulness of the 55 critical distractors are 5.20 and 4.84, respectively. Both values are nearly a full standard deviation (SD) above the means for Toglia and Battig’s word pool (4.33 and 4.03); t tests revealed that mean critical distractor categorizability is reliably higher than the corresponding mean for Toglia and Battig’s word pool [t(54) = 7.59, p < .0001], and that mean critical distractor meaningfulness is also reliably higher than the corresponding mean for Toglia and Battig’s word pool [t(54) = 13.21, p < .0001]. Similarly, the mean target categorizability and meaningfulness values, 5.04 and 4.66, are three quarters of an SD above the corresponding means for Toglia and Battig’s word pool. Both of these differences are also reliable [t(54) = 8.92, p < .0001, and t(54) = 17.07, p < .0001, respectively]. In short, DRM critical distractors and targets are perceived to be strongly related to other words.

Table 3
Descriptive Statistics for the Predictor Variables of Roediger et al. (2001)

Variables                 Minimum    Maximum       M        SD
Distractor length           3.00        9.00      5.26     1.42
Mean target length          4.07        6.47      5.22     0.62
Distractor frequency        2.00    1,207.00    113.58   199.92
Mean target frequency      11.93      891.47    112.75   140.87
MBAS                        0           0.43      0.13     0.10
MFAS                        0.01        0.06      0.04     0.01
Connectivity                0.47        4.93      1.64     0.82

The story is the same for the connectivity variable (Table 3), another index of intertarget relatedness. For a given DRM target, this variable measures how many of the remaining 14 words are forward associates of that target. The mean value of 1.63 signifies that, on average, each target on each DRM list is a forward associate of 1 or 2 other targets on that list, a high level of intertarget connectivity. This can be demonstrated by computing connectivity values for random samples of 15 words.
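A connectivity computation of this kind can be sketched as follows. The sketch rests on assumptions that are ours: the association norms are represented as a dictionary from each word to the set of its forward associates, the toy entries are invented for illustration, and connectivity is taken to be the mean, over the words of a list, of the number of other list words that are forward associates of that word.

```python
import random


def connectivity(words, forward_associates):
    """Mean number of other list words that are forward associates of each word."""
    total = 0
    for word in words:
        associates = forward_associates.get(word, set())
        total += sum(1 for other in words if other != word and other in associates)
    return total / len(words)


# Hypothetical forward-association norms (word -> set of its forward associates).
toy_norms = {
    "bed": {"rest", "tired", "pillow"},
    "rest": {"tired", "bed"},
    "tired": {"bed"},
    "pillow": set(),
    "stone": set(),
    "river": set(),
    "cloud": set(),
    "ink": set(),
}

# A thematically related list, by analogy with a DRM list.
thematic = connectivity(["bed", "rest", "tired", "pillow"], toy_norms)

# A same-sized random sample from the word pool, as a baseline.
rng = random.Random(0)  # fixed seed so the illustration is repeatable
random_list = rng.sample(sorted(toy_norms), 4)
baseline = connectivity(random_list, toy_norms)
```

With real norms, repeating the random draw many times and averaging the resulting values would give a baseline against which the connectivity of a constructed list can be judged.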
To generate some data of this sort, we took 100 samples of 15 items from among the 2,854 words in the Toglia and Battig (1978) norms and computed the connectivity value for each of these sets of items. The modal connectivity was 0 and the mean was .1. Thus, the level of intralist connectivity displayed by DRM lists is very high. Earlier, we commented that DRM lists’ semantic richness extends to meaning properties of individual words. We were referring to critical distractors’ and targets’ status on the remaining five dimensions of the Toglia and Battig (1978) norms (concreteness, familiarity, imagery, number of attributes, and pleasantness). In Table 2, mean values of these dimensions are reported separately for critical distractors and targets. For all five dimensions, the means for distractors and targets are well above the corresponding means in the Toglia and Battig norms (cf. Table 1 in Toglia & Battig, 1978). On average, these means are 0.6 SDs above the corresponding normed means, so that roughly 70% of the 2,854 words in the norming sample have lower levels of concreteness, familiarity, imagery, number of attributes, and pleasantness than DRM critical distractors and targets do. A series of t tests showed that all five means for distractors and all five means for targets were reliably higher than the corresponding Toglia–Battig means were. Finally, with respect to emotional content, remember that 5 is the midpoint of the rating scales for arousal, dominance, and valence. As can be seen in Table 2, the means
for critical distractors and targets on these dimensions were close to this midpoint, the average of the six means being 5.23. Thus, with respect to emotional content, as a group DRM critical distractors and targets are (1) moderately arousing, (2) neutral with respect to valence (i.e., positive–negative), and (3) moderately dominant. To sum up, when lists are constructed by selecting the most common associates of cue words, the resulting materials are very rich in semantic content. The objective profile of DRM lists is one of high levels of semantic relatedness between the members of distractor–target pairs and among targets. It is also one of high degrees of word-level semantic properties. From the perspective of explaining the DRM illusion, the semantic richness of these materials is a fundamental consideration, for three reasons. First, when it comes to baseline DRM false memory, a core fact is that baseline levels are very high, compared with more traditional tasks that involve word lists. Although it is true that there is interlist variability in false memory, as Deese (1959) showed, it is equally true that such variability occurs against a backdrop of false memory that is high in absolute terms: The mean probability that critical distractors are falsely recognized is .59, and the mean probability that they are falsely recalled is .30. These values are four to five times those for more traditional word-list tasks (e.g., Bjorklund & Muir, 1988; Gillund & Shiffrin, 1984; Tussing & Greene, 1999). Second, the semantic richness of DRM materials supplies the sort of theoretical leverage needed to account for the long-lived nature of the illusion. As noted, intrusion and false alarm rates remain elevated for weeks following exposure to DRM lists (e.g., Seamon, Luo, Kopecky, et al., 2002; Toglia et al., 1999); as also noted, this is inexplicable via automatic associative priming, because priming is transitory. 
In contrast, gist memories of semantic properties are stable over retention intervals of the lengths used in DRM research (e.g., Kintsch, Welsch, Schmalhofer, & Zimny, 1990; Reyna & Kiernan, 1994, 1995). Because, as we have seen, DRM lists trigger intense semantic processing, it is natural that the illusion should persist over such intervals, as long as semantic processing is the basis for the illusion. Third, there is now an established semantic context for interpreting predictors of interlist variability in the DRM illusion: specifically, that variability occurs, given high baseline levels of semantic processing. Variables that predict interlist variability are not, in themselves, a basis for conclusions about the essential nature of the DRM illusion, because variability in those predictors occurs against a backdrop of intense semantic processing. Thus, the theoretical meaning of the interlist variability question is different from what has been previously supposed. It is actually a question about what types of processing account for residual variance in the DRM illusion, given high baseline error rates and high levels of semantic processing.
Factor Analyses
In this section, we report the factor structure of the variables in Tables 2 and 3 and use that structure to explain interlist variability in false recall, false recognition, true
recall, and MBAS. The first analysis was of the Roediger et al. (2001) data set, and the second was of our recognition-only data. We used exploratory rather than confirmatory factor analysis, because theoretical hypotheses or prior data would be necessary to justify the latter. Although theoretical hypotheses could be formulated, it seemed unwise to generate results that would be dependent on the validity of such conjectures. Also, exploratory factor analysis preserves the spirit of the hypothesis-free multiple regression approach that Roediger et al. adopted. Before proceeding, two methodological comments are in order. First, whereas Roediger et al. (2001) used proportions of true recall, false recall, and false recognition as criterion variables, our factor analyses used the familiar logit transformation of these variables, logit(p) = log[p/(1 − p)], where p is a proportion. Multiple regression and factor analysis are model-fitting procedures (explicitly, procedures for fitting linear models), and the use of proportions can be problematic, especially when the assumption of homogeneous variance of residuals is violated (e.g., Papke & Wooldridge, 1996). The traditional solution (e.g., Armitage, Berry, & Matthews, 2002) is to perform logit transformations, which eliminate such problems by mapping proportions onto the real number line. Second, following standard practice in research on interlist differences in the DRM illusion (e.g., Roediger et al., 2001; Stadler et al., 1999), the results for false recognition involved the logit transformation of false alarm probabilities for critical distractors, rather than statistics such as d′ or A′ that take account of the false alarm probabilities for unrelated distractors (i.e., critical distractors for unpresented DRM lists). There are two reasons for this practice.
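As a brief illustration of the logit transformation defined above, the following sketch applies it to the mean false recognition and false recall probabilities cited earlier (.59 and .30). The function names are ours, and the text does not say how proportions of exactly 0 or 1 were handled, so the sketch simply rejects them.

```python
import math


def logit(p):
    """Log odds of a proportion p; defined only for 0 < p < 1."""
    if not 0.0 < p < 1.0:
        raise ValueError("logit is undefined for proportions of 0 or 1")
    return math.log(p / (1.0 - p))


def inverse_logit(x):
    """Map a value on the real line back to a proportion."""
    return 1.0 / (1.0 + math.exp(-x))


false_recognition_logit = logit(0.59)  # positive, because .59 > .50
false_recall_logit = logit(0.30)       # negative, because .30 < .50
```

Proportions above .50 map to positive values and proportions below .50 to negative values, with no upper or lower bound, which is what makes the transformed scores suitable for linear models.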
First, the false alarm probability for unrelated distractors is normally quite low, which is sensible considering that DRM lists are thematic. (It is sensible that after studying, say, the angry, bread, rubber, spider, and window lists, subjects would not be strongly inclined to accept, say, cold, cup, needle, or shirt.) When DRM lists were presented at a rate of 1 or 3 sec, Gallo and Roediger (2002) found that the false alarm rate for critical distractors from unpresented lists was only .09. Second, false recognition of critical distractors for unpresented lists does not exhibit interlist variability. In the Gallo and Roediger study, the false alarm probability for “strong” lists (higher levels of false recall/recognition) was .11, and the corresponding probability for “weak” lists (lower levels of false recall/recognition) was .06. Likewise, Brainerd, Forrest, Karibian, and Reyna (2006) found that both of these probabilities were .09.
Exploratory factor analysis of the Roediger et al. (2001) data set. We conducted a principal components analysis with orthogonal (varimax) rotation, the most commonly used form of exploratory factor analysis in psychological research (Fabrigar, Wegener, MacCallum, & Strahan, 1999). Ten factors were extracted, using the standard eigenvalue cutoff of 1. Beginning with Factor 1 and ending with Factor 10, the percentages of variance accounted for were 20%, 14%, 12%, 10%, 6%, 5%, 5%, 4%, 3%, and 3%. The total explained variance was 81%. Only the first and third factors are of focal interest here, because these are the ones on which false recall/recognition, true recall, and MBAS loaded. The rotated loadings of the measured variables are reported in Table 4. For convenience of interpretation, the variables are organized into five blocks: (1) DRM criterion variables (false recall, false recognition, and true recall); (2) the seven Toglia and Battig (1978) predictor variables (for both critical distractors and targets); (3) the three ANEW predictor variables (for both critical distractors and targets); (4) the six Wu and Barsalou (2008) predictor variables; and (5) the seven Roediger et al. (2001) predictor variables (with length and frequency for targets as well as critical distractors). Following the usual convention in factor analysis, only factor loadings ≥ .40 are regarded as significant, and those loadings appear in bold type in Table 4. Variability in DRM performance is explained by Factors 1 and 3, because those are the factors on which true recall and false recall/recognition loaded. Inspection of the variable loadings reveals a pattern remarkably consistent with dual-process accounts of DRM performance (Barnhardt, Choi, Gerkens, & Smith, 2006; Brainerd, Payne, Wright, & Reyna, 2003; Brainerd, Wright, Reyna, & Payne, 2002; Payne et al., 1996). In such conceptions, true recall and false recall/recognition are assumed to involve dissociated representations, with true recall being based primarily on the retrieval of verbatim traces of targets’ surface forms, and false recall/recognition being based primarily on the retrieval of gist traces of meaning content. Consistent with the general idea that dissociated representations underlie true and false memory, it can be seen that true recall and false recall/recognition loaded on different factors.
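The general procedure named above (principal components of the correlation matrix, retention of components with eigenvalues above 1, then varimax rotation) can be sketched with NumPy. This is a generic illustration under our own assumptions, not the software or settings used for the reported analyses: the data are synthetic, the function names are ours, and the varimax routine is the common SVD-based iteration.

```python
import numpy as np


def pca_loadings(data):
    """Unrotated loadings for principal components with eigenvalue > 1."""
    corr = np.corrcoef(data, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)      # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]            # reorder to descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    keep = eigvals > 1.0                         # eigenvalue cutoff of 1
    loadings = eigvecs[:, keep] * np.sqrt(eigvals[keep])
    explained = eigvals[keep] / eigvals.sum()    # proportion of variance per factor
    return loadings, explained


def varimax(loadings, max_iter=100, tol=1e-8):
    """Orthogonal varimax rotation of a loading matrix (SVD-based iteration)."""
    n_vars, n_factors = loadings.shape
    rotation = np.eye(n_factors)
    criterion = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        target = rotated**3 - rotated @ np.diag((rotated**2).sum(axis=0)) / n_vars
        u, s, vt = np.linalg.svd(loadings.T @ target)
        rotation = u @ vt
        if criterion and s.sum() / criterion < 1 + tol:
            break
        criterion = s.sum()
    return loadings @ rotation


# Synthetic data: 55 "lists" by 6 variables driven by two latent factors.
rng = np.random.default_rng(0)
latent = rng.normal(size=(55, 2))
data = np.hstack([
    latent[:, [0]] + 0.3 * rng.normal(size=(55, 3)),
    latent[:, [1]] + 0.3 * rng.normal(size=(55, 3)),
])

loadings, explained = pca_loadings(data)
rotated = varimax(loadings)
salient = np.abs(rotated) >= 0.40                # loadings treated as significant
```

Because the rotation is orthogonal, each variable's communality (the sum of its squared loadings) is the same before and after rotation; only the distribution of loadings across factors changes, which is what makes the rotated factors easier to interpret.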
True recall loaded on Factor 1, whose interpretation is straightforward and consistent with the more specific proposal of dual-process theories that memory for surface features is involved. This is an imagery/concreteness factor, because the variables that loaded most highly on this factor are target imagery/concreteness and critical distractor imagery/concreteness. It is important to remember that in the Toglia and Battig (1978) norms, a word’s imagery rating reflects its ability to arouse visual or auditory images of the named object, and its concreteness rating reflects its ability to bring to mind things that can be seen, heard, touched, smelled, or tasted. Target categorizability and critical distractor categorizability also loaded positively on Factor 1. That is not surprising, because categorizability ratings correlate highly with both imagery ratings (r = .91) and concreteness ratings (r = .89). A likely reason is that in the categorizability instructions that subjects were given, the word used to illustrate high categorizability (buffalo) was very concrete, but the word used to illustrate low categorizability (relevant) was very abstract. The other variable that loaded positively on Factor 1 is Wu–Barsalou taxonomic relatedness, suggesting that Toglia–Battig categorizability ratings reflect true categorizability.

Table 4
Loadings on the 10 Factors for the Variables That Were Measured for the 55 Roediger et al. (2001) Lists
[Table body: loadings on Factors 1–10 for the memory variables (F Recog, F Recall, T Recall), the Toglia–Battig variables (CD and TAR Fam, Mng, Con, Img, Cat, Noa, Pls), the ANEW variables (CD and TAR Arousal, Valence, Domn), the Wu–Barsalou variables (Synonymy, Antonymy, Taxonomy, Entity, Situation, Introspect), and the Roediger et al. (2001) variables (MBAS, MFAS, Connect, CD Lng, CD Freq, TAR Lng, TAR Freq).]
Note. F, false; T, true; Recog, recognition; CD, critical distractor; TAR, target; Fam, familiarity; Mng, meaningfulness; Con, concreteness; Img, imagery; Cat, categorizability; Noa, number of attributes; Pls, pleasantness; Domn, dominance; Entity, entity conceptual relations; Situation, situational conceptual relations; Introspect, introspective conceptual relations; Connect, connectivity; Lng, length; Freq, frequency. Factor loadings in bold are regarded as significant.

Turning to false memory, consistent with the notion that intrusions and false alarms involve the retrieval of gist traces of meaning content, false recall and false recognition both loaded on Factor 3, whose interpretation is also straightforward. It is a critical distractor familiarity/meaningfulness factor: The highest positive loading on this factor was for critical distractors’ Toglia–Battig familiarity, and there was also a high positive loading for critical distractors’ meaningfulness. The other variables that loaded positively on this factor were critical distractors’ number of attributes and MBAS. The loading for number of attributes is presumably due to the fact that it correlates strongly with meaningfulness in the Toglia–Battig norms (r = .75). The positive loading for MBAS is especially important, because it confirms that, against the backdrop of high
distractor–target semantic relatedness, interlist variability in distractor-to-target association has definite semantic correlates—specifically, critical distractors with stronger associations are also more familiar and more meaningful. To follow up these results, we reran the exploratory factor analysis twice, forcing it to extract only three or four factors. The purpose was to determine whether the variable loadings of the first and third factors changed drastically when only enough factors were extracted to include the theoretically important part of the factor structure. They did not. When either four factors or three factors were extracted, Factor 1 was still an imagery/concreteness
factor on which true recall loaded, and Factor 3 was still a familiarity/meaningfulness factor on which false recall, false recognition, and MBAS loaded. We mentioned earlier that the four problems with the associative interpretation of interlist variability in the DRM illusion would vanish if semantic variables predicted variability in the illusion, if the same variables predicted variability in MBAS, and if variables other than these predicted variability in true recall. This is how things have fallen out. At a more specific level, the picture that emerged from the factor analysis has four elements. First, false recall and false recognition are similar types of memory errors, inasmuch as they loaded on the same factor and only on that factor. Second, beyond high baseline semantic processing, interlist variability in the illusion is strongly and uniquely associated with variability in semantic properties of critical distractors—specifically, familiarity and meaningfulness. Third, backward association itself loads positively on the same familiarity/ meaningfulness factor. Fourth, all of this converges on the conclusion that interlist variability in the semantic properties of critical distractors—more particularly, their familiarity and meaningfulness—underlies interlist variability in intrusions and false alarms. Different patterns of factor loadings would be required to argue otherwise. Although the other seven factors in Table 4 are not related to DRM performance, we provide brief interpretations of them. Factor 2 is an emotional valence factor, because the ANEW valence scores of targets and critical distractors had high positive loadings on this factor, as did the Toglia–Battig pleasantness values of targets and distractors. Factor 4 is the target counterpart of Factor 3, because the Toglia–Battig meaningfulness and familiarity of targets, as well as their numbers of attributes, had high positive loadings on this factor. 
There are also positive loadings for critical distractors’ meaningfulness, number of attributes, and length. Continuing with this traditional method of interpreting factors in terms of their highest positive variable loadings, Factor 5 is a critical distractor frequency factor, Factor 6 is an emotional arousal factor, Factor 7 is a connectivity factor, Factor 8 is a taxonomic relations factor, Factor 9 is an MFAS factor, and Factor 10 is a critical distractor number of attributes factor.
Exploratory factor analysis of recognition-only data. Roediger et al.’s (2001) recognition data came from tests that were preceded by recall; findings might be different, therefore, when recognition tests are not preceded by recall. This criticism is mitigated by the fact that our factor analysis produced the same picture for false recall and false recognition and by Gallo and Roediger’s (2002) finding that interlist variability in false recognition was not affected by prior recall tests. However, it remains possible that the factor analytic picture might differ in important respects, if false recognition were measured without prior recall. Also, Roediger et al. did not include true recognition scores in their analyses (see note 3). Therefore, we conducted a further exploratory factor analysis with the recognition-only data described earlier, focusing on predictors of interlist variability in true and false recognition. As mentioned, for the 36 lists in Stadler et al. (1999), a subset of the 55 lists of Roediger et al., data were available from a sample of 190 subjects, who responded only to recognition tests, whereas data for the remaining 19 lists were available from a sample of 93 subjects, who responded only to recognition tests. The items on the old/new recognition tests that generated these data consisted of two targets from each presented list (the words in the sixth and ninth presentation positions), the critical distractors for these lists, and an equal number of unrelated distractors (critical distractors from unpresented lists). We first investigated relations between the hit and false alarm rates for the 36 Stadler et al. (1999) lists that Roediger et al. (2001) used and the hit and false alarm rates for these same lists from our data sets. (Relations between the two sets of data could not be investigated for the full set of 55 lists, because Roediger et al. did not report hit rates for the 19 lists not normed by Stadler et al.) We examined the relations between the two sets of false alarm rates to test Roediger et al.’s and Gallo and Roediger’s (2002) hypothesis that interlist variability on the recognition side of the DRM illusion is not seriously compromised by prior recall tests. The results confirmed that hypothesis. The mean false alarm rate for these 36 lists was .66 in Roediger et al. and .69 without prior recall tests. This difference was not reliable [t(35) = 0.86]. More important, like Gallo and Roediger, we found that the two sets of false alarm rates were highly correlated (r = .91), demonstrating that prior recall tests do not appreciably permute the ordering of interlist differences in false alarm rates. The effects of prior recall tests on hit rates were more pronounced. The mean hit rate for the 36 lists was .73 in the Stadler et al. norms, and .82 without prior recall tests, a highly reliable difference [t(35) = 4.68, p < .0001].
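The two kinds of comparison used here, a paired test of the mean difference across lists and a correlation of the list orderings, can be sketched with the Python standard library. The rates below are hypothetical values for five lists, chosen only to make the example run; they are not the study's data, and the function names are ours.

```python
import math
from statistics import mean, stdev


def paired_t(x, y):
    """Paired-samples t statistic and its degrees of freedom (n - 1)."""
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    return mean(diffs) / (stdev(diffs) / math.sqrt(n)), n - 1


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical false alarm rates for the same five lists under two procedures.
with_prior_recall = [0.60, 0.72, 0.55, 0.80, 0.66]
recognition_only = [0.64, 0.70, 0.61, 0.83, 0.69]

t_value, df = paired_t(recognition_only, with_prior_recall)
r_value = pearson_r(recognition_only, with_prior_recall)
```

The t statistic addresses whether the overall level differs between procedures, whereas the correlation addresses whether lists keep their relative ordering; the two can dissociate, which is why both are reported.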
This difference is most likely due to the fact that (1) Stadler et al.’s recognition tests involved twice as many DRM lists as ours, and (2) the elapsed time between list presentation and the recognition testing was much greater in Stadler et al.’s study than in ours. Furthermore, although the two sets of hit rates were correlated [r(35) = .39, p < .05], the correlation accounted for only 15% of the variance. Thus, unlike false alarm rates, the ordering of interlist differences in hit rates is substantially permuted by prior recall tests. We repeated the exploratory factor analysis, using hit and false alarm data that were not contaminated by prior recall. The results of this factor analysis are displayed in Table 5. A comparison of Table 5 with Table 4 reveals that, although the numerical values of the factor loadings differed somewhat, the picture was the same in four key respects: (1) Ten factors that accounted for more than 80% of the variance were extracted; (2) false recognition again loaded positively on a critical distractor familiarity/meaningfulness factor and on no other factor; (3) MBAS again loaded on the same familiarity/meaningfulness factor as false recognition; and (4) the first four factors, which accounted for 50% of the variance, were again an imagery/concreteness factor, an emotional valence factor, a critical distractor familiarity/meaningfulness factor, and a target familiarity/meaningfulness factor. In sum, whether false alarm data come from recognition tests that
Table 5
Loadings on the 10 Factors for the Variables That Were Measured for the Recognition-Only Data
[Table body: loadings on Factors 1–10 for the memory variables (Logit F Recog, Logit T Recog), the Toglia–Battig variables (CD and TAR Fam, Mng, Con, Img, Cat, Noa, Pls), the ANEW variables (CD and TAR Arousal, Valence, Domn), the Wu–Barsalou variables (Synonymy, Antonymy, Taxonomy, Entity, Situation, Introspect), and the Roediger et al. (2001) variables (MBAS, MFAS, Connect, CD Lng, CD Freq, TAR Lng, TAR Freq).]
Note. F, false; T, true; Recog, recognition; CD, critical distractor; TAR, target; Fam, familiarity; Mng, meaningfulness; Con, concreteness; Img, imagery; Cat, categorizability; Noa, number of attributes; Pls, pleasantness; Domn, dominance; Entity, entity conceptual relations; Situation, situational conceptual relations; Introspect, introspective conceptual relations; Connect, connectivity; Lng, length; Freq, frequency. Factor loadings in bold were regarded as significant.
were preceded by recall tests or not, factor analysis tells the same basic story about DRM false recognition. Relative to Table 4, there were three new results worthy of note. First, because in prior research true recognition has not been found to correlate with interlist variability in true recall, false recall, or false recognition, it should not load on either Factor 1 or Factor 3. It did not load on either of these factors, instead loading only on Factor 8, an antonym relations factor (i.e., in which the hit rate increases along with the number of antonym relations in a DRM list). Second, although MBAS again loaded positively on the critical distractor familiarity/meaningfulness factor, it also loaded negatively on the target familiarity/meaningfulness factor. Third, although the same variables
as before loaded on Factor 3 (false recognition, critical distractor familiarity, critical distractor meaningfulness, and MBAS), critical distractor length also loaded negatively on that factor and critical distractor frequency also loaded positively. Neither result is surprising: Negative correlations between DRM false memory and critical distractors’ length have been reported (Roediger et al., 2001), and words’ linguistic frequencies tend to be positively correlated with their familiarity ratings (Toglia & Battig, 1978).

Summary. First, consistent with the dual-process notion that true recall depends primarily on retrieving verbatim traces of targets, the first factor analysis showed that true recall loaded only on an imagery/concreteness factor. Second, consistent with the dual-process notion that true and
false memory involve dissociated representations, true and false recall loaded on different factors, as did true and false recognition. Third, consistent with the principle that DRM false memory is semantically based, false recall, false recognition, and MBAS all loaded on a critical distractor familiarity/meaningfulness factor. Fourth, the factor analyses suggest that interlist variability in the DRM illusion and in MBAS is explained by interlist differences in critical distractors’ familiarity/meaningfulness.

Semantic Properties of Backward Association

The relation between associative and semantic relatedness, as it has commonly been characterized (e.g., Anisfeld & Knapp, 1968; Grossman & Eagle, 1970; Thompson-Schill et al., 1998), is that associatively related words are semantically related, but most semantically related words are not associatively related. In that connection, the factor analyses of DRM lists showed that MBAS loaded on a familiarity/meaningfulness factor. As a further index of semantic consequences of DRM backward associations, we computed bivariate correlations between MBAS and the three sets of semantic variables (Toglia–Battig, ANEW, Wu–Barsalou). Several variables correlated reliably with interlist variability in MBAS. Using two-tailed tests and the .05 significance level, correlations for seven semantic variables exceeded the critical value of r for rejecting the null hypothesis of stochastic independence: critical distractor familiarity, critical distractor meaningfulness, critical distractor categorizability, critical distractor imagery, number of cue–target introspective relations, target meaningfulness, and target arousal. Thus, backward associations between DRM targets and critical distractors have extensive semantic correlates. We were uneasy about this conclusion, however.
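As a rough illustration of the significance criterion just described, the following Python sketch recovers the critical value of r for a two-tailed test at the .05 level. It assumes n = 55 lists (the number of Roediger et al., 2001, lists cited elsewhere in this article); the function names (t_pdf, two_tailed_p, critical_r) and the numerical integration scheme are illustrative choices, not the procedure used in the study.

```python
import math


def t_pdf(x: float, df: int) -> float:
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def two_tailed_p(t: float, df: int, steps: int = 2000) -> float:
    """Two-tailed p value for |t|, via Simpson's rule on [0, |t|] (steps must be even)."""
    t = abs(t)
    if t == 0.0:
        return 1.0
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 1.0 - 2.0 * (total * h / 3.0)


def r_to_t(r: float, n: int) -> float:
    """Convert a Pearson correlation based on n observations to a t statistic."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))


def critical_r(n: int, alpha: float = 0.05) -> float:
    """Smallest |r| that reaches two-tailed significance at alpha, found by bisection."""
    lo, hi = 0.0, 0.999
    for _ in range(60):
        mid = (lo + hi) / 2
        if two_tailed_p(r_to_t(mid, n), n - 2) > alpha:
            lo = mid  # not yet significant: the critical value is larger
        else:
            hi = mid
    return (lo + hi) / 2


# Assumed sample size: 55 DRM lists, hence df = 53.
print(f"critical r (n = 55, two-tailed .05): {critical_r(55):.3f}")
```

Under these assumptions, a correlation between MBAS and a semantic variable would need an absolute value of roughly .27 to be counted as reliable.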
Owing to the manner in which DRM lists are constructed, we were concerned that the DRM results do not provide a complete picture of the semantic correlates of backward associations. We were also concerned that the DRM results provide an inaccurate picture of the relation between backward and forward association. MBAS and MFAS are uncorrelated in the Roediger et al. (2001) lists, but the MFAS variability of these lists is near floor (Brainerd & Wright, 2005). To clarify these two points, we analyzed the more representative pool of 400 word pairs described earlier. The unpresented (target) and presented (cue) words in each pair were scored for the seven Toglia–Battig variables, the three ANEW variables, and the length and frequency variables of Roediger et al. (2001). The pairs were also scored for the six Wu–Barsalou variables, and the BAS and FAS values for each pair, from the Nelson et al. (1999) norms, were recorded. The objectives were to determine the semantic consequences of variations in BAS and the actual relation between BAS and FAS. The results are shown in Table 6, which contains mean values of all the variables, as functions of the four levels of BAS. The variables are blocked in the same manner as in Tables 4 and 5, and we report the results in that order.

Toglia–Battig variables. Mean values of the Toglia–Battig variables are reported for presented words (cues)
and their unpresented backward associates (targets). For each of the rows in Table 6, we computed an omnibus F test to determine whether that semantic property varied as a function of BAS. Those tests appear in the next-to-last column. In the last column, we report the mean value of each of the Toglia–Battig variables, from the norms. These data demonstrate two things. First, the semantic properties that Toglia and Battig (1978) measured vary substantially as a function of BAS level; second, the normed values of these properties are usually lower than the corresponding values for words whose BAS levels are very strong, strong, or moderate. Taking unpresented words first, the F test was reliable for all Toglia–Battig variables; unpresented words’ levels of familiarity, meaningfulness, concreteness, imagery, categorizability, number of attributes, and pleasantness all varied with BAS. Whether the pattern of variation was only monotonic, or whether nonmonotonicity was also present, differed from variable to variable. For all seven, it can be seen that there was an overall monotonic-decreasing trend: The variables’ mean values were higher for very strong BAS (grand mean 5.11) than for weak BAS (grand mean 4.35). Also, these variables’ mean values from the norms (grand mean 4.36) were roughly the same as the means for weak BAS pairs, whereas the means for very strong, strong, and moderate BAS were higher than the mean values from the norms. Turning to presented words, the picture is one of reliable but less extensive semantic effects. The mean values of five of the seven variables (familiarity, meaningfulness, concreteness, imagery, and categorizability) differed reliably as a function of BAS. The pattern of variation for the last three variables was overall decline from very strong BAS (grand mean 5.04) to weak BAS (grand mean 4.30), with the values for weak BAS being roughly the same as the values in the norms (grand mean 4.43).
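The omnibus tests reported in Table 6 are presumably one-way analyses of variance across the four BAS levels. A minimal Python sketch of that computation follows. It assumes 100 word pairs per BAS level (which would yield the reported degrees of freedom, 3 and 396) and uses simulated ratings, because the item-level data are not reproduced here; the function name one_way_anova_f and the simulated group means are hypothetical.

```python
import random
from statistics import fmean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way between-groups ANOVA."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = fmean([x for g in groups for x in g])
    group_means = [fmean(g) for g in groups]
    # Variability of group means around the grand mean, weighted by group size
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Variability of individual scores around their own group mean
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)
    df_between, df_within = k - 1, n_total - k
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


# Simulated ratings (hypothetical): 100 word pairs at each of four BAS levels,
# ordered very strong, strong, moderate, weak.
rng = random.Random(0)
assumed_means = [5.1, 5.0, 4.7, 4.4]
simulated = [[rng.gauss(mu, 1.0) for _ in range(100)] for mu in assumed_means]

f_sim, df1, df2 = one_way_anova_f(simulated)
print(f"F({df1},{df2}) = {f_sim:.2f}")
```

With four groups of 100 items the test has 3 and 396 degrees of freedom, matching the F(3,396) column heading of Table 6; the F value printed for the simulated data has no empirical meaning.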
In contrast, the pattern for meaningfulness and familiarity was nonmonotonic. Thus, variability in the semantic content of both presented and unpresented words is strongly related to BAS variation.

Table 6
Semantic Properties of Backward Associations Between Word Pairs From the Nelson et al. (1999) Norms

                                Backward Associative Strength
Variable                        Very Strong   Strong   Moderate   Weak     F(3,396)     Norms

Toglia–Battig Variables
  Unpresented
    Familiarity                 6.41          6.35     6.32       4.55     5.90**       5.59
    Meaningfulness              4.82          4.72     4.80       4.55     7.83****     4.03
    Concreteness                5.25          5.11     4.54       4.30     22.87****    4.40
    Imagery                     5.42          5.34     4.93       4.62     26.61****    4.55
    Categorizability            5.33          5.17     4.68       4.46     26.26****    4.33
    Number of attributes        4.10          4.05     4.22       3.91     4.32**       3.56
    Pleasantness                4.42          4.39     4.40       4.08     4.50**       4.01
  Presented
    Familiarity                 6.04          6.11     5.99       6.14     6.25****     5.59
    Meaningfulness              4.37          4.49     4.27       4.34     7.15****     4.03
    Concreteness                4.96          5.07     4.72       4.14     29.16****    4.40
    Imagery                     5.14          5.28     4.99       4.52     27.82****    4.55
    Categorizability            5.03          5.19     4.84       4.23     38.34****    4.33
    Number of attributes        3.74          3.81     3.72       3.72     1.01         3.56
    Pleasantness                4.26          4.14     4.18       4.21     < 1          4.01

ANEW Variables
  Unpresented
    Arousal                     5.02          4.72     5.01       5.18     7.78****     5.16
    Valence                     5.62          6.00     5.99       5.73     3.40*        5.12
    Dominance                   5.06          5.00     5.36       5.30     8.52****     5.02
  Presented
    Arousal                     5.28          5.15     5.22       5.18     1.36         5.12
    Valence                     5.36          5.37     4.58       6.25     52.37****    5.16
    Dominance                   5.02          4.96     4.55       5.62     58.37****    5.02

Wu–Barsalou Variables
  Synonymy                      .24           .23      .44        .16      7.97****
  Antonymy                      .14           .13      .02        .06      4.08*
  Taxonomy                      .33           .32      .28        .16      3.11*
  Entity                        .19           .26      .24        .23      < 1
  Situational                   .31           .23      .26        .45      5.26**
  Introspection                 .07           .05      .19        .32      13.13****

Roediger et al. (2001) Variables
  Forward associative strength  .22           .09      .05        .01      32.25****
  Unpresented
    Length                      4.93          5.30     5.16       5.06     1.04
    Frequency                   207.47        145.65   168.04     161.90   < 1
  Presented
    Length                      5.56          5.69     6.16       5.65     2.56
    Frequency                   92.01         71.01    37.58      130.01   1.90

*p < .01. **p < .005. ****p < .0001.

ANEW variables. Emotional content also covaries with BAS. The mean values of arousal, valence, and dominance are reported for presented words and unpresented associates in Table 6. For each of the six rows of data, we computed an omnibus F test to determine whether that particular property varied as a function of BAS. Those tests appear in the next-to-last column. In the last column, we report the mean value of each variable, from the Bradley and Lang (1999) norms. These data show, first, that the emotional properties of words vary reliably as a function of BAS, and, second, that normed values of these properties are usually higher than the corresponding values for words whose BAS levels are very strong, strong, or moderate. Taking unpresented words first, the F ratios for all emotional content variables were reliable. All three effects were monotonic-decreasing: The mean values for weak BAS (grand mean 6.57) are higher than the mean values for very strong BAS (grand mean 5.23), and the mean values for weak BAS are above the corresponding normed values (grand mean 5.10). Thus, feelings of calmness, happiness, and submission all increase as BAS increases. Turning to presented words, the F ratios for two of the emotional content variables, valence and dominance, were reliable. In both instances, the pattern of change is the monotonic-decreasing trend that was noted for unpresented words: The mean values of valence and dominance for weak BAS (grand mean 5.94) are higher than the mean values for very strong BAS (grand mean 5.19), and they are also higher than the corresponding normed values (grand mean 5.09). Thus, interestingly, feelings of happiness and submission increase along with presented words’ levels of backward association to unpresented words.

Wu–Barsalou variables. Moving to the Wu–Barsalou variables, semantic relations between presented and unpresented words are also affected by BAS variation. The mean probability that cue–target pairs are linked by each of the six semantic relations is reported by BAS level in Table 6, and we computed an omnibus F test for each relation. Those tests appear in the next-to-last column of Table 6. It can be seen that five of the six semantic relations varied reliably as a function of BAS level. Three of the relations (antonymy, synonymy, and taxonomy) tended to increase as BAS increased, and the mean probability that the relation was present was lower for weak BAS (grand mean .13) than for very strong BAS (grand mean .24). The other two relations that varied reliably, situational and introspective, displayed the pattern for the emotional content variables that we have just noted. The mean probability that the relation was present was higher for weak BAS (grand mean .39) than for very strong BAS (grand mean .19). Considering the findings for the emotional content variables, the pattern for introspective
relations is not surprising, because in the Wu–Barsalou taxonomy emotional relations between words are common examples of introspective relations. Also, situational relations often involve actions, circumstances, and participants with emotional connotations. FAS, length, and frequency. The mean values for the remaining variables—FAS, length of presented and unpresented words, and frequency of presented and unpresented words—appear at the bottom of Table 6. For each, we computed an omnibus F test to determine whether that variable varied reliably as a function of BAS. Those tests appear in the next-to-last column of Table 6. The findings were simple: Although neither length nor frequency of presented or unpresented words was related to BAS, FAS was strongly related to BAS. The FAS value for very strong BAS pairs (.20) is more than twice that for strong BAS pairs (.09), which in turn is more than twice that for the combined moderate and weak pairs (grand mean .03). Thus, the lack of variability in MFAS in the DRM list pool masks a strong positive relation between BAS and FAS. Summary. The data in Table 6 flesh out the principle (e.g., Grossman & Eagle, 1970) that word associations have semantic consequences. With respect to the point of particular interest, the semantic effects of BAS on unpresented words, we now know the nature of those effects for 16 semantic variables—the seven Toglia–Battig dimensions, the three ANEW dimensions, and the six Wu–Barsalou relations. Earlier, we saw with DRM lists that critical distractor familiarity, critical distractor categorizability, critical distractor meaningfulness, critical distractor imagery, number of distractor–target introspective relations, target meaningfulness, and target arousal all vary reliably as MBAS varies. Analysis of this additional sample of word pairs produced three general findings. 
First, all seven of the semantic factors that correlated with the MBAS of DRM lists also varied reliably with the strength of association from presented to unpresented words. In addition, many other semantic factors covaried with BAS. Thus, as suspected, the picture that DRM lists provide of the types of semantic processing that result from backward associations is incomplete. Second, for unpresented words, the semantic consequences of variations in BAS spanned virtually all of the semantic variables that we measured. Of the 16 variables, 15 varied reliably as a function of BAS. Third, by combining the Toglia–Battig pleasantness values for unpresented words with those words’ arousal, valence, and dominance values, an interesting picture of the affective consequences of BAS emerges. As BAS increases, unpresented words provoke increased feelings of pleasantness, calmness, and submissiveness. In a word, the affective consequences of increased levels of BAS are blissful. DISCUSSION How shall we interpret the classic datum that backward associations of presented to unpresented words correlate positively with false memory for unpresented words, in the DRM paradigm and various other tasks (e.g., Dewhurst, 2001; Underwood, 1965)? The easy answer is
that such errors are caused by automatic priming of unpresented words as their associates are recalled (Deese, 1959), studied (Underwood, 1965), or both (Howe, 2006; Roediger et al., 2001). We saw, however, that the easy answer is fraught with difficulties, the most important of which is that certain empirical findings disconfirm its most obvious predictions. This situation motivated the present investigation. The results have implications for some core theoretical issues, most notably semantic and associative processing in false memory, dual-process conceptions of false memory, and the semantic properties of word associations. We briefly consider these issues to conclude this article.

Semantic and Associative Processing in False Memory

Semantic explanation of high baseline false memory. At a purely descriptive level, the DRM procedure and certain other false-memory paradigms are self-evidently associative tasks because, by design, there are forward and backward associations between presented and unpresented words. Surely the most seductive feature of such associative relations is that they are objective and quantifiable. Numerical values can be extracted from word norms (Nelson et al., 1999; Russell & Jenkins, 1954) that provide standardized indexes of target-to-distractor association (BAS), distractor-to-target association (FAS), and target-to-target association (connectivity). To move beyond description and treat those values as explanations of false memory, however, requires two theoretical leaps: the assumption that the values also measure automatic target priming of unpresented associates as subjects study lists or respond to memory tests, and the assumption that such priming is what causes the elevations in intrusion and false alarm rates that have been detected in false-memory experiments.
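To make the normed indexes concrete, the following Python sketch shows how mean backward and forward associative strength (MBAS and MFAS) could be computed for one list from a table of association norms. The word list, the association probabilities, and the function names (strength, mean_bas, mean_fas) are hypothetical and serve only as an illustration; they are not values from the Nelson et al. (1999) norms.

```python
from statistics import fmean

# Hypothetical norms: norms[cue][response] is the proportion of respondents who
# produce `response` as their first associate of `cue`. All values are invented.
norms = {
    "bed":   {"sleep": 0.60},
    "rest":  {"sleep": 0.40},
    "tired": {"sleep": 0.50},
    "dream": {"sleep": 0.30},
    "sleep": {"bed": 0.10, "rest": 0.02, "tired": 0.04, "dream": 0.08},
}


def strength(norms, cue, response):
    """Associative strength from cue to response (0.0 if the pair is not normed)."""
    return norms.get(cue, {}).get(response, 0.0)


def mean_bas(norms, targets, distractor):
    """MBAS: mean target-to-distractor strength across the studied list."""
    return fmean([strength(norms, t, distractor) for t in targets])


def mean_fas(norms, targets, distractor):
    """MFAS: mean distractor-to-target strength across the studied list."""
    return fmean([strength(norms, distractor, t) for t in targets])


studied = ["bed", "rest", "tired", "dream"]   # presented words (targets)
lure = "sleep"                                # unpresented critical distractor

mbas = mean_bas(norms, studied, lure)
mfas = mean_fas(norms, studied, lure)
print(f"MBAS = {mbas:.2f}, MFAS = {mfas:.2f}")
```

The sketch only describes the lists; treating MBAS as a measure of automatic priming is a separate theoretical step that the indexes themselves do not license.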
According to the first assumption, which has rarely been stated in explicit terms, the statistical frequency with which word A provokes conscious reports of B under conditions of intentional association (i.e., on a word-association test) is monotonically related to the tendency of A to prime B automatically under conditions of incidental association (i.e., the study or test phase of a memory experiment). Empirical support for this assumption is weak. In lexical decision experiments, some investigators (e.g., Cotel et al., 2008; Tse & Neely, 2005) have detected automatic priming of DRM critical distractors, but others have not (McKone, 2004; Zeelenberg & Pecher, 2002). According to the second assumption, automatic priming is what causes false recall/recognition of unpresented words, but that notion conflicts with two findings. First, lexical decision research shows that priming is transitory, which seems to rule out study-phase priming as the cause of intrusions and false alarms, because error rates remain elevated for weeks after lists are studied. Second, true recall is not positively correlated with intrusions, and false recognition is not increased by prior recognition tests for target associates, which seems to rule out test-phase priming as the cause of errors. Both findings can be accommodated, however, if semantic processing is at the bottom of false-memory responses to unpresented associates.
In that connection, it has often been noted that associatively related words are necessarily semantically related (e.g., Grossman & Eagle, 1970; Thompson-Schill et al., 1998). However, that observation has traditionally suffered from underspecification, inasmuch as it has not been backed up with objective, quantitative data about the precise semantic components of word associations. In the present study, we removed that limitation by conducting a quantitative analysis of the semantic properties of DRM materials and the semantic properties of a large sample of word pairs that varied in degree of BAS. Our results were simple: DRM materials proved to be unusually rich in meaning content, which should provoke intense semantic processing, and the same was true of more representative samples of associated word pairs. When it comes to explaining the DRM illusion, two aspects of the semantic analysis are of special significance. First, the paradigm’s semantic profile, which consists of observed values of three sets of variables (Table 2), shows that DRM materials are dense with distractor–target meaning connections, as well as target–target meaning connections. For instance, when distractor–target pairs are scored for six semantic relations (antonymy, entity, introspective, situational, synonymy, and taxonomy), virtually every pair in the 55 Roediger et al. (2001) lists displays one of the relations. Furthermore, critical distractors and targets have high mean values (more than 0.75 SDs above the corresponding normed means) on two dimensions from the Toglia and Battig (1978) norms that measure semantic relatedness, categorizability and meaningfulness. The 55 lists’ mean levels of target–target connectivity are also high, a measure that McEvoy, Nelson, and Komatsu (1999) proposed as an index of the strengths of the gist memories that subjects store when studying these lists.
Finally, critical distractors and targets have high mean ratings (more than 0.6 SDs above the corresponding normed means) on word-level semantic properties (concreteness, familiarity, imagery, number of attributes, and pleasantness). In sum, results of objective metrics of meaning content converged on the conclusion that semantic processing is extensive with DRM materials. This is a key consideration when it comes to explaining why baseline false memory levels are so high for these lists and why false memory is so long-lived. The other aspect of the semantic analysis that is significant for the DRM illusion is the ability of semantic variables (familiarity and meaningfulness, in particular) to explain interlist variability in the illusion. Within the context of high baseline false memory, interlist variability in false recall and interlist variability in false recognition were both tied to variability in these semantic dimensions. The factor analyses revealed that intrusions and false alarms both loaded on a critical distractor familiarity/meaningfulness factor and on no other factor. Importantly, MBAS also loaded on this factor and on no other factor, thereby explaining why it correlates with false recall/recognition: Beyond the high baseline levels of familiarity and meaningfulness for DRM lists, increases in MBAS are strongly associated with further increases in critical distractors’ familiarity and meaningfulness. It does not seem controversial, judging by everything that is known about semantic false
memory, that error rates should be higher for items that are extremely familiar and meaningful. The suggestion, made on the basis of the factor analysis, that words that enter into strong associative relations tend to be highly familiar and meaningful was confirmed in the supplementary analysis of associated word pairs. If we follow this suggestion to its logical conclusion and treat interlist variability in the DRM illusion as interlist variability in familiarity/meaningfulness, it is important to emphasize that it is high-end variability that produces this effect. The average familiarity and meaningfulness values of the 55 DRM critical distractors are well above the corresponding averages for the 2,854 words in Toglia and Battig’s (1978) sample. If these distractors are ordered with respect to their mean false alarm rates, the average familiarity and meaningfulness of the highest 27 words are roughly three quarters of an SD above the Toglia–Battig means, whereas the average familiarity and meaningfulness of the lowest 28 words are roughly half an SD above the Toglia–Battig means. In short, the familiarity/meaningfulness differences that lead to higher versus lower false-memory rates for DRM lists could be characterized as differences between words that are quite familiar and meaningful versus words that are exceptionally familiar and meaningful.

Resolving a theoretical uncertainty. In the current literature, there are two general views of the processes that foment errors on the DRM task and other tasks that involve false memory for words. According to one, which is usually identified with Roediger et al.’s (2001) activation/monitoring theory, those processes are associative (see Gallo, 2006), whereas according to the other, usually identified with fuzzy-trace theory (Reyna & Brainerd, 1995), those processes are semantic.
To explain memory errors, both views posit that studying or remembering target words causes the content of distractors, in addition to the content of the targets themselves, to be processed. However, there are differences in the types of content that are thought to be processed as a consequence of associative versus semantic operations (see Gallo, 2006), and those differences have been stressed in the recent literature (for a review, see Brainerd, Reyna, & Ceci, 2008). For instance: “[A]re critical lures falsely recalled and recognized because they share semantic features with list members or because they share varying degrees of associative strength with different list members?” (Howe, 2006, p. 1112). When processing is associative, the key result of studying or remembering a target word (e.g., nurse) is that it automatically primes the lexical entry of a specific unpresented word (e.g., doctor) (Howe, 2006). When processing is semantic, however, the key result of studying or remembering a target word (e.g., nurse) is that it activates concepts in which presented and unpresented words participate (e.g., medicine) or features that presented and unpresented words share (e.g., health care provider). Lampinen, Leding, Reed, and Odegard (2006) remarked that the literature contains data that seem to support both of these conceptions. For instance, the associative view is favored by the positive correlation between MBAS and false recall/recognition and by Hutchison and
Balota’s (2005) finding that errors do not decline when DRM lists’ thematic coherence is reduced (if MBAS is held constant). On the other hand, the semantic view is favored by the findings of Toglia et al. (1999) and others that errors increase when subjects are instructed to process targets’ meanings as they study or remember words and that thematic blocking of DRM lists produces more errors than random presentation does, when MBAS is constant. Now that objective information about the semantic properties of DRM materials is in hand, the dispute over whether associative or semantic processing is the basis for the illusion can be directly addressed. The data showed that semantic processing is necessarily intense, because word associations have extensive semantic consequences. Three findings demonstrate this point. First, DRM critical distractors have high scores on properties such as categorizability, concreteness, familiarity, imagery, meaningfulness, number of attributes, and pleasantness, so that priming any critical distractor at any time under any condition activates semantic content to a high degree. Second, because it is the priming of critical distractors by targets that is of principal interest, it is crucial to note that, with only rare exceptions, the words in DRM distractor–target pairs are linked by one of six conceptual relations (antonymy, entity, introspectiveness, situation, synonymy, taxonomy). Third, the standard assumption (e.g., Hutchison & Balota, 2005) is that increases in BAS indicate increased target priming of critical distractors’ lexical entries. However, analysis of a large pool of associated word pairs revealed that increases in BAS also result in increased semantic processing, because numerous semantic properties of words were found to covary with BAS. A useful by-product of all this is that it removes a strategic criticism of the DRM illusion discussed by Brainerd and Wright (2005).
These authors noted that, to the extent that the DRM illusion is not semantically based, it may be misleading as a device for advancing scientific understanding of false memory (for related arguments, see Freyd & Gleaves, 1996). The reason is that the signature paradigms of the false memory literature are designed to emulate everyday circumstances, in which false memories have serious legal or medical ramifications, and they do not involve word lists. Consequently, automatic associative priming cannot be the basis for errors. In those paradigms, the target material typically consists of meaningful events from everyday life (presented as narratives, videos, or live event sequences), with falsely remembered items being ones that preserve the meaning content of such experiences (for a review, see Brainerd & Reyna, 2005).6 Prominent examples are schematic memory tasks (e.g., Lampinen, Copeland, & Neuschatz, 2001), misinformation tasks (e.g., Pezdek, Finger, & Hodge, 1997), narrative comprehension tasks (e.g., Reyna & Kiernan, 1994, 1995), and eyewitness identification tasks (e.g., Wells et al., 1998). Given the present results, DRM experiments are highly relevant to the larger enterprise of understanding false memory, because this paradigm shares the central feature of everyday false memory: semantic relatedness.
Dual-Process Models

Although there are local theoretical hypotheses associated with individual paradigms in the false-memory literature, dual-process models provide a general perspective on all of them (Brainerd & Reyna, 2005). The core principle is that there are dissociated storage and retrieval mechanisms that underlie true and false memory, respectively. The mechanisms that underlie true memory are most commonly thought to involve memory for targets’ surface forms, whereas the mechanisms that underlie false memory are most commonly thought to involve memory for targets’ meaning content. A key prediction that has been investigated with many false-memory paradigms (e.g., Holliday & Hayes, 2000, 2001, 2002; Lampinen et al., 2001; Lampinen et al., 2006; Odegard, Holliday, Brainerd, & Reyna, 2008; Powell, Roberts, Ceci, & Hembrooke, 1999; Reyna & Kiernan, 1994, 1995; Seamon, Luo, Kopecky, et al., 2002; Seamon, Luo, Schwartz, et al., 2002; Toglia et al., 1999) is that manipulations that selectively affect surface processing or semantic processing can drive true and false memory in opposite directions. Judging by Deese’s (1959) work, the DRM illusion seemed to be an exception to the dual-process framework. If the illusion is simply due to associative priming of critical distractors by targets, only one process is involved, as, for instance, Howe (2008b) has shown. However, Roediger et al.’s (2001) important finding that MBAS correlates positively with interlist variability in the illusion, but that true recall correlates negatively, is consistent with the dual-process view. Beyond this, the present study produced a series of results supporting the notion that the DRM task is not an exception to the rule that dissociated mechanisms underlie true and false memory.
Three results, in particular, should be mentioned: (1) true recall, true recognition, and false recall/recognition loaded on different factors; (2) true recall loaded on an imagery/concreteness factor, whereas true recognition loaded on an antonymy factor; (3) false recall and false recognition both loaded on a familiarity/meaningfulness factor.

Semantic Properties of Word Association

What are the semantic consequences of backward associations, from presented to unpresented words? Historically, the default response has been not to study this question in detail but instead to assume that associative priming of unpresented words is adequate in itself to explain false recall/recognition. We now know that this assumption is difficult to reconcile with various findings, so that the quest for detailed data on the semantic properties of backward associations has become an urgent priority. Our study produced two pertinent sets of results, one for the DRM materials of Roediger et al. (2001), and the other for a pool of associated word pairs from the Nelson et al. (1999) norms. With respect to DRM materials, we found that five of the semantic variables measured for critical distractors increased reliably along with MBAS: familiarity, meaningfulness, categorizability, imagery, and number of cue–target introspective relations. In addition, two of the semantic variables measured for targets increased reliably along with MBAS: meaningfulness and arousal. However,
we suspected that this might be an incomplete picture, which led us to study this question with a large sample of word pairs from the Nelson et al. (1999) norms. The results confirmed our suspicion, inasmuch as they showed that the semantic consequences of backward associations were far more extensive for both unpresented and presented words. Of the semantic properties measured for unpresented words, increases in BAS were associated with (1) increases in categorizability, concreteness, familiarity, imagery, meaningfulness, number of attributes, and pleasantness; (2) decreases in arousal and dominance; and (3) increases in synonymy, antonymy, and taxonomy, coupled with decreases in introspective and situational relations. Of the properties measured for presented words, increases in BAS were associated with (1) increases in concreteness, imagery, and categorizability; (2) decreases in valence and dominance; and (3) increases in synonymy, antonymy, and taxonomy, coupled with decreases in introspective and situational relations. Two conclusions of theoretical significance emerge from these findings. First, variability in BAS means variability in semantic processing; but, more importantly, this principle is neither vague nor devoid of specifics. Although backward associations from presented to unpresented words no doubt have other semantic consequences, it is now known that 14 specific semantic properties of unpresented words vary monotonically with BAS, with 11 varying directly and 3 varying inversely. Likewise, it is now known that 10 semantic properties of presented words vary monotonically with BAS, with 6 varying directly and 4 varying inversely. Second, several manipulations have been studied that elevate false memory in the DRM paradigm by focusing subjects’ attention on the meaning content of materials, while holding MBAS constant (for a review, see Brainerd et al., 2008).
Examples include cuing semantic properties at study, cuing semantic properties at test, blocking semantically related targets together on study lists, performing semantic encoding tasks, and varying emotional valence at study. The fact that backward association has profound semantic consequences means that it, too, is an exemplar of this class of manipulations.

Conclusion

Hitherto, little was known about the semantic properties of false-memory tasks in which unpresented words are associates of presented words. Much more is now known, and such knowledge clears the way for a rapprochement between semantic and associative explanations of this class of illusions. An obvious line of research that would advance that objective consists of experiments in which the semantic properties that load on the same factor as DRM false memory and MBAS (familiarity and meaningfulness) are manipulated, with MBAS controlled. The factor analytic results suggest that these properties contribute to interlist variability in DRM false memory, but the results are only correlational. If the properties actually affect false memory, lists with the same MBAS ought to produce higher levels of error when their critical distractors have higher familiarity ratings (or higher meaningfulness ratings). Another line of research that would advance the
same objective consists of experiments isolating specific semantic properties that foment errors. Although a large set of properties that covary with BAS has been identified, other considerations suggest that only a subset of them may increase false memory. Although it is impossible to vary the strength of word associations without also varying semantic content, association can be varied so that only certain semantic properties are affected (for an early example of such research, see Grossman & Eagle, 1970). Using this methodology, it should be possible to sort the semantic covariates of word association into those that do and do not affect the incidence of false memory.

AUTHOR NOTE

The present article was supported, in part, by grants from the National Institutes of Health (MH-061211) and the National Science Foundation (BCS 0553225) to C.J.B. and V.F.R., a grant from the National Cancer Institute (R13CA126359) to V.F.R., and a grant from the Economic and Social Research Council U.K. (RES-062-23-0452) to M.L.H. We thank M. P. Toglia for providing us with an updated, searchable version of Toglia and Battig’s (1978) semantic word norms. Correspondence concerning this article should be addressed to C. J. Brainerd, Department of Human Development, Cornell University, Ithaca, NY 14850 (e-mail:
[email protected]).

REFERENCES

Anisfeld, M., & Knapp, M. (1968). Association, synonymity, and directionality in false recognition. Journal of Experimental Psychology, 77, 171-179. Armitage, P., Berry, G., & Matthews, J. N. S. (2002). Statistical methods in medical research (4th ed.). Oxford: Blackwell. Barnhardt, T. M., Choi, H., Gerkens, D. R., & Smith, S. M. (2006). Output position and word relatedness effects in a DRM paradigm: Support for a dual-retrieval process theory of free recall and false memories. Journal of Memory & Language, 55, 213-231. Bartlett, F. C. (1932). Remembering. Cambridge: Cambridge University Press. Bjorklund, D. F., & Muir, J. E. (1988). Children’s development of free recall memory: Remembering on their own. Annals of Child Development, 5, 79-123. Bradley, M. M., & Lang, P. J. (1999). Affective norms for English words (ANEW): Instruction manual and affective ratings (Tech. Rep. C-1). Gainesville, FL: University of Florida, Center for Research in Psychophysiology. Brainerd, C. J., Forrest, T. J., Karibian, D., & Reyna, V. F. (2006). Development of the false-memory illusion. Developmental Psychology, 42, 962-979. Brainerd, C. J., Payne, D. G., Wright, R., & Reyna, V. F. (2003). Phantom recall. Journal of Memory & Language, 48, 445-467. Brainerd, C. J., & Reyna, V. F. (2005). The science of false memory. New York: Oxford University Press. Brainerd, C. J., Reyna, V. F., & Ceci, S. J. (2008). Developmental reversals in false memory: A review of data and theory. Psychological Bulletin, 134, 343-382. Brainerd, C. J., Reyna, V. F., & Mojardin, A. H. (1999). Conjoint recognition. Psychological Review, 106, 160-179. Brainerd, C. J., Reyna, V. F., Wright, R., & Mojardin, A. H. (2003). Recollection rejection: False-memory editing in children and adults. Psychological Review, 110, 762-784. Brainerd, C. J., & Wright, R. (2005). Forward associative strength, backward associative strength, and the false memory illusion. 
Journal of Experimental Psychology: Learning, Memory, & Cognition, 31, 554-567. Brainerd, C. J., Wright, R., Reyna, V. F., & Payne, D. G. (2002). Dual-retrieval processes in free and associative recall. Journal of Memory & Language, 46, 120-152. Budson, A. E., Todman, R. W., Chong, H., Adams, E. H., Kensinger, E. A., Krangel, T. S., & Wright, C. I. (2006). False recognition of emotional word lists in aging and Alzheimer disease. Cognitive & Behavioral Neurology, 19, 71-78.
Cann, D. R., McRae, K., & Katz, A. N. (2006, November). Knowledge types underlying false recall in the Deese/Roediger–McDermott paradigm. Paper presented at the 47th Annual Meeting of the Psychonomic Society, Houston. Carneiro, P., Fernandez, A., & Dias, A. R. (2008). The influence of theme identifiability on false memories: Evidence for age-dependent opposite effects. Manuscript submitted for publication. Corson, Y., & Verrier, N. (2007). Emotion and false memories: Valence or arousal? Psychological Science, 18, 208-211. Cotel, S. C., Gallo, D. A., & Seamon, J. G. (2008). Nonconscious activation causes false memories: Experimental control of conscious processes in the Deese, Roediger, and McDermott task. Consciousness & Cognition, 17, 210-218. Craik, F. I. M., & Lockhart, R. S. (1972). Levels of processing: A framework for memory research. Journal of Verbal Learning & Verbal Behavior, 11, 671-684. Dannenbring, G. L., & Briand, K. (1982). Semantic priming and the word repetition effect in a lexical decision task. Canadian Journal of Psychology, 36, 435-444. Deese, J. (1959). On the prediction of occurrence of certain verbal intrusions in free recall. Journal of Experimental Psychology, 58, 17-22. Dewhurst, S. A. (2001). Category repetition and false recognition: Effects of instance frequency and category size. Journal of Memory & Language, 44, 153-167. Fabrigar, L. R., Wegener, D. T., MacCallum, R. C., & Strahan, E. J. (1999). Evaluating the use of exploratory factor analysis in psychological research. Psychological Methods, 4, 272-299. Fillenbaum, S. (1969). Words as feature complexes: False recognition of antonyms and synonyms. Journal of Experimental Psychology, 82, 400-402. Freyd, J. J., & Gleaves, D. H. (1996). “Remembering” words not presented in lists: Relevance to the current recovered/false memory controversy. Journal of Experimental Psychology: Learning, Memory, & Cognition, 22, 811-813. Gallo, D. A. (2006). 
Associative illusions of memory: False memory research in DRM and related tasks. New York: Psychology Press. Gallo, D. A., Roberts, M. J., & Seamon, J. G. (1997). Remembering words not presented in lists: Can we avoid creating false memories? Psychonomic Bulletin & Review, 4, 271-276. Gallo, D. A., & Roediger, H. L., III (2002). Variability among word lists in eliciting memory illusions: Evidence for associative activation and monitoring. Journal of Memory & Language, 47, 469-497. Gillund, G., & Shiffrin, R. M. (1984). A retrieval model for both recognition and recall. Psychological Review, 91, 1-67. Grossman, L., & Eagle, M. (1970). Synonymity, antonymity, and association in false recognition responses. Journal of Experimental Psychology, 83, 244-248. Gunter, R. W., Ivanko, S. L., & Bodner, G. E. (2005). Can test list context manipulations improve recognition accuracy in the DRM paradigm? Memory, 13, 862-873. Holliday, R. E., & Hayes, B. K. (2000). Dissociating automatic and intentional processes in children’s eyewitness memory. Journal of Experimental Child Psychology, 75, 1-42. Holliday, R. E., & Hayes, B. K. (2001). Automatic and intentional processes in children’s eyewitness suggestibility. Cognitive Development, 16, 617-636. Holliday, R. E., & Hayes, B. K. (2002). Automatic and intentional processes in children’s recognition memory: The reversed misinformation effect. Applied Cognitive Psychology, 16, 1-16. Holliday, R. E., & Weekes, B. S. (2006). Dissociated developmental trajectories for semantic and phonological false memories. Memory, 14, 624-636. Hovland, C. I. (1951). Human learning and retention. In S. S. Stevens (Ed.), Handbook of experimental psychology (pp. 613-689). New York: Wiley. Howe, M. L. (2006). Developmentally invariant dissociations in children’s true and false memories: Not all relatedness is created equal. Child Development, 77, 1112-1123. Howe, M. L. (2007). Children’s emotional false memories. Psychological Science, 18, 856-860. 
Howe, M. L. (2008a). Visual distinctiveness and the development of children’s false memories. Child Development, 79, 65-79. Howe, M. L. (2008b). What is false memory development the development of? Comment on Brainerd, Reyna, and Ceci (2008). Psychological Bulletin, 134, 768-772. Hutchison, K. A., & Balota, D. A. (2005). Decoupling semantic and associative information in false memories: Explorations with semantically ambiguous and unambiguous critical lures. Journal of Memory & Language, 52, 1-28. Joordens, S., & Besner, D. (1992). Priming effects that span an intervening unrelated word: Implications for models of memory representation and retrieval. Journal of Experimental Psychology: Learning, Memory, & Cognition, 18, 483-491. Kensinger, E. A. (2004). Remembering emotional experiences: The contribution of valence and arousal. Reviews in the Neurosciences, 15, 241-251. Kintsch, W., Welsch, D., Schmalhofer, F., & Zimny, S. (1990). Sentence memory: A theoretical analysis. Journal of Memory & Language, 29, 133-159. Lampinen, J. M., Copeland, S. M., & Neuschatz, J. S. (2001). Recollections of things schematic: Room schemas revisited. Journal of Experimental Psychology: Learning, Memory, & Cognition, 27, 1211-1222. Lampinen, J. M., Leding, J. K., Reed, K. B., & Odegard, T. N. (2006). Global gist extraction in children and adults. Memory, 14, 952-964. Libby, L. K., & Neisser, U. (2001). Structure and strategy in the associative false memory paradigm. Memory, 9, 145-163. Mandler, G. (1962). From association to structure. Psychological Review, 69, 425-427. Masson, M. E. J. (1991). A distributed memory model of context effects in word identification. In D. Besner & G. W. Humphreys (Eds.), Basic processes in reading: Visual word recognition (pp. 233-263). Hillsdale, NJ: Erlbaum. McEvoy, C. L., Nelson, D. L., & Komatsu, T. (1999). What is the connection between true and false memories? The differential roles of interitem associations in recall and recognition. Journal of Experimental Psychology: Learning, Memory, & Cognition, 25, 1177-1194. McKone, E. (2004). Distinguishing true from false memories via lexical decision as a perceptual implicit test. 
Australian Journal of Psychology, 56, 42-49. Nelson, D. L., McEvoy, C. L., & Schreiber, T. A. (1999). The University of South Florida word association, rhyme, and word fragment norms. Unpublished manuscript, University of South Florida, Tampa. Neuschatz, J. S., Benoit, G. E., & Payne, D. G. (2003). Effective warnings in the Deese–Roediger–McDermott false-memory paradigm: The role of identifiability. Journal of Experimental Psychology: Learning, Memory, & Cognition, 29, 35-41. Odegard, T. N., Holliday, R. E., Brainerd, C. J., & Reyna, V. F. (2008). Attention to global gist processing eliminates age effects in false memories. Journal of Experimental Child Psychology, 99, 96-113. Paivio, A. (1971). Imagery and verbal processes. New York: Holt, Rinehart & Winston. Papke, L. E., & Wooldridge, J. (1996). Econometric methods for fractional response variables with an application to 401(k) plan participation rates. Journal of Applied Econometrics, 11, 619-632. Payne, D. G., Elie, C. J., Blackwell, J. M., & Neuschatz, J. S. (1996). Memory illusions: Recalling, recognizing, and recollecting events that never occurred. Journal of Memory & Language, 35, 261-285. Pezdek, K., Finger, K., & Hodge, D. (1997). Planting false childhood memories: The role of event plausibility. Psychological Science, 8, 437-441. Powell, M. B., Roberts, K. P., Ceci, S. J., & Hembrooke, H. (1999). The effects of repeated exposure on children’s suggestibility. Developmental Psychology, 35, 1462-1477. Reyna, V. F., & Brainerd, C. J. (1995). Fuzzy-trace theory: An interim synthesis. Learning & Individual Differences, 7, 1-75. Reyna, V. F., & Kiernan, B. (1994). The development of gist versus verbatim memory in sentence recognition: Effects of lexical familiarity, semantic content, encoding instructions, and retention interval. Developmental Psychology, 30, 178-191. Reyna, V. F., & Kiernan, B. (1995). Children’s memory and interpretation of psychological metaphors. Metaphor & Symbolic Activity, 10, 309-331.
Rivers, S. E., Reyna, V. F., & Mills, B. (2008). Risk taking under the influence: A fuzzy-trace theory of emotion in adolescence. Developmental Review, 28, 107-144. Roediger, H. L., III, & McDermott, K. B. (1995). Creating false memories: Remembering words not presented on lists. Journal of Experimental Psychology: Learning, Memory, & Cognition, 21, 803-814. Roediger, H. L., III, Watson, J. M., McDermott, K. B., & Gallo, D. A. (2001). Factors that determine false recall: A multiple regression analysis. Psychonomic Bulletin & Review, 8, 385-407. Russell, W. A., & Jenkins, J. J. (1954). The complete Minnesota norms for responses to 100 words from the Kent–Rosanoff word association test (Tech. Rep. No. 11, Contract N8 ONR 66216, Office of Naval Research). Minneapolis: University of Minnesota. Seamon, J. G., Luo, C. R., Kopecky, J. J., Price, C. A., Rothschild, L., Fung, N. S., & Schwartz, M. A. (2002). Are false memories more difficult to forget than accurate memories? The effect of retention interval on recall and recognition. Memory & Cognition, 30, 1054-1064. Seamon, J. G., Luo, C. R., Schwartz, M. A., Jones, K. J., Lee, D. M., & Jones, S. J. (2002). Repetition can have similar or different effects on accurate and false recognition. Journal of Memory & Language, 46, 323-340. Skinner, B. F. (1957). Verbal behavior. East Norwalk, CT: Appleton-Century-Crofts. Sommers, M. S., & Lewis, B. P. (1999). Who really lives next door: Creating false memories with phonological neighbors. Journal of Memory & Language, 40, 83-108. Stadler, M. A., Roediger, H. L., III, & McDermott, K. B. (1999). Norms for word lists that create false memories. Memory & Cognition, 27, 494-500. Stahl, C., & Klauer, K. C. (2008). A simplified conjoint recognition paradigm for the measurement of gist and verbatim memory. Journal of Experimental Psychology: Learning, Memory, & Cognition, 34, 570-586. Starns, J. J., & Hicks, J. L. (2005). 
Source dimensions are retrieved independently in multidimensional monitoring tasks. Journal of Experimental Psychology: Learning, Memory, & Cognition, 31, 1213-1220. Storbeck, J., & Clore, G. L. (2005). With sadness comes accuracy; with happiness, false memory: Mood and the false memory effect. Psychological Science, 16, 785-791. Thompson-Schill, S. L., Kurtz, K. J., & Gabrieli, J. D. E. (1998). Effects of semantic and associative relatedness on automatic priming. Journal of Memory & Language, 38, 440-458. Toglia, M. P., & Battig, W. F. (1978). Handbook of semantic word norms. Hillsdale, NJ: Erlbaum. Toglia, M. P., Neuschatz, J. S., & Goodwin, K. A. (1999). Recall accuracy and illusory memories: When more is less. Memory, 7, 233-256. Tse, C.-S., & Neely, J. H. (2005). Assessing activation without source monitoring in the DRM false memory paradigm. Journal of Memory & Language, 53, 532-550. Tussing, A. A., & Greene, R. L. (1999). Differential effects of repetition on true and false recognition. Journal of Memory & Language, 40, 520-533. Underwood, B. J. (1965). False recognition produced by implicit verbal responses. Journal of Experimental Psychology, 70, 122-129. Wells, G. L., Small, M., Penrod, S., Malpass, R. S., Fulero, S. M., & Brimacombe, C. A. E. (1998). Eyewitness identification procedures: Recommendations for lineups and photospreads. Law & Human Behavior, 23, 603-647. Wenger, M. J., & Townsend, J. T. (2006). On the costs and benefits of faces and words: Process characteristics of feature search in highly meaningful stimuli. Journal of Experimental Psychology: Human Perception & Performance, 32, 755-779. Wu, L. L., & Barsalou, L. W. (2008). Grounding concepts in perceptual simulation: I. Evidence from feature generation. Manuscript submitted for publication. Yonelinas, A. P. (2002). The nature of recollection and familiarity: A review of 30 years of research. Journal of Memory & Language, 46, 441-517. Zeelenberg, R., & Pecher, D. (2002). 
False memories and lexical decision: Even twelve primes do not cause long-term semantic priming. Acta Psychologica, 109, 269-284.

NOTES

1. Backward and forward association are used throughout this article to refer, respectively, to the tendency of unpresented words to be preexperimental associates of presented words and the tendency of presented words to be preexperimental associates of unpresented words. Although such terminology has become standard usage in the false-memory literature, strictly speaking, it is inaccurate. These terms were originally coined to denote the formation of experimental associations between words during the course of serial learning (Hovland, 1951); that is, when subjects learn to recall a list of the form A, B, C, D, and so on, under conditions of serial presentation and recall, forward associations are acquired linkages of earlier words to later words, and backward associations are acquired linkages of later words to earlier words.

2. Cotel et al. (2008) and one of the reviewers of this article conjectured that automatic priming might yield longer lasting false memories if it produced conscious activation of critical distractors—that is, if backward associations from targets to distractors caused distractors to come to mind while targets were being presented. However, this hypothesis conflicts with findings reported by Carneiro, Fernandez, and Dias (2008), Libby and Neisser (2001), and Neuschatz, Benoit, and Payne (2003). Carneiro et al. observed that if critical distractors come to mind as DRM lists are presented, this could produce false memory suppression on memory tests, via recollection rejection (Brainerd, Reyna, Wright, & Mojardin, 2003). Specifically, when a critical distractor is retrieved during recall, or is presented as a recognition probe, subjects might recollect that they thought of it, rather than studied it, and deliberately reject the item. Consistent with this hypothesis, Carneiro et al., Libby and Neisser, and Neuschatz et al. 
all reported that manipulations that fomented conscious awareness of critical distractors during list presentation lowered error rates on subsequent memory tests (see also Gallo, Roberts, & Seamon, 1997).

3. Although mean list true recall was used as a predictor, mean list true recognition was not. Data from various studies suggest that although true recall is negatively correlated with false recall and false recognition (and therefore would account for significant variance), true recognition is not correlated with either (cf. Brainerd, Forrest, Karibian, & Reyna, 2006; Stadler, Roediger, & McDermott, 1999).

4. Since one of Toglia and Battig’s (1978) dimensions, familiarity, will prove to be a key predictor of DRM false memory, it is important to avoid confusing it with another well-known concept of the same name—namely, the familiarity operation of theories of recognition (e.g., Yonelinas, 2002). The distinction between the two is analogous to the distinction between semantic and episodic memory. The Toglia–Battig dimension is semantic: Subjects were told to assign familiarity ratings on the basis of how common words are in linguistic usage, with person being given as an example of a very common word, and the resulting familiarity ratings correlated strongly (r = .82) with meaningfulness ratings of the same words. In contrast, the familiarity concept of recognition theories is episodic: It refers to feelings of recency, the belief that an item was encountered in the near past, presumably during the study phase, although its presentation cannot be explicitly recollected.

5. For a small fraction of the target–distractor pairs, it was possible for the pair to receive a score of 1 on two Wu–Barsalou relations. With black–white, for instance, the words are common antonyms, but as exemplars of the color category, they are taxonomically related. 
Because the number of target–distractor pairs that could receive a score of 1 on two Wu–Barsalou dimensions was small, the mean number of semantic relations per list did not differ appreciably when a score of 1 was permitted on only one relation per pair, versus when a score of 1 was permitted on more than one relation.

6. Although semantic relatedness is the dominant property of everyday false memory, it should be noted that there are other paradigms in which false memories are rooted in visual or auditory resemblance between presented and unpresented items (Holliday & Weekes, 2006; Sommers & Lewis, 1999).

(Manuscript received September 2, 2007; revision accepted for publication June 6, 2008.)