Exp Brain Res (1982) 46:357-367
Experimental Brain Research © Springer-Verlag 1982
A Time-Course Analysis of Attentional Tuning of the Auditory Evoked Response*
M.W. Donald and M.J. Young
Department of Psychology, Queen's University, Kingston, Ontario, K7L 3N6, Canada
Summary. This study examined the time course of attentional tuning of the N1 and P3 components of the auditory evoked potential. Human subjects were presented with two concurrent sequences of pure tone stimuli, one sequence delivered to each ear. They were instructed to listen to the tones in one ear and count randomly-embedded target stimuli, identified by pitch, while ignoring concurrent and physically equivalent stimuli in the other ear. Attention was then allocated to other ear-pitch combinations in subsequent runs. The rate of stimulation was rapid, an average of three stimuli per second per channel, to maximize N1 differences between channels. Evoked potentials were sampled at various times during each experimental run, to determine the time course of amplitude change in each auditory channel, as the subject tuned his neural response to the selected stimuli. The results indicated that N1 took 30-45 s to emerge as significantly larger in the attended channel, whereas P3 was instantly larger in the attended channel upon presenting the first rare stimulus of a run. The N1 effect disappeared for standard stimuli after about 7 min of stimulation, despite a continuously high rate of target identification. However, for the rare target stimuli, N1 and P3 remained at a higher level in the attended channel throughout the typical 15 min run. The study concludes that neural selectivity proceeds in a "top-down" manner, with the longer-latency P3 component showing a selective response sooner than N1. In addition, there is evidence that the selectivity of N1 tuning increases over time, with the continued focussing of attention.
Key words: Attention - Audition - Evoked response - Habituation - Dichotic listening
* Funded by the Natural Sciences and Engineering Research Council of Canada
Offprint requests to: M.W. Donald, PhD (address see above)
Introduction
In paying attention to a particular aspect of the environment, the observer experiences a narrowing or "focussing" of awareness, which is accompanied by heightened perceptual clarity of the attended object, and a simultaneous loss of sensitivity to things which lie outside of the focus of attention. This can be corroborated with objective tests of sensory thresholds and memory (Broadbent 1958; Moray 1969). Some of this alteration of sensory experience may be due to peripheral adjustments, such as changes in the orientation of the head and eyes, but it is possible to construct paradigms in which such adjustments cannot be made, and in which the focussing of attention must be achieved entirely within the central nervous system. In the latter case of "central" selection, the sensory response is apparently being continuously and actively modified by recently stored information which specifies where attention is to be allocated. One of the most widely used of the central selection paradigms is dichotic listening, where two physically equivalent sound streams are simultaneously presented to the auditory system, one to each ear. If stimuli, whether clicks, pure tones or more complex sounds such as speech, are presented at a sufficiently fast rate, the observer finds it is easier to listen to one or the other ear than to monitor both ears at once (Axelrod et al. 1968; Treisman 1971; Harvey and Treisman 1973). When the experimenter directs the subject to listen only to one ear, the recall of material presented to the other, or rejected ear is very limited. If evoked potentials are monitored in such a situation, the potentials evoked by stimuli in the attended ear are relatively larger than those elicited by stimuli in the unattended, or rejected ear (Hillyard et al. 1973). Analogous effects of central selection on evoked potentials have been reported in
the visual (Harter and Previc 1978) and somatosensory (Desmedt and Robertson 1977) modalities.
On current evidence the auditory selection process results in at least two effects on the brain's response to stimulation. First, a negative component usually peaking between 70 and 130 ms after stimulus onset, is increased for all stimuli in the attended auditory channel. If the attended stimulus stream includes both frequent "standard" stimuli, and in addition rare "target" stimuli which require an active cognitive response (e.g. storage in memory), the negative component will be increased for both the frequent and rare stimuli (Hillyard et al. 1973). This effect is referred to herein as N1 tuning. The second alteration of the evoked potential is evident in the late positive component, P3, which is only produced by the rare stimuli in the series. The P3 response is typically much larger in the attended channel, almost to the exclusion of the rejected channel. The rare stimuli in the rejected channel also elicit a small P3, but it is not normally responsive to variations in stimulus properties in the rejected channel (Donald and Little 1981). This effect is referred to herein as P3 tuning.
The time required to achieve this neural selectivity is unknown. N1 and P3 tuning might be an instantaneous effect on shifting the locus of attention internally, in which case selectivity should be evident upon presentation of the first stimulus, provided the observer has been forewarned of the event. On the other hand, it might occur gradually, after the stimulus stream has been presented for some time. Instantaneous selectivity would imply the existence of an internal "switch" which could preset the sensory response. Gradual selectivity would imply a mechanism which was more dependent upon the stimulus stream itself, and which altered some underlying neural property such as habituation or sensitization, or some central tuning process which itself responded to the repetitive nature of the stimulus.
Both N1 and P3 tuning are examined in this experiment. A high-speed dichotic listening paradigm was utilized to maximize the tuning of both N1 and P3, and thus make their emergence in time as clear as possible. The experimental design allowed a sequential analysis of evoked potentials in both the attended and rejected channels, in the long and short term, from a few seconds to several minutes. A previous report using this design (Donald and Young 1980) confirmed its usefulness in describing the long and short-term trends of evoked potentials in two concurrent auditory channels. In this paper the method is extended to describe the time courses of both N1 and P3 tuning.
M.W. Donald and M.J. Young: Auditory Evoked Response
Method
The experimental design allowed the auditory evoked response to be sampled at a series of successive time points during the performance of a selective listening task. The subjects received one stimulus stream in the right ear, and one in the left, delivered through stereo earphones. The stimuli were pure tones of 50 ms duration, 55 dB SPL, with an exponential rise time of 20 ms, and subjects were instructed to count rare target tones in either one ear or the other. One ear received a Bernoulli sequence made up of standard 1,500 Hz tones (90% probable) and randomly inserted rare tones of 1,560 Hz (10% probable). The other ear received a similar sequence made up of standard tones of 800 Hz and rare tones of 840 Hz. The interstimulus intervals within each ear channel ranged from 150 to 550 ms at random, with a mean of 350 ms, and the two sequences were offset by 75 ms to prevent simultaneous tone onsets in the two ears.
The task was presented in a series of brief (10-11 s) trials, each consisting of 32 stimuli in each ear. Following each trial there was a 6 s intertrial interval during which the subject recorded an estimate of the number of rare tones presented in the attended ear in the previous trial. There were 70 trials per run, and the attended channel was assigned in successive runs to each of the possible ear-pitch combinations (right 840, right 1,560, left 840, or left 1,560) following a Latin-square design. The unattended, or rejected, channel for each condition was automatically the complement of the attended channel. Evoked potential averages were obtained by collapsing across all four attention conditions. Thus each ear-pitch combination appeared with equal frequency in the final average, in both the attended and rejected channel responses.
EEG recording was limited to a vertex (Cz) electrode, referred to linked pinnae. Silver-silver chloride electrodes were used, with modified Beckman input couplers and amplifiers.
The time constant was 0.1 s, with the low-pass filters set at 32 Hz. EOG recording employed similar electrodes placed on the left supraorbital ridge, referred to the left external canthus. These data, along with coded pulses indicating stimulus events, were recorded on an FM tape recorder and digitized and averaged off-line. Averaging sweeps were set at 256 ms for the standard tone averages and 512 ms for the rare tone averages.
The reason for the relatively short time constant of the amplifiers was that, in pilot work using a 3 s time constant, it became clear that steady-potential shifts of 1 or 2 s duration were developing at the start of some stimulus trials, which interfered with the recording of the components of primary interest, namely N1 and P3, and which distorted baselines as well. This might be averaged out in a conventional design which averaged across all stimuli. But since sequential changes in the evoked response were of prime interest, the very slow changes obscured the sequence of changes coinciding with their occurrence, and therefore were eliminated. The N1 and P3 components were not distorted: a 0.1 s time constant will pass a sine wave of 2 Hz, and both N1 and P3 exceed this frequency.
Eight male undergraduates, 21-30 years of age, with no history of neurological disease or hearing loss, served as subjects.
The auditory N1 and P3 components were quantified in a conventional base-to-peak manner. The baseline was derived from the mean voltage of the first 50 ms of the averaging epoch. N1 was identified as the largest negative peak between 70 and 130 ms, and P3 as the largest positive peak between 250 and 512 ms. In cases of double N1 peaks, or flat N1 morphology, the latency was derived by interpolation. This method of measuring N1 and P3 corresponded to the method used in most of the previous literature on N1 and P3 tuning and thus provided continuity with those studies.
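The base-to-peak rule just described can be illustrated with a minimal Python sketch. It assumes an averaged epoch stored as a list of voltages sampled at a fixed interval starting at stimulus onset. The function name, its parameters and the sampling interval are hypothetical, since the digitization rate is not reported here, and the interpolation used for double or flat peaks is not reproduced.

```python
def base_to_peak(epoch, sample_ms, window_ms, polarity, baseline_ms=50.0):
    """Return (amplitude, latency_ms) of the largest peak of one polarity.

    epoch       -- voltages in microvolts, one per sample, from stimulus onset
    sample_ms   -- sampling interval in ms (assumed; not given in the text)
    window_ms   -- (start, end) search window in ms, e.g. (70, 130) for N1
    polarity    -- -1 for a negative peak (N1), +1 for a positive peak (P3)
    baseline_ms -- baseline is the mean voltage of the first 50 ms of the epoch
    """
    n_base = max(1, int(round(baseline_ms / sample_ms)))
    baseline = sum(epoch[:n_base]) / n_base
    start = int(round(window_ms[0] / sample_ms))
    end = min(len(epoch) - 1, int(round(window_ms[1] / sample_ms)))
    # Choose the sample deviating most from baseline in the requested direction.
    peak_idx = max(range(start, end + 1),
                   key=lambda i: polarity * (epoch[i] - baseline))
    return epoch[peak_idx] - baseline, peak_idx * sample_ms
```

Under these assumptions, N1 would be measured with a window of (70, 130) ms and polarity -1, and P3 with a window of (250, 512) ms and polarity +1.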
Measurements were also taken in the 200 ms latency range to determine whether a P2 or N2 effect could be found, in view of recent evidence for a selective attention effect in this latency range (Näätänen and Michie 1979), although these were not the primary targets of the investigation. Subtractions of potentials recorded in the attended channel from those of the rejected channel were also derived, but statistical analysis of differences between channels relied upon the original evoked potentials, not upon the subtractions.
The behaviour of the subjects was evaluated by estimating both hit rate (percentage of targets detected) and guessing rate (number of false positives per trial). These could not be measured in the usual manner, given that subjects did not give a response to every target stimulus. Motor responses on-line were not employed in this study, because of the possible confounding effect of both averaged movement potentials and electromyographic artifacts, and because of the rapid overall rate of stimulation (about six stimuli/s), which would make it difficult to associate a fairly slow motor response with a specific stimulus. By having the subject count stimuli, the motor confound was avoided, but the subject's response was ambiguous. For example, a count of four might indicate correct detection of four targets, or detections of three targets and one false positive, or two of each, and so on, depending upon each subject's guessing behaviour. Thus an estimate of guessing behaviour was needed. This was provided by the use of maximum likelihood estimation, which was used to fit a model to performance patterns over a period of time. The recorded counts of each subject for each set of 70 trials were used to find the maximum likelihood estimates of two parameters, Π (the probability of correctly detecting a rare tone) and μ (the number of inventions or false alarms on the average trial). The method is discussed in detail in Broekhoven et al. (1981). "Chance" performance can have a variety of values in this framework: Π would necessarily be close to zero, but μ could vary widely depending upon the number of inventions produced by a given subject.
If performance was erratic, the method would fail to estimate the Π and μ parameters with an acceptable degree of confidence.
The analysis of the sequential trends of evoked potentials required that certain constraints be placed on the experimental design. The mean interstimulus interval had to be identical for every successive tone on the average trial, as did the standard deviation. The probability of a rare tone had to be the same for each successive tone position. Rare tones were never presented in the first position, both because of eye blinks and steady-potential shifts which were likely to follow, and because of the probability that P3 components elicited in that position might be fundamentally different from those elicited in any other position, having a more frontal distribution and longer latency (Squires et al. 1975; Courchesne et al. 1975). Given these constraints, averages could be obtained in two ways to evaluate sequential changes: row averages represented the mean response across all the standard stimuli of a single trial, column averages the mean response across all the standard stimuli of a single tone position. The row averages allowed an estimate of the long-term course of evoked response amplitude, while the column averages yielded the short-term course. In pilot work it was found that the reliability of the N1 tuning effect increased with N until a sample size of 256, so this was selected as the ideal sample size. With 70 trials over four conditions, it was possible to obtain an N of 256 in the column averages. In the row averages it was necessary to collapse across three rows at a time to obtain this N. For certain of the averages, however, this ideal number could not be achieved.
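As a rough illustration of the two averaging directions, the Python sketch below treats one run as a trial-by-position table. It is a simplification under stated assumptions: the study averaged EEG waveforms and then measured N1, whereas this example averages amplitudes that have already been measured. The function name and the use of None to mark positions occupied by rare tones are hypothetical.

```python
def row_and_column_averages(amplitudes):
    """amplitudes[trial][position] holds one amplitude per standard tone,
    or None where a rare tone occupied that position (assumed layout)."""
    def mean(values):
        kept = [v for v in values if v is not None]
        return sum(kept) / len(kept) if kept else None

    # Row average: all standards within one trial (long-term course).
    rows = [mean(trial) for trial in amplitudes]
    # Column average: one tone position across all trials (short-term course).
    n_positions = len(amplitudes[0])
    cols = [mean([trial[p] for trial in amplitudes])
            for p in range(n_positions)]
    return rows, cols
```

Collapsing across three rows at a time, as in the text, would correspond to pooling the values of three consecutive trials before taking the mean.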
Results
Grand mean N1 amplitudes for standard stimuli, and N1 and P3 amplitudes for rare stimuli, were computed for the attended and rejected channels, averaged across all trials and runs (Fig. 1). A two-way analysis of variance revealed a significant effect of attention on the standard tone N1 (F(1,7) = 11.07, p ≤ 0.01) and on the rare tone N1 (F(1,7) = 14.27, p ≤ 0.01) and P3 (F(1,7) = 6.00, p ≤ 0.05) components. There was no significant difference in amplitude between ears and the attention effect did not interact with ear of stimulation. Therefore the data for both ears were combined in subsequent analyses. The mean differences between the attended and rejected channels, calculated as a percentage of the amplitude in the rejected channel, were 39%, and 109%, respectively, for standard N1 and rare N1 and P3 amplitude. There was no effect of attention on the latency of N1 or P3, although in the latter case latency was sometimes difficult to measure in the rejected channel. An attempt was made to quantify P2, but it was not well resolved in most records, and analysis of variance failed to reveal a significant effect of attention of either positive or negative polarity, in the 200 ms latency range. Thus, the instruction to attend to a specific channel had a substantial effect on the relative amplitude of N1 and P3 in the attended and rejected channels, but no effect was discernible in the 200 ms latency range.
Fig. 1. Group means for N1 and P3 components in two concurrent auditory channels, one of which is attended, and one rejected. Grand averages over entire experimental session, N = 2,048 per average
The accuracy of the subjects' target detection in the attended channel was modelled by maximum likelihood estimation of Π (hits) and μ (guesses). The estimates of Π and μ were tested for significance by estimating their confidence intervals. Significant estimates were obtained for all subjects. The general level of performance was high, with an average Π value of 0.83, and an average μ value of 0.32. Analysis of variance revealed no significant differences in Π or μ values as a function of ear of stimulation or of time (first half vs. last half of trials). It may be concluded that these subjects were successfully carrying out the instructions to count targets in the attended channel, with a minimum of guessing. Pearson product-moment correlations were calculated between Π and the amplitude differences between the attended and rejected channels for the standard tone N1 (r = 0.66, p ≤ 0.05), rare tone N1 (r = 0.59, n.s.) and rare tone P3 (r = 0.44, n.s.), revealing a significant correlation only in the case of the standard tone N1. No significant correlations were found between μ and any of the evoked response components.
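To make the idea behind the two estimated parameters concrete, the following Python sketch fits a hit probability and a false-alarm rate to reported counts. The model is an assumption made for illustration only: hits are taken to be Binomial(n_targets, pi) and inventions Poisson(mu). The exact model of Broekhoven et al. (1981) is not given in this text, and the function names and the grid search are hypothetical.

```python
import math

def count_likelihood(count, n_targets, pi, mu):
    """P(reported count) when hits ~ Binomial(n_targets, pi) and
    false alarms ~ Poisson(mu). Assumed model, not taken from the paper."""
    total = 0.0
    for hits in range(min(count, n_targets) + 1):
        p_hits = (math.comb(n_targets, hits)
                  * pi ** hits * (1 - pi) ** (n_targets - hits))
        extra = count - hits  # counts not explained by hits are inventions
        p_extra = math.exp(-mu) * mu ** extra / math.factorial(extra)
        total += p_hits * p_extra
    return total

def fit_pi_mu(trials, steps=50):
    """Grid-search maximum likelihood estimates of (pi, mu).
    trials is a list of (n_targets, reported_count) pairs."""
    best = (None, None, -math.inf)
    for i in range(1, steps):              # pi on a grid inside (0, 1)
        pi = i / steps
        for j in range(1, 2 * steps + 1):  # mu on a grid inside (0, 2]
            mu = j / steps
            loglik = sum(math.log(count_likelihood(c, k, pi, mu))
                         for k, c in trials)
            if loglik > best[2]:
                best = (pi, mu, loglik)
    return best[0], best[1]
```

The sketch shows why a single count is ambiguous: a count of four contributes likelihood through every split into hits and inventions, and only the pattern over many trials separates the two parameters.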
Fig. 2. Single-subject data illustrating the gradual amplitude decrement of the N1 component over time. Traces shown for trials 1-3, 4-6, 10-12, 22-24 and 46-48 (elapsed time 0.8, 1.6, 3.3, 6.7 and 13.4 min); time base 256 ms. Solid lines = attended channel; dotted lines = rejected channel. Each trace is an average of 256 samples. Dots indicate where N1 peaks were measured
Sequential Analysis of Standard Tone N1 Amplitude: Long-term Course
Since the standard tones were the more frequent stimuli, responses evoked by them could be analysed in a reasonably detailed manner as a function of time. They did not constitute the target stimuli to which the subject responded, and the standard tones were in that sense probe stimuli in the attended and rejected channels. The long-term course of N1 was initially estimated in blocks of three trials, collapsed across the four attention conditions. This allowed for a sample of 256 in the evoked response averages obtained for each subject. Single-subject averages are illustrated in Fig. 2, and group mean N1 amplitudes at each sampled time point are shown in Fig. 3. The overall trend was clearly towards a decline in amplitude over time, in both channels, the rate of decline being more gradual in the attended channel. The amplitude difference between channels was greatest in absolute terms at the 1.6 min mark in the session (time includes the intertrial pauses), but was already present in the average of the first three trials, that is, during the first 0.8 min. A two-way analysis of variance indicated a significant decrement in N1 amplitude over time (F(4,7) = 3.41, p ≤ 0.02), a significant effect of attention (F(1,7) = 6.33, p ≤ 0.05), and a significant interaction between attention and time (F(1,7) = 3.00, p ≤ 0.02).
Fig. 3. Group means for standard tone N1 amplitude over successive trials. Each data point represents the mean response of eight subjects, collapsed across four attention conditions. The superiority of the attended channel is greatest in early trials, disappearing after 7 min (24 trials)
The averaged trends shown in Fig. 3 resembled an exponential decay process, and six of the eight subjects showed what appeared to be an exponential decrement of N1 over time in their individual records. To test the consistency of the amplitude-time function, an attempt was made to fit an exponential decay model to the individual data, of the form:
$$\theta_1 \exp(-\theta_2 t) + \theta_3 + \sum_{i=1}^{m-1} \alpha_i X_i$$
where $t$ = time, and
$\theta_1$ = initial amplitude at $t = 0$, in excess of asymptote
$\theta_2$ = decay constant
$\theta_3$ = asymptote for subject n
$\alpha_i$ = difference in asymptote between subject i and n
$X_i$ = dummy variable (1 for subject i, and 0 for subject n)
The dummy variables were a necessity because of the different non-zero amplitude asymptotes reached by individual subjects. Linear regression analysis verified that both the exponential term and the dummy variables were necessary for a significant fit to the data. A plot of the predicted vs. actual N1 values revealed that the exponential model constituted a good description of the long-term amplitude decline of N1 in the attended channel in all but one subject. However, this was not the case in the data for the rejected channel, where the trend for N1 amplitude decrement resembled a step function.
Sequential Analysis of Standard Tone N1 Amplitude: Short-term Course
Fig. 4. Single-subject data showing the rate effect within a single trial on standard tone N1 amplitude in both attended (solid lines) and rejected (dotted lines) channels. Each trace is an average of 256 samples; time base 256 ms. Dots indicate where N1 peak was measured. The evoked response to the first stimulus is shown at half the gain of the later ones; the latter correspond to the amplitude marker
Fig. 5. Group means for N1 amplitude during the short-term period of a single trial. Each data point represents the mean response of eight subjects, collapsed across four attention conditions. The attended channel evoked a larger response throughout the average trial
An analysis of the short-term pattern of N1 amplitude over time, within the shorter period (10 s) of a single trial, was done by deriving column averages. A set of raw data from one subject is shown in Fig. 4 for the first, second, fourth, eighth and 16th stimuli of the typical trial. Both the attended and rejected channels show a very rapid amplitude drop after the first stimulus. A two-way analysis of variance of within-trial changes in N1 amplitude for all subjects revealed significant effects of attention (F(1,7) = 20.08, p < 0.01) and time (F(4,7) = 40.62, p ≤ 0.0001) with no significant interaction. Thus on the average trial N1 was larger in the attended channel from the first stimulus to the last, and the tuning effect must have "carried through" the 6 s intertrial interval from trial to trial. This can be seen in the group averages shown in Fig. 5.
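The exponential decay model with subject-specific asymptotes, which was fitted to the long-term N1 data, can be written out as a short Python sketch. The function names, the zero-based subject indexing and the choice of the last subject as reference are assumptions made for illustration; the fitting procedure itself is not reproduced, only the model value and a squared-error score that such a fit would minimise.

```python
import math

def predicted_n1(t, theta1, theta2, theta3, alphas, subject):
    """Model value theta1*exp(-theta2*t) + theta3 + sum_i alpha_i * X_i.

    alphas  -- asymptote offsets for subjects 0..m-2 relative to the reference
    subject -- subject index; the reference subject is index m-1 (assumed),
               for which every dummy variable X_i equals 0
    """
    offset = alphas[subject] if subject < len(alphas) else 0.0
    return theta1 * math.exp(-theta2 * t) + theta3 + offset

def sum_squared_error(observations, theta1, theta2, theta3, alphas):
    """observations is a list of (t, subject, amplitude) triples."""
    return sum(
        (amp - predicted_n1(t, theta1, theta2, theta3, alphas, s)) ** 2
        for t, s, amp in observations
    )
```

Because exactly one dummy variable is 1 for each non-reference subject, the sum over the dummy terms reduces to a single offset, which is why the sketch looks up one entry of alphas instead of summing.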
Emergence of the N1 Difference Between the Attended and Rejected Channels
Neither the long-term nor the short-term analysis revealed the time course of the emergence of N1 tuning. In both cases, N1 tuning, that is, amplitude superiority in the attended channel, was present from the start. However, it was possible that averaging the first three trials together blurred the early emergence of N1 tuning, and therefore a more detailed analysis of the first three trials was attempted, even though the resulting averages could only be based on a sample of 64 EPs for each subject. In addition, the first EPs of each experimental run (i.e., the EPs to the first standard stimuli in each channel) were sampled on a single-trial basis for each subject. The latency and amplitude of N1 were assessed independently by two scorers in the latter case, and a consensus was reached on any discrepancies in the scoring.
These data, grand-averaged across all subjects, are shown in Fig. 6, and graphed in Fig. 7, in combination with the data on later trials. It can be seen that the EPs to the first stimuli of trial 1 of each experimental run were slightly larger in the rejected, or unattended channel, but this difference was not significant. Thus, at time zero, the beginning of the run, N1 selection was not yet evident. Although an amplitude superiority for N1 in the attended channel appeared in the grand-averaged EPs obtained across all stimuli in trials 1 and 2, these differences did not reach, or even approach, significance (trial 1, t(7) = 0.70, n.s.; trial 2, t(7) = 0.90, n.s.). On trial 3, however, the EP in the attended channel increased substantially in amplitude, and the amplitude superiority of the attended channel reached statistical significance (t(7) = 6.76, p ≤ 0.01).
Fig. 6. The emergence of N1 superiority in the attended channel during the first three trials. Each trace is a grand average of data from all eight subjects; time base 256 ms. Upper trace: response to first stimulus of trial 1, N = 32. Lower three traces: average EP during trials 1, 2 and 3, N = 512. Solid lines = attended channel; dotted lines = rejected channel. Dots indicate where N1 peaks were measured
Fig. 7. Group means for N1 amplitude, showing the time course of N1 amplitude on an expanded time base. Note the rapid N1 decrement in both channels on trial 2, and emergence of a large amplitude difference between channels on trial 3. The EPs from stimulus No. 1 indicated N1 amplitudes of over 10 µV in both channels, and these data are not included, so as to maintain the scale of the graph
The difference in N1 amplitude between the attended and unattended channels (ND) is illustrated in Fig. 8, expressed both as a voltage difference, and as a percentage of N1 amplitude in the unattended channel. Both the emergence and decay of ND are visible in Fig. 8. ND significantly exceeded zero, according to the statistical analysis, on trial 3, but not in earlier EP samples taken in trials 1 and 2; therefore by the traditionally used criterion of ND differing from zero, N1 tuning was established 30-45 s after the start of the average run. A one-way analysis of variance of ND failed to show a significant change in ND over time, for the time points shown in Fig. 8, although the F value approached significance (F(3,21) = 2.40, p ≤ 0.09). The latter analysis was of questionable validity, however, in view of the heterogeneity of variance across trials, with the variance diminishing by over 50% between trials 1 and 3.
Fig. 8. Group means, for successive trials, of ND (the difference in N1 amplitude between the attended and rejected channels) derived from the data shown in Fig. 7. ND is shown in absolute terms (µV) and as a percentage increment over amplitude in the rejected channel (%). The ordinate indicates microvolts for the upper curve and percentage increment for the lower curve
Visual inspection of single-subject data revealed that one subject already showed a large N1 superiority, sustained in later trials, in the attended channel on trial 1, two more on trial 2, and the remainder on trial 3, so that all eight subjects showed the effect by trial 3. Thus, although the average trend indicated that N1 selection takes 30-45 s to emerge, there were some subjects who achieved it somewhat more rapidly.
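The two ways of expressing ND, and the test of whether it differs from zero, can be sketched in Python as follows. The sketch assumes one N1 magnitude per subject and channel, entered as positive microvolt values, and it assumes that the reported t(7) values correspond to a one-sample t-test on the eight per-subject differences; the text does not state the exact test, and the function names are hypothetical.

```python
import math

def nd_measures(attended, rejected):
    """Per-subject N1 difference (ND) between channels.

    Inputs are N1 magnitudes in microvolts, one per subject.
    Returns (ND in microvolts, ND as a percentage of the rejected channel).
    """
    nd_uv = [a - r for a, r in zip(attended, rejected)]
    nd_pct = [100.0 * d / r for d, r in zip(nd_uv, rejected)]
    return nd_uv, nd_pct

def one_sample_t(values):
    """t statistic (df = n - 1) for the mean of values against zero."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / (n - 1)
    return mean / math.sqrt(var / n)
```

With eight subjects the statistic has 7 degrees of freedom, matching the form t(7) used above.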
Sequential Analysis of the Rare Tone N1 and P3 Components
The rare tone evoked responses could not be evaluated on a precise time scale because of their scarcity, which precluded frequent averaging. However, a rough estimate of their time course was made by sampling 64 rare stimuli from the first and second halves of each run. This procedure yielded averaged EPs for each subject with an N of 256, when collapsed across the four attention conditions. Analysis of variance revealed a significant effect of attention on both N1 (F(1,7) = 14.27, p ≤ 0.01) and P3 (F(1,7) = 6.00, p ≤ 0.05), but no effect of time, and no interaction of attention and time.
An attempt was made to estimate the relative amplitude of N1 and P3 in the attended and unattended channels on the first rare stimulus of each 70-trial run. An examination of the control tape protocols showed that the first rare stimulus was, on the average, the 11th stimulus of the first trial. Although the position of the first rare stimulus varied widely in individual cases, it was never the first stimulus of a trial, and was thus always preceded by at least one standard stimulus. The response to the first rare stimulus was sampled on a single-trial basis for each attention condition, for both the attended and rejected channels. The grand-averaged waveforms for all subjects are shown in Fig. 9. Although the resulting waveforms were noisy, the concurrent eye movement records were flat, and both a negative peak at approximately 100 ms, and a large positive wave of 300 ms latency were clearly present. These were identified as N1 and P3 and quantified base-to-peak in the single-trial records by two scorers. N1 amplitude was not significantly different between the attended and rejected channels, showing that the rare stimuli were not treated differently from the standard stimuli at this level. However, P3 amplitude was significantly larger in the attended channel (F(1,7) = 26.65, p ≤ 0.01) upon presentation of the first rare stimulus.
Fig. 9. Grand-averaged EPs to the first rare stimuli in each channel, N = 32, across eight subjects; time base 512 ms. Solid lines = attended channel; dotted lines = rejected channel. Dots indicate where N1 and P3 peaks were measured
The estimated time course of P3 amplitude is illustrated in Fig. 10 by showing mean P3 amplitudes to the first rare stimuli, along with the averages obtained from the first and second halves of the experimental run. It can be seen that the attended channel P3 was considerably larger in amplitude to the first rare stimulus in the run than it was in subsequent averages. The amplitude of P3 in the rejected channel was consistently low, but it was not zero, and a positive peak was easily identifiable at the same latency as P3 in the attended channel in the majority of subjects. The amplitude of P3 in the rejected channel tended to decline to zero during later trials in four of the eight subjects, but this trend did not reach statistical significance.
Fig. 10. Group means for P3 amplitude over successive trials. The first data point for each channel is the mean response to the first rare stimulus, comprising single-trial P3 samples taken from each subject-run, N = 32. The subsequent two data points are means of conventional within-subject averages (N = 2,048) for the first and last half of each run
In sum, the difference in N1 amplitude between channels was not significant upon presentation of the first rare stimulus but reached significance somewhat later in the run. In contrast, P3 amplitude was larger in the attended channel from the start, and remained so throughout the run. An initial drop in P3 amplitude was evident in the attended channel, although its time course could not be determined with precision. The difference in P3 amplitude between channels was a simple function of the magnitude of P3 in the attended channel.
Sequential Analysis of Behavioural Results

In view of the complex pattern of emergence and decay of N1 tuning, it was considered important to describe the time course of performance in more detail than initially shown by simply comparing the early and late trials. Maximum likelihood estimation
Fig. 11. Target detection (Π) and guessing rates over successive trials, estimated for the performance data of eight subjects. Vertical bars indicate standard deviations
was used to describe the combined performance of all subjects on three-trial segments of performance, specifically on those trials corresponding to the time points in Fig. 3. The results are illustrated in Fig. 11. Although performance was very good from the start, some improvement occurred during the early trials, during the period when ND was maximal. There was no decline in performance when the standard tone ND declined to zero in later trials; in fact the Π parameter (hits) at this stage reached its highest level, although the guessing parameter also increased, indicating a tendency towards more guessing during later trials.
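The estimation procedure itself is described only in an unpublished manuscript (Broekhoven et al. 1981), so its exact form is not available here. As an illustration of how a detection probability and a guessing rate might be separated when subjects report only a target count per trial, the sketch below assumes a simple model that is not taken from the source: hits are binomial in the number of targets presented, false alarms are Poisson, and the reported count is their sum. The function names, the grid search, and the example counts are all hypothetical.

```python
from math import comb, exp, factorial, log


def count_probability(reported, n_targets, p_detect, guess_rate):
    """P(reported count) under the ASSUMED model:
    hits ~ Binomial(n_targets, p_detect), guesses ~ Poisson(guess_rate)."""
    total = 0.0
    for hits in range(min(n_targets, reported) + 1):
        p_hits = (comb(n_targets, hits)
                  * p_detect ** hits
                  * (1.0 - p_detect) ** (n_targets - hits))
        extra = reported - hits  # counts attributed to guessing
        p_extra = exp(-guess_rate) * guess_rate ** extra / factorial(extra)
        total += p_hits * p_extra
    return total


def log_likelihood(trials, p_detect, guess_rate):
    """Sum of log-probabilities over (targets presented, count reported) pairs."""
    return sum(log(count_probability(k, n, p_detect, guess_rate))
               for n, k in trials)


def fit_by_grid(trials, steps=50, max_guess_rate=2.0):
    """Return the (p_detect, guess_rate) grid point with the highest likelihood."""
    best_ll, best_p, best_g = float("-inf"), None, None
    for i in range(1, steps):            # p_detect strictly inside (0, 1)
        p = i / steps
        for j in range(1, steps + 1):    # guess_rate in (0, max_guess_rate]
            g = max_guess_rate * j / steps
            ll = log_likelihood(trials, p, g)
            if ll > best_ll:
                best_ll, best_p, best_g = ll, p, g
    return best_p, best_g


# Hypothetical (targets presented, count reported) pairs; NOT the study's data.
segment = [(3, 3), (4, 3), (2, 2), (5, 5), (3, 4), (4, 4)]
p_hat, g_hat = fit_by_grid(segment)
print(f"detection estimate = {p_hat:.2f}, guessing estimate = {g_hat:.2f}")
```

A grid search is used only to keep the sketch dependency-free; any numerical optimizer over the same likelihood would serve the same purpose.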
Discussion
Prior to discussing the time-course analysis, the arguments will be considered for attributing N1 and P3 tuning to attention, rather than to possible confounding factors. The difference in amplitude between channels could not be due to a physical difference in stimulation. The stimulus sequence, averaged over the four ear-pitch combinations, had identical properties in the attended and rejected channels. There were equal numbers of stimuli of both pitches and loci, and the same mean and variance for the interstimulus interval in both channels. The probability of rare tones was identical in both ears, and stimulus intensity and duration were similar in both ears. Moreover, both channels were averaged simultaneously and nonspecific changes in arousal, which would be distributed equally to both
channels, could not account for the differences between them. Since the timing of both standard tones and rare tones was unpredictable, the possibility of "time-locking" some nonspecific event onto the attended channel can be ruled out. The averaged bipolar electro-oculograms were flat, indicating that there were no eye movement confounds in the averages of either the attended or rejected channels. It is also unlikely that the N1 and P3 selection effects were due to peripheral physiological reductions in the physical intensity of stimulation. Head position, relative to the source of stimulation, could not be altered, given the firmly anchored stereo headset by which sound was delivered. Stimulus intensity was set at a moderate level (55 dB above threshold), and was therefore below the intensities at which the tensor tympani or stapedius reflexes would be engaged. Schwent and Hillyard (1975) demonstrated that, with stimulus intensities comparable to those used in this study, subjects could selectively enhance N1 and P3 in any of four different locations in auditory space. Such effects could not be achieved with the middle ear muscles, and by extension it is unlikely that the middle ear muscles played a role in the outcome of this study. The temporal independence of N1 and P3 tuning in this study also militates against such a notion; if the central effects in N1 and P3 were entirely due to such peripheral adjustments, they should have occurred in synchrony, not independently. Regarding the subjects' adherence to their instructions, the mean Π of 0.82 indicates that the subjects were identifying target tones reliably throughout the experiment; the average guessing parameter of 0.32 shows relatively little guessing, with about one false alarm every three trials. Written comments from subjects recorded after every trial indicated they experienced few intrusions from the rejected channel.
This, combined with the fact that the amplitude superiority of the attended channel occurred regardless of the particular ear-pitch combination being attended, or of the order in which they were attended, confirms the relation of N1 and P3 selection to selective attention in this study.
The Time Course of N1 Tuning

The short-term analysis showed that the N1 component to standard stimuli followed a step-function pattern during the typical 10 s trial. The first stimulus of the trial elicited a much larger response than subsequent stimuli because the 6 s intertrial interval allowed considerable recovery (Davis et al. 1966) of the N1 component to the first stimulus. The reduction
in amplitude from the second stimulus until the end of the trial was undoubtedly a rate effect, due to the fact that the 350 ms average interval between stimuli did not allow full recovery of the system from the previous stimulus. Since the first stimulus of each trial already elicited a larger N1 in the attended channel, N1 tuning must have carried through the intertrial interval from the previous trial. It would be of some interest to determine how long such a bias could be sustained in the auditory system in the absence of stimulation. The stability of the amplitude separation between channels during the typical trial shows that the mechanism of N1 tuning added a quantum, or constant voltage, to N1, rather than acting as a simple amplifier. An amplifier would have produced a voltage difference proportional to N1 amplitude, and therefore a much larger difference between channels on the first stimulus. The long-term analysis revealed a general decremental trend in N1 amplitude to standard stimuli. There was no trend towards an increase in N1 over time in any subject, and most subjects showed roughly the same decremental trend, as the curve-fitting analysis showed, after compensating for individual differences in initial amplitude and asymptote. The long-term decremental trend was too slow, occurring over minutes, to be a rate effect, and followed a negative exponential course. The decrement in the rejected channel was complete within the first minute, whereas the attended channel took almost seven minutes to reach asymptote, on average. The emergence of N1 superiority in the attended channel was a gradual process, superimposed upon the general decrement in N1 over time. One possible explanation for the gradual onset of N1 selection might be that it involves, at least in part, a physiological mechanism which by nature depends upon repeated exposure to a stimulus. Habituation is such a mechanism.
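The negative exponential course described above can be written as a decay from an initial amplitude towards an asymptote. The sketch below assumes that form, together with an arbitrary criterion for "reaching asymptote" (the remaining decrement falling to 5% of its starting size). The time constants, amplitudes, and function names are hypothetical values chosen only so that the two channels settle in roughly one and seven minutes; they are not fitted parameters from the study.

```python
from math import exp, log


def n1_amplitude(t_s, initial_uv, asymptote_uv, tau_s):
    """Amplitude at time t_s (seconds) under a negative-exponential decrement."""
    return asymptote_uv + (initial_uv - asymptote_uv) * exp(-t_s / tau_s)


def time_to_asymptote(tau_s, residual_fraction=0.05):
    """Seconds until the remaining decrement shrinks to residual_fraction."""
    return tau_s * log(1.0 / residual_fraction)


# Hypothetical time constants (seconds); NOT estimates from the paper.
channels = {"rejected": 20.0, "attended": 140.0}

for name, tau in channels.items():
    settle = time_to_asymptote(tau)
    print(f"{name}: tau = {tau:.0f} s, within 5% of asymptote after {settle / 60:.1f} min")
```

Under this form, a sevenfold difference in settling time corresponds to a sevenfold difference in time constant, whatever criterion is chosen for the asymptote.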
The decrement in N1 across trials in this study fulfilled the essential conditions for habituation: a gradual, negative-exponential course over time, recovery after the intertrial interval, and an absence of peripheral changes in the external auditory apparatus or in stimulus intensity. The emergence of N1 selection appeared to depend, at least in part, upon habituation of N1 in the rejected channel, and this required some time. However, N1 selection cannot be accounted for solely by the more rapid habituation of the rejected channel. The attended channel achieved N1 superiority before the rejected channel reached asymptote (compare Figs. 7 and 8). The distinct time course of this increment argues for a separate process from habituation. Groves and Thompson (1970) proposed a two-process theory of habituation from rather
similar evidence, derived from studies of cat spinal reflexes. They found that increments and decrements of response followed distinctive time courses, and labelled the incremental process "sensitization". They also found different unit neurons which subserved the two processes. Sensitization at a spinal level, as observed by Groves and Thompson (1970), was apparently nonspecific, whereas the incremental process reported here was highly specific, being restricted to a specific locus in auditory space. In other respects, however, it resembles the process of sensitization, and in this view, N1 tuning could be seen as the result of altering the balance between habituation and sensitization of the auditory N1 component. An alternative interpretation of these results would follow Näätänen and Michie (1979) in attributing N1 tuning to the presence of a separate negative slow wave, rather than to the sensitization of the auditory N1 component itself. In this interpretation, ND would represent an additional event restricted to the attended channel, which followed the time course indicated in Fig. 8. The long-term decrement in N1 amplitude might still represent habituation of the auditory N1 component, but N1 habituation would be seen as a separate event, possibly unrelated to attention. N1 tuning would thus be entirely accounted for by an incremental event, ND, in the attended channel, which summates with N1 in the averaged record due to an overlap in latency. The quantum nature of N1 selection, as indicated in the short-term sequential analysis, and the general lack of proportionality between ND and N1 in the long-term analysis as well, would be compatible with this theory. However, the present results do not contradict the sensitization theory either, and topographic evidence suggests that Näätänen and Michie's model may not explain the results of high-speed listening tasks such as the one used in this experiment (Hillyard 1981).
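The "quantum" argument turns on the difference between an additive and a multiplicative account of the channel separation. The sketch below contrasts the two, assuming a large N1 to the first stimulus of a trial and a smaller N1 to later stimuli. All amplitudes, the increment, and the gain are hypothetical numbers chosen for illustration, not measurements from the study.

```python
def additive_difference(rejected_uv, increment_uv):
    """Attended minus rejected when attention adds a fixed voltage."""
    attended_uv = rejected_uv + increment_uv
    return attended_uv - rejected_uv


def gain_difference(rejected_uv, gain):
    """Attended minus rejected when attention multiplies the response."""
    attended_uv = rejected_uv * gain
    return attended_uv - rejected_uv


# Hypothetical rejected-channel N1 amplitudes (microvolts); NOT the study's data.
rejected = {"first stimulus of trial": 8.0, "later stimuli": 3.0}

for label, amplitude in rejected.items():
    add = additive_difference(amplitude, increment_uv=1.5)
    mult = gain_difference(amplitude, gain=1.5)
    print(f"{label}: additive difference = {add:.1f} uV, gain difference = {mult:.1f} uV")
```

Only the additive account keeps the separation between channels constant across the large first response and the smaller later responses, which is the pattern the short-term analysis is said to show.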
Further research, manipulating both the incrementing and decrementing N1 processes, and utilizing topographic analysis, will be needed to resolve this problem. The disappearance of the standard tone ND halfway through the average run was somewhat unexpected, since performance did not deteriorate after the superiority of the attended channel had ceased. This could have been the result of a changed strategy: perhaps selection initially occurred on the basis of both location and pitch, resulting in an increased response to all stimuli in the attended ear. Then, as the subject became more attuned to the task, N1 selection narrowed, on the basis of pitch alone, to the target stimuli. In fact, the target ND did not change significantly in later trials, supporting this
interpretation. Regardless of the psychological interpretation, the physiological definition of the attended channel (the generalization gradient of ND) was altered in later trials. Apparently the selectivity of N1 tuning increased with the passage of time in this experimental paradigm, with the continued focussing of the subject's attention.
The Time Course of P3 Tuning

P3 was selectively increased in the attended channel from the presentation of the very first target stimulus in a run, and maintained its amplitude superiority throughout the typical 70-trial run. P3 in the attended channel showed a large initial drop in amplitude, a result consistent with single-channel studies of P3 habituation by Ritter et al. (1968), Frühstorfer and Bergström (1969), and Megela and Teyler (1979). However, it stabilized at more than twice the amplitude of P3 in the rejected channel. On this evidence, P3 selection is virtually instantaneous, and must therefore be generated by a system which can be preset to favor a targeted input or, at least, which can be biased very rapidly on the basis of the first few standard stimuli. The instantaneous exclusion of the P3 response to rare stimuli in the rejected channel indicates that P3 did not habituate in the rejected channel; rather, it was suppressed from the start. This implies an internal switch-like mechanism, unlike the more gradual mechanism which appears to underlie the suppression of N1 in the rejected channel. Other evidence for a switch-like P3 tuning mechanism comes from a study by Donald and Little (1981). In the latter study, the inverse proportionality of P3 to stimulus probability, which is so striking in the attended channel, was not found in the rejected channel; that is, the residual rejected channel P3 was not larger when the probability of rare stimuli was greatly reduced. This suggests that, in the rejected channel, the probability-sensitive mechanism which modulates P3 amplitude can be switched out completely, although this may only apply to P3B, which was undoubtedly the component recorded in our studies given our experimental design and the latency of the component (Squires et al. 1975).
When both the N1 and P3 data are taken into account, it is evident that the neural focus of attention, defined by the specificity of evoked potential components, went through several stages during an experimental run. The very first standard stimulus was not processed selectively, so far as these techniques could determine. The first rare stimulus showed P3 tuning, but no N1 tuning; thus the neural
focus of attention was restricted initially to long-latency processing of targets. Subsequently both standard and rare N1 components were tuned (whether in synchrony we cannot say), and at this stage the neural focus included all stimuli with the same location in auditory space as the targets. In later trials, the tuning of standard stimuli dropped out, and the focus returned to the targets alone, but unlike the initial stage of selection, tuning involved both N1 and P3. The separate time courses of N1 and P3 tuning point to different generator mechanisms for the two effects, and argue against the notion of a single, monolithic mechanism for auditory selection. It has been suggested by some authors (Hillyard and Picton 1979) that the scheme of attention proposed by Broadbent and Gregory (1964) might be extended and used as a functional model for N1 and P3 tuning. However, that model proposes a hierarchy of selection in which N1 tuning represents the primary selection process, feeding into a later P3 selection process which depends upon N1 tuning for the initial sorting out of stimuli into channels. This does not fit well with the "top-down" order of selection observed here, in which the long-latency P3 component was tuned before the acquisition of N1 tuning. The relation between the two types of tuning would appear to be more fluid than a fixed hierarchical model would suggest. In fact, the observed dynamic progression of changes in the neural focus of attention during the course of 15 min of selective listening indicates that the averaged tendency over the entire period can be quite misleading, masking the continuous evolution of the tuning process.

Acknowledgement. The assistance of Dr. Louis Broekhoven and Mr. David Dockendorff, and the helpful comments of Dr. T.W. Picton, are gratefully acknowledged. The senior author also acknowledges support received while writing this paper from the Department of Psychology, University College, London, England.
References

Axelrod S, Guzy LT, Diamond IT (1968) Perceived rate of monotic and dichotically presented clicks. J Acoust Soc Am 43:51-55
Broadbent DE (1958) Perception and communication. Pergamon Press, London
Broadbent DE, Gregory R (1964) Stimulus set and response set: The alternation of attention. Q J Exp Psychol 16:309-312
Broekhoven LH, Brooker BH, Czigler M, Donald MW (1981) Maximum likelihood estimation of the accuracy of discrimination performance in the absence of an overt response to every stimulus. Unpubl. manuscript, Queen's University, Canada
Courchesne E, Hillyard SA, Galambos R (1975) Stimulus novelty, task relevance and the visual evoked potential in man. Electroencephalogr Clin Neurophysiol 39:131-143
Davis H, Mast T, Yoshie N, Zerlin S (1966) The slow response of the human cortex to auditory stimuli: Recovery process. Electroencephalogr Clin Neurophysiol 21:105-113
Desmedt JE, Robertson D (1977) Differential enhancement of the cortical somatosensory evoked potential during forced-paced cognitive tasks in man. J Physiol (Lond) 271:761-782
Donald MW, Little R (1981) The analysis of stimulus probability inside and outside of the focus of attention, as reflected by the auditory N1 and P3 components. Can J Psychol 35:101-113
Donald MW, Young MJ (1980) Habituation and rate decrements in the auditory vertex potential during selective listening. In: Kornhuber HH, Deecke L (eds) Motivation, motor and sensory processes of the brain. Biomedical Press, Elsevier/North Holland (Progress in brain research, vol 54, pp 331-336)
Frühstorfer K, Bergström RM (1969) Human vigilance and auditory evoked responses. Electroencephalogr Clin Neurophysiol 27:346-355
Groves PM, Thompson RF (1970) Habituation: A dual-process theory. Psychol Rev 77:419-450
Harter MR, Previc FH (1978) Size-specific information channels and selective attention: Visual evoked potential and behavioral measures. Electroencephalogr Clin Neurophysiol 45:628-640
Harvey N, Treisman AM (1973) Switching attention between the ears to monitor tones. Percept Psychophys 14:51-59
Hillyard SA (1981) Selective auditory attention and early event-related potentials: A rejoinder. Can J Psychol 35:85-100
Hillyard SA, Picton TW (1979) Event-related brain potentials and selective information processing in man. In: Desmedt JE (ed) Progress in clinical neurophysiology, vol VI. Karger, Basel, pp 1-52
Hillyard SA, Hink RF, Schwent VL, Picton TW (1973) Electrical signs of attention in the human brain. Science 182:177-180
Megela A, Teyler TJ (1979) Habituation and the human evoked potential. J Comp Physiol Psychol 93:1154-1170
Moray N (1969) Attention: Selective processes in vision and hearing. Hutchinson, London
Näätänen R, Michie PT (1979) Early selective-attention effects on the evoked potentials: A critical review and reinterpretation. Biol Psychol 8:81-136
Ritter W, Vaughan HG, Costa LD (1968) Orienting and habituation to auditory stimuli: A study of short-term changes in average evoked responses. Electroencephalogr Clin Neurophysiol 25:550-556
Schwent VL, Hillyard SA (1975) Evoked potential correlates of selective attention with multichannel auditory inputs. Electroencephalogr Clin Neurophysiol 38:131-138
Squires NK, Squires KC, Hillyard SA (1975) Two varieties of long-latency positive waves evoked by unpredictable auditory stimuli in man. Electroencephalogr Clin Neurophysiol 38:387-401
Treisman AM (1971) Shifting attention between the ears to monitor tones. Q J Exp Psychol 23:157-167

Received April 6, 1981