The Psychological Record, 1965, 15, 269-274
REINFORCEMENT SCHEDULE AND THE DURABILITY OF A SECONDARY REINFORCER
MARILYN E. MILLER AND JACK WEIDNER
University of Wisconsin-Milwaukee
To investigate the possibility of a partial reinforcement effect (PRE) upon a secondary reinforcer (Sr), a neutral stimulus (light) was paired with primary reinforcement (candy) on three different reinforcement schedules, with no response required of S. The light was presented for 18 trials. On reinforced trials, candy appeared simultaneously with the light. Thirty-six 6- and 7-year-old school children served as Ss, 12 in each of three groups. Group 100 was reinforced on all trials, Group 50-A was reinforced on alternate trials, and Group 50-R was reinforced on a randomly-determined half of the trials. Following acquisition, a lever was introduced, and each lever press was reinforced with the light alone. Group 50-R pressed the lever at a higher rate and for a longer time than Groups 50-A and 100; Group 50-A responded at a higher rate and for longer than Group 100. It was concluded that a PRE upon the extinction of secondary reinforcing properties had been demonstrated.
It has been well established that a partially-reinforced response is more resistant to extinction than one that is continuously reinforced. Therefore, it has been reasoned that partial reinforcement of a neutral stimulus would produce a more durable secondary reinforcer (Sr) than would continuous reinforcement. In several animal experiments designed to investigate this hypothesis (Dinsmoor, 1952; Melching, 1954; Saltzman, 1949), the establishment of Sr occurred simultaneously with the acquisition of an operant or instrumental response. The effectiveness of Sr was measured as a function of the resistance to extinction of the conditioned response when it was reinforced only with Sr. In such designs, there are different methods of achieving partial reinforcement, as well as different ways in which the reinforcement schedules of the stimulus and the response can be combined. Since results appear to be affected by the kind of schedule combination, it is best to outline several possible combinations before considering these experiments in any detail.
In conditioning the neutral stimulus, the schedule is continuous if the stimulus and primary reinforcer always occur together, partial if they occur together on only some trials. However, a partial schedule can be achieved either by presenting the neutral stimulus more often than the primary reinforcer (partial-1) or by presenting the primary reinforcer more often than the stimulus (partial-2). A response is continuously or partially reinforced if it is, respectively, always or sometimes followed by the primary reinforcer. When a neutral stimulus and a
response are conditioned simultaneously, either or both may be partially or continuously reinforced. Letting the first schedule be used for the stimulus and the second for the response, at least the following four schedule combinations are possible: (a) continuous-continuous, i.e., the stimulus and reinforcer always occur together and both follow each response; (b) continuous-partial, i.e., some but not all of the responses are followed by both the stimulus and reinforcer; (c) partial-continuous, i.e., partial-2: the primary reinforcer follows each response, the stimulus follows some responses; (d) partial-partial, i.e., partial-1: the stimulus follows each response, the reinforcer follows some responses.
Both Saltzman (1949) and Dinsmoor (1952) compared a partial-partial reinforcement group with a continuous-continuous reinforcement group and found greater resistance to extinction of the conditioned response following partial reinforcement. Although this was interpreted as a partial reinforcement effect (PRE) upon the extinction of secondary reinforcing properties, it is impossible to determine whether the PRE is associated with the response or with the Sr, since both had been partially reinforced during acquisition.
Melching (1954) found no evidence for PRE following a partial-continuous reinforcement schedule. He suggested that his results were consistent with a stimulus generalization hypothesis; namely, that resistance to extinction of a conditioned response is directly related to the degree of similarity between acquisition and extinction stimulus situations. Since the response was continuously reinforced for all of Melching's groups, differences in the degree of similarity between acquisition and extinction were determined by differences in the proportion of Sr presentations during these two sessions.
With this design, 100% presentation of Sr during both acquisition and extinction represents a greater degree of similarity than does 50% Sr-presentation during acquisition and 100% Sr-presentation during extinction. Therefore the continuously-reinforced group should be more resistant to extinction than the partially-reinforced group, and that is essentially what Melching found.
It seems clear that simultaneous conditioning of a response and an Sr complicates the determination of the effect of reinforcement schedules upon Sr. Melching suggested that investigation of the variables affecting a secondary reinforcer would best be accomplished in a design which established Sr in one situation and tested its effectiveness in a different situation.
Zimmerman (1957), using such a design, indicated that a durable secondary reinforcer could be established by intermittently pairing (partial-1) the neutral stimulus (buzzer) with primary reinforcement (water), followed by the intermittent reinforcement of an operant (bar press) with the Sr (buzzer). In a later experiment by Zimmerman (1959), a buzzer signaled the opening of a start box (SB) door permitting rats to traverse a runway to a goal box (GB) in which they were reinforced with food on an increasing variable ratio (VR) schedule. On the extinction series, food was no longer present in the GB, a bar was inserted into the SB, and on an increasing VR schedule,
a bar press sounded the buzzer followed by the opening of the SB door. Zimmerman reported "thousands" of responses before extinction of the bar press finally occurred. Since no other investigator has been able to demonstrate such a durable Sr, the implication was that durability of the secondary reinforcing properties of the combined buzzer and opening SB door was due to the dual intermittent schedules used by Zimmerman. However, Wike, Platt, and Knowles (1962) presented evidence that escape from the SB was reinforcing without any secondary reinforcement training. In an extended replication of the Zimmerman experiment (1959), Wike and Platt (1962) compared an increasing VR schedule with a 100% schedule and found no PRE. They noted, however, a great deal of response variability which may have masked any real effects. Thus Zimmerman's (1959) results concerning a possible PRE upon Sr are equivocal since he did not compare continuously- with partially-reinforced groups, and Wike and Platt's (1962) findings were inconclusive due to extreme variability.
It appears that the PRE upon the extinction of reinforcing properties of a stimulus has not, as yet, been effectively tested. The present experiment was designed to make such a test, using children as Ss. During acquisition, no response was required of S. A light occurred on every trial and was accompanied by primary reinforcement (candy) on 100% or 50% of the trials. The 50% group was further divided so that half received primary reinforcement on randomly-determined trials (50-R), and half received primary reinforcement on alternate trials (50-A). Following acquisition of the Sr, a lever was introduced and each lever press was reinforced by the Sr only. Durability of the Sr was measured by the number of bar presses made prior to one of the criteria of extinction.
METHOD
Subjects
Thirty-six 6- and 7-year-old primary-one children, enrolled in a Milwaukee public school, served as Ss.
Apparatus
A 2½ x 3 ft. black plywood panel separated E from S. A 1-in. aluminum pipe protruded through S's side of the panel and was used to dispense candy corn into a small plastic cup. A 15-watt lamp was also mounted on S's side. A removable lever switch unit, mounted on a small wooden base, was used as the operant response device. The number of lever presses made in each 15-sec. period was automatically recorded.
Procedure
For the acquisition session, 12 Ss were randomly assigned to each of the three treatment groups. All Ss were given standardized explanations of the apparatus, and all were told to observe the lamp and the candy dispenser. They were instructed to remove any candy from the dispenser and to keep it in a paper cup until the session was finished. All Ss agreed that they liked candy corn.
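The random assignment of 12 Ss to each group, and the three 18-trial acquisition schedules summarized in the abstract, can be sketched in code. This is only an illustrative sketch, not the authors' procedure: the function names and the seed are hypothetical, and the paper does not state whether Group 50-A began with a reinforced or a nonreinforced trial, so a reinforced first trial is assumed.

```python
import random

N_TRIALS = 18                      # light presentations per S
GROUPS = ("100", "50-A", "50-R")   # group labels used in the paper


def assign_groups(subject_ids, rng):
    """Randomly split the Ss into equal-sized groups (12 each for 36 Ss)."""
    ids = list(subject_ids)
    rng.shuffle(ids)
    size = len(ids) // len(GROUPS)
    return {g: sorted(ids[i * size:(i + 1) * size]) for i, g in enumerate(GROUPS)}


def make_schedule(group, rng, n_trials=N_TRIALS):
    """Return one boolean per trial: True where candy accompanies the light."""
    if group == "100":
        return [True] * n_trials
    if group == "50-A":
        # Assumption: the first trial is reinforced; the paper does not say.
        return [t % 2 == 0 for t in range(n_trials)]
    if group == "50-R":
        # Exactly half of the trials reinforced, in a random order for each S.
        half = n_trials // 2
        schedule = [True] * half + [False] * (n_trials - half)
        rng.shuffle(schedule)
        return schedule
    raise ValueError(f"unknown group: {group}")


rng = random.Random(1965)          # arbitrary seed, for reproducibility only
assignment = assign_groups(range(1, 37), rng)
schedules = {g: make_schedule(g, rng) for g in GROUPS}
```

In an actual session a fresh 50-R schedule would be drawn for each S, since the random order was determined separately for every child in that group.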
The lamp was lit for 2 sec. followed by a 5-sec. intertrial interval for a total of 18 trials. On reinforced trials, a candy corn appeared simultaneously with the light presentation. Both candy and light were presented on all trials for Group 100; candy was presented with light on alternate trials for Group 50-A; and candy was presented with light on a randomly-determined schedule for each S in Group 50-R, with the restriction that candy was presented on exactly half of the trials. Thus, Group 100 had 18 light-candy pairings, while Groups 50-A and 50-R had 9 pairings.
The acquisition and extinction sessions were separated by a SO-sec. interval. The response lever, previously concealed from S, was placed before him with the instructions to "pull the lever and see what happens." Each S was also told to press as slowly or rapidly, as much or as little as he desired, and to stop whenever he "felt like it." Each bar press automatically lighted the lamp.
One of the following three criteria defined extinction: (a) S stated a desire to stop responding; (b) no response occurred in 90 consecutive sec.; or (c) S left the situation. Following extinction, each S was asked not to discuss his experience with his classmates, and the Ss in the partially-reinforced groups were given an additional 9 pieces of candy.
RESULTS
One S in Group 50-A and two in Group 50-R met the criterion of no response in 90 consecutive sec. All other Ss said they wanted to stop, at which point the session was terminated.

TABLE 1
NUMBER OF Ss AND LENGTH OF TIME TO REACH EXTINCTION OF BAR-PRESSING RESPONSE

                          Minutes
              1    2    3    4    5    6    7    8    9
Group 100     0    3    4    4    1
Group 50-A    0    2    2    3    0    3    0    1    1
Group 50-R    0    0    2    0    1    1    1    3    4

A frequency distribution of Ss and the length of time to reach criterion is shown for each group in Table 1. More Ss responded longer in Group 50-R than in Groups 50-A and 100, and more Ss in Group 50-A responded longer than in Group 100. The mean response times for Groups 100, 50-A, and 50-R were 2.88, 4.50, and 6.56 min., respectively.
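For the three Ss who met criterion (b), time to criterion could in principle be read from the 15-sec. lever-press counts recorded by the apparatus. The sketch below is one hypothetical way to do so; the function name and the example counts are illustrative and are not data from the study. It assumes that the 90 sec. of no responding align with the 15-sec. recording periods and that time to criterion includes the silent 90 sec., neither of which the paper specifies.

```python
PERIOD_SEC = 15      # recording period of the apparatus
CRITERION_SEC = 90   # criterion (b): no response in 90 consecutive sec.


def time_to_criterion(counts, period=PERIOD_SEC, criterion=CRITERION_SEC):
    """Return elapsed seconds when the no-response criterion is met, else None.

    counts: lever presses in each consecutive recording period.
    """
    needed = criterion // period   # six empty 15-sec. periods in a row
    run = 0                        # length of the current run of empty periods
    for i, presses in enumerate(counts):
        run = run + 1 if presses == 0 else 0
        if run == needed:
            return (i + 1) * period
    return None


# Hypothetical record: responding tapers off, then six empty periods follow.
example = [12, 9, 7, 3, 0, 1, 0, 0, 0, 0, 0, 0]
elapsed = time_to_criterion(example)   # seconds; divide by 60 for minutes
```

A record that ends before six consecutive empty periods returns None, which corresponds to a session ended by criterion (a) or (c) instead.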
A simple analysis of variance indicated that the means differed significantly (F=10.78; df=2,33; p<.001), and the Newman-Keuls method (Winer, 1962, p. 80) further indicated that each mean differed significantly from each of the others (p≤.05). As might have been expected from the time scores, the mean total responses were 119.00, 174.50, and 262.42 for Groups 100, 50-A, and 50-R, respectively. The overall difference was significant (F=11.14; df=2,33; p<.001), and each mean differed significantly from each of the others (p≤.05). In terms of time and number of responses, it is quite clear that Group 50-R was most resistant to extinction, Group 100 was least resistant, and Group 50-A was intermediate.
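The reported degrees of freedom follow from the design (3 groups of 12 Ss), and the between-groups mean square can be recovered from the group means alone. The sketch below works this through for the time scores; the function name is hypothetical, and the within-groups mean square it returns is only the value implied by the reported F, not a figure given in the paper. Because the published means are rounded, the results are approximate.

```python
def between_groups_summary(means, n_per_group, f_ratio):
    """One-way ANOVA quantities recoverable from equal-n group means and F."""
    k = len(means)
    grand_mean = sum(means) / k                  # valid because n is equal
    ss_between = n_per_group * sum((m - grand_mean) ** 2 for m in means)
    df_between = k - 1
    df_within = k * (n_per_group - 1)
    ms_between = ss_between / df_between
    ms_within = ms_between / f_ratio             # implied by F, not reported
    return df_between, df_within, ms_between, ms_within


# Mean response times (min.) for Groups 100, 50-A, and 50-R, with F = 10.78.
df_b, df_w, ms_b, ms_w = between_groups_summary([2.88, 4.50, 6.56], 12, 10.78)
```

With these inputs the degrees of freedom come out as 2 and 33, matching the text, with a between-groups mean square near 40.8 and an implied within-groups mean square near 3.8. The same function can be applied to the mean total responses with F = 11.14.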
[Figure 1: cumulative bar-press curves for Group 50-R, Group 50-A, and Group 100; horizontal axis in minutes, vertical axis marked from 500 to 3500.]
Fig. 1. Cumulative total number of bar-presses in 30-sec. intervals.
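Curves of the kind shown in Fig. 1 could be built from the 15-sec. counts by pooling adjacent periods into 30-sec. intervals and accumulating across intervals and Ss. The sketch below is a hypothetical reconstruction: the function names and example counts are illustrative, and summing over Ss (rather than averaging) is an assumption, though it is consistent with the vertical axis, since 12 Ss at a mean of 262.42 responses give about 3,149 responses for Group 50-R.

```python
from itertools import accumulate


def cumulative_by_interval(counts_15s, periods_per_interval=2):
    """Pool 15-sec. counts into 30-sec. intervals and return running totals."""
    pooled = [sum(counts_15s[i:i + periods_per_interval])
              for i in range(0, len(counts_15s), periods_per_interval)]
    return list(accumulate(pooled))


def group_curve(subject_counts):
    """Sum the cumulative curves of all Ss in a group.

    An S who has stopped responding keeps contributing his final total,
    so the group curve never decreases. Each record must be non-empty.
    """
    curves = [cumulative_by_interval(c) for c in subject_counts]
    length = max(len(c) for c in curves)
    padded = [c + [c[-1]] * (length - len(c)) for c in curves]
    return [sum(column) for column in zip(*padded)]


# Two hypothetical Ss: one responds for 60 sec., the other for 30 sec.
curve = group_curve([[10, 8, 6, 4], [5, 5]])
```

The slope of such a curve between successive 30-sec. points gives the group response rate for that interval, which is the basis for the rate comparisons drawn from the figure.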
Comparisons of the rate of responding among groups may be made from the cumulative response curves shown in Fig. 1. In the first 30 sec. all groups responded at the same rate. In the second 30 sec. Group 100 responded at a higher rate than either of the partially-reinforced groups, followed immediately by a steadily decreasing rate until extinction was reached. Through 1½ min. the two partial groups responded at the same rate; thereafter the random group maintained a higher, steadier rate than did the alternate group.
DISCUSSION
It is generally assumed that when a neutral stimulus is paired with primary reinforcement, the neutral stimulus acquires secondary reinforcing properties, and when the neutral stimulus is subsequently presented without primary reinforcement, extinction of its reinforcing properties occurs. The effectiveness with which the previously neutral stimulus maintains an observable response seems a reasonable measure of the rate of extinction of its reinforcing properties. The present results indicated that the light, which had previously been paired with candy, maintained a bar-pressing response at a higher rate and for a longer time when the pairings had been on a
partial, compared with a continuous, schedule. Bar pressing also occurred at a higher rate and for a longer time when the partial schedule was random, compared with alternate.
It may be suggested that a group having no light-candy pairings, prior to use of the light as a reinforcer, should have been included in order to demonstrate unequivocally that the reinforcing properties of the light were acquired rather than intrinsic. Lack of such a group does not seem to affect the interpretation of the results. The reinforcement schedule clearly influenced the effectiveness with which the light served as a reinforcer, and the results seem to demonstrate a PRE. It should also be noted that the PRE was observed despite the continuously-reinforced group having received twice as many light-candy pairings as the partially-reinforced groups.
The findings are consistent with both the stimulus generalization and the discrimination hypotheses. The inclusion of nonreinforced trials during acquisition makes acquisition and extinction more similar and less discriminable for partially- compared with continuously-reinforced groups, so the former are more resistant to extinction. Acquisition and extinction situations are more similar and less discriminable for the random-partial group than for the alternate-partial group, since a random schedule includes longer runs of nonreinforced trials in an unpredictable order.
Although a PRE upon the extinction of secondary reinforcing properties has been demonstrated when young children served as Ss, it remains to be determined whether similar results would occur in a similarly designed rat study. It is possible that the effect depends upon mediating verbal or cognitive responses and may, therefore, be specific to human Ss.
Investigation of the relative effects of partial-1 and partial-2 schedules, with different proportions of Sr presentation during extinction, should yield more precise information concerning the effect of reinforcement schedules upon secondary reinforcing properties.
REFERENCES
DINSMOOR, J. A. Resistance to extinction following periodic reinforcement in the presence of a discriminative stimulus. J. comp. physiol. Psychol., 1952, 45, 31-35.
MELCHING, W. H. The acquired reward value of an intermittently presented neutral stimulus. J. comp. physiol. Psychol., 1954, 47, 370-374.
SALTZMAN, I. J. Maze learning in the absence of primary reinforcement: A study of secondary reinforcement. J. comp. physiol. Psychol., 1949, 42, 161-173.
WIKE, E. L., & PLATT, J. R. Reinforcement schedules and bar pressing: Some extensions of Zimmerman's work. Psychol. Rec., 1962, 12, 273-278.
WIKE, E. L., PLATT, J. R., & KNOWLES, J. M. The reward value of getting out of a starting box: Further extensions of Zimmerman's work. Psychol. Rec., 1962, 12, 397-400.
WINER, B. J. Statistical principles in experimental design. New York: McGraw-Hill, 1962.
ZIMMERMAN, D. W. Durable secondary reinforcement: Method and theory. Psychol. Rev., 1957, 64, 373-383.
ZIMMERMAN, D. W. Sustained performance in rats based on secondary reinforcement. J. comp. physiol. Psychol., 1959, 52, 353-358.