PART ONE - HISTORY OF RESEARCH
EARLY HISTORY OF RESEARCH ON TEACHER BEHAVIOR

by DONALD M. MEDLEY, University of Virginia

Studies designed to shed light on the question "What makes a good teacher?" have been appearing in the literature ever since the year 1896.¹ But it is only during the last decade or so that many of these studies have incorporated objective measurements of teacher behavior as variables distinct from measures of teacher effectiveness. Since 1960 there has been a dramatic increase of interest in the analysis of the teaching process. A survey of the literature published in 1963² was barely able to turn up a score of studies using objective procedures for analyzing teachers' classroom behavior; now an admittedly incomplete anthology of instruments of this type runs to sixteen volumes.³
Scope and Purpose of this Paper

It will be the purpose of this paper to examine briefly what happened during the first sixty years of educational research that could account for this sudden - and rather late - emergence of interest in analyzing the teaching act. Although the volume of relevant literature is considerable, the treatment here will be brief; instead of attempting to review every important study done between 1896 and 1955 we shall examine only a relatively small number of landmark studies and some admittedly secondary sources in an effort to sketch the broad outline of what happened. This sketch will be colored with the author's own personal interpretations of some of the events in an attempt to make sense of them, but the outline will be as accurate and objective as possible.

The period with which we are concerned was one during which research in the psychology of learning enjoyed tremendous prestige among teacher educators. Such research produced, and continues to produce, prescriptions and suggestions for effective teacher behavior which come from outside the classroom - ones which derive from research actually done in some domain of behavior other than that of teacher-pupil interaction. Despite the great influence such research has had - and continues to have - on the theory and practice of teaching, we do not consider it relevant to this discussion.

For similar reasons, we will not be concerned with experimental research in methods of teaching. The typical methods experiment involves the testing of some a priori hypothesis about effective teaching by comparing the effectiveness of teachers using (or presumed to be using) one method and teachers using another method. These experiments generally ignore any differences in teacher behavior other than those prescribed by the methods, or at best regard them as a source of error. Such an approach is not likely to add to our understanding of teacher behavior and its effects on pupils.

Here we shall be concerned only with research designed to increase our understanding of the nature of effective teaching by studying how teachers who differ in effectiveness differ in behavior.
Early Studies: Lists of Teacher Traits

The earliest research into this question, including the pioneer study already cited,⁴ tried to use students as observers. Large numbers of students were asked to describe the "best" teachers they had ever had and the descriptions were subjected to a form of content analysis which yielded lists of characteristics of "good" teachers. Many such studies appeared between 1896 and 1955; we shall examine only one, which was in a sense the culminating effort along these lines.⁵

In this study each of some 10,000 high school seniors was asked to identify and describe the teachers he had liked best and least of all the teachers he had ever had. Each student was also asked to describe the most effective teacher he had ever had, unless the one he liked best and the one who was the most effective were the same person. Three groups of teachers were isolated for study - those nominated as best liked by 3,725 seniors, those nominated as least liked by the same group, and those nominated as most effective but not best liked by a subgroup of 763 who described such teachers. The six characteristics which discriminate best liked from least liked teachers were as follows:

1. teaching skill (clear explanations, use of examples, well organized, etc.)
2. cheerful, good-natured, patient, not irritable
3. friendly, companionable, not aloof
4. interested in pupils, understands them
5. impartial - does not have "teacher's pets"
6. fair in grading and marking

This list is typical of those obtained in other studies (except for minor differences in the order of the characteristics) even though in most other cases the students were asked to describe their "best" teacher rather than the one that they liked best. This list may be said to contain the principal words used by pupils in describing good teachers. It should be noted that
only one trait (teaching skill) was mentioned by as many as half of the students. This suggests that not all pupils agree on what kind of teacher they prefer. From the point of view of a teacher looking for ways to improve his teaching, then, these lists are somewhat unsatisfactory. Which pupils are right - those who choose these traits or those who do not?

When we examine the characteristics given as distinguishing most effective teachers from those liked best, the amount of agreement is even lower. Only four items were suggested by as many as ten percent of the students as characteristic of effective teachers:

1. makes greater demands of students (suggested by 25%)
2. has more teaching skill (20%)
3. has more knowledge of subject-matter (10%)
4. has better discipline (10%)

That this line of research was not more successful should not surprise us too much. The fact is that the typical student has no more insight into the dynamics of teacher effectiveness than anyone else. When asked to describe a good teacher, he produces a mixture of trivia, banality, and common sense that adds nothing to what is already generally believed. Not only are such descriptions devoid of new content; they tend also to be couched in terms too vague to be useful to a teacher who needs specific information rather than pious generalities.

Beginning about 1917,⁶ researchers began to ask these questions of experts - school administrators, professors of education and others whose opinions should have had greater validity than those of students. This approach was pushed about as far as it could go in the Commonwealth Teacher Training Study published in 1929.⁷ Exhaustive and meticulous research identified some 83 traits that experts judged important to teacher effectiveness. When closely similar traits were combined the list was reduced to 25. The first six traits on this list were:

1. adaptability
2. considerateness
3. enthusiasm
4. good judgment
5. honesty
6. magnetism

This list seems even less useful than the students'; the traits students mentioned were at least closer to classroom behavior than those proposed by the experts.

A third popular approach was to look at rating scales used for teacher evaluation and see what was generally considered important enough to rate. The first teacher rating scale seems to have appeared in 1913;⁸ by
1930, Barr was able to locate 209 such devices to analyze.⁹ His study focused on areas of concern, rather than on traits as such. Here are the first six on his list:

1. instruction
2. classroom management
3. professional attitude
4. choice of subject matter
5. personal habits
6. discipline

While each of these areas was mentioned in a substantial proportion of scales reviewed there was nothing like perfect agreement on any of them - that is, there was little consensus even on the areas to be rated, let alone on the behaviors important in a given area. This list might give a teacher an idea of the areas of behavior to concentrate on but he would get little information about exactly how to behave in order to become a more effective teacher. The relatively small amount of overlap among the different lists does not improve the situation, either.

The most serious limitation in this approach to the problem of describing the effective teacher is that none of the studies included any measure of teacher effects on pupils. Everything in them is a matter of opinion. Barr and others showed as early as 1935¹⁰ what is wrong with using someone's opinion as a criterion measure of teacher effectiveness. They found correlations between ratings of teachers and mean pupil gains on achievement tests that ranged from -.15 to +.36 with a mean of +.16 - results which have been verified many times since then.¹¹ This suggests that it is a waste of time for a teacher to study the results of any research we have been discussing so far, especially if he is interested in becoming more effective in helping pupils learn. And it also suggests that the problem is not a simple one to solve.

Objective Supervision: A False Start

It was at about this point in the history of research in teacher effectiveness that there began to be some agitation for the use of objective instrumentation in supervision, in place of the highly subjective rating scales then available. This development was chronicled in a book on supervision which appeared in 1931.¹² The book contained a compilation of instruments and techniques for assessing teacher competence then available. As might be expected, many of them were designed around broadly-defined traits on which the teacher was to be rated by a supervisor. Also present, however, were a few checklists which were more objective than the rating scale.
These lists were usually based on analyses of activities teachers were engaged in - or ought to be. The earliest study using this approach - attempting to describe what a teacher does rather than how well he does it - was Stevens' fine study of questioning behavior which appeared at about the same time as the first teacher rating scale.¹³ Barr, the author of the book on supervision, was well aware of the importance of using such procedures to differentiate effective and ineffective teachers: he himself had published a pioneer study of this type comparing teachers identified as "good" and "poor" on a host of characteristics, including many specific behavior items.¹⁴ Although he failed to find any significant difference, Barr was convinced that this was the direction that research in teacher effectiveness should follow, and in fact devoted the rest of his professional life to this line of research.
The Child Study Movement

There was an important concurrent series of developments. In 1920, with the formation of the Committee on Child Development of the National Research Council, there had begun a movement which, although not directly concerned with the study of teacher effectiveness, did have an important impact on the development of instrumentation used in such research. The early preoccupation of the child study movement was with identifying "normal and deviate patterns of behavior ... and with the development of reliable indices of individual growth and development."¹⁵ Since the children to be studied were too young to be tested or interviewed or to answer questionnaires, scientists interested in such children fell back on direct observation of behavior. And since the most convenient place to observe children was in the nursery school, kindergarten, or primary school, these researchers made direct observations of classroom behavior - not to see how effective the teacher was, but to see what was going on.

Early studies used informal observations and recorded behaviors in diaries or logs; these soon proved unsatisfactory. Gradually a group at the University of Iowa¹⁶ began to move toward the use of observation schedules with preassigned categories into which behaviors were tallied as they were observed. Olson is generally credited with introducing a technique of time sampling, in which the variable quantified is the time spent in behavior of a certain type.¹⁷

Dorothy Swayne Thomas and others at Teachers College saw in this approach the possibility of developing quantified data about human behavior which met high standards of objectivity and yielded data which they felt could be studied by the scientific method; that is, data which "... became independent of our observers within a small and predictable range of error."¹⁸ The culmination of this movement seems to have been a five year study mainly devoted to the development of a battery of such instruments.¹⁹

The basic approach used in developing a behavior measure was to identify some single item of behavior which could be observed reliably and which could be regarded as symptomatic of an important dimension of behavior. The time devoted to that activity was then used as a measure of the behavior trait. Such measures were required to meet extremely high standards of reliability - i.e., observer agreement.

What seems to have happened is that these standards for reliability could not be met without sacrificing validity. Behaviors which could be recorded accurately enough to satisfy this group did not symptomatize anything important enough to be worth studying. Jersild described this dilemma as follows:

"A [observational] procedure obviously fails in achieving its purpose if objectivity, in a literal sense, and reliability, in the statistical sense, are gained by sacrificing more and more of the substance with which a study purports to deal. Further, for many purposes, it is more important to consider given items of behavior in terms of their context and the pattern of which they are a part than to obtain simply an accumulation of isolated tallies."²⁰

Interim Period
The foundations for the development of a branch of educational research devoted to the analysis of the teaching act may be said to have been laid by 1931. In the child study movement the basic work necessary to the development of observational methodology had been done. In the search for understanding of what makes an effective teacher most of the blind alleys had been explored and the direction to follow had been made clear by A. S. Barr.²¹ But the explosion of interest did not take place for another 25 years.

A number of factors may account for this delay. Workers in both areas may well have been discouraged by the predominantly negative results they had obtained. Although the exploration of blind alleys is recognized as a necessary and valuable stage in the advancement of knowledge, continuously negative results are not reinforcing to the individual scientists involved. The replacement of Watsonian behaviorism by the field psychologies in the fickle affections of teacher educators, the abandonment of the cult of efficiency, and the advent of progressive education with its focus on the "whole child" may also have made attention to what some critics have called the "minutiae" of education unfashionable.

Whatever the factors may have been which prevented the movement from flourishing, they did not seem to affect Barr; in 1954 he was able to devote an entire issue of the Journal of Experimental Education to a review of 75 studies done during this period at Wisconsin under his guidance,²² all of which were concerned in one way or another with the study of teacher characteristics. Only one of the 75 studies actually measured both teacher behaviors and teacher effectiveness - that done by Jayne in 1945.²³ Was it coincidental that in this study Jayne introduced a key methodological innovation which had, apparently without his knowledge, also been introduced in 1939 by H. H. Anderson, working in the tradition of the child study movement?²⁴

It has been suggested that the lack of interest in observational studies manifest in the thirties and forties may have been due to an apparent triviality in what they studied. This was a natural outcome of the dilemma, noted above, resulting from their inability to increase reliability without sacrificing validity. It is small wonder that supervisors, teacher educators, and even research workers tended to prefer ratings which, whatever their limitations, did attempt to measure important characteristics of teachers. The extensive and meticulous Teacher Characteristics Study conducted by Ryans²⁵ set high standards for research in teacher behavior.

What Anderson and Jayne both did was to define behavior traits or dimensions, not unlike those on rating scales, as a composite of a number of specific behaviors (or categories), each of which had something in common with the others, but each of which could be observed and recorded separately.
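The idea of a composite built from separately tallied categories can be made concrete with a small sketch. It is only an illustration under stated assumptions: the category names are hypothetical and are not taken from Anderson's or Jayne's instruments, the two dimension labels are borrowed from Anderson's two types of contact, and the simple difference of summed frequencies is one plausible way to place a classroom on a single dimension.

```python
from collections import Counter

# Hypothetical observation categories, grouped into two dimensions.
# The names are illustrative assumptions, not items from any published schedule.
DIMENSIONS = {
    "integrative": ["accepts_pupil_idea", "asks_open_question", "offers_choice"],
    "dominative": ["gives_order", "reproves", "refuses_request"],
}


def tally(observed_events):
    """Count how often each specific behavior category was recorded."""
    return Counter(observed_events)


def composite(tallies, categories):
    """Sum the frequencies of the categories that make up one dimension."""
    return sum(tallies[c] for c in categories)


def difference_score(tallies):
    """Integrative minus dominative frequency: one number per classroom."""
    return (composite(tallies, DIMENSIONS["integrative"])
            - composite(tallies, DIMENSIONS["dominative"]))


# One observer's record of a short visit, one entry per observed contact.
visit = ["gives_order", "asks_open_question", "accepts_pupil_idea",
         "reproves", "asks_open_question", "offers_choice"]

counts = tally(visit)
score = difference_score(counts)
```

Each entry in `visit` is a narrow, easily agreed-upon observation, while `score` summarizes a broader pattern; a teacher who wanted to shift the score could look back at `counts` to see which specific behaviors contribute to it.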
Anderson spoke of dominative and integrative contacts, and saw that classrooms could be ordered reliably and meaningfully along a single dimension according to the difference between frequencies of contacts of the two types. Jayne proposed two such measures which he called the Index of Meaningful Discussion and the Index of Immediate Recall. Each was defined as a composite of frequencies of behaviors of several kinds, and each laid claim to measuring an aspect of behavior which transcended in importance any of the individual items of which it was composed.

Such dimensions have two important characteristics: (1) they measure meaningful and potentially important behavior patterns or traits (Jayne's Index of Meaningful Discussion did, in fact, correlate significantly with teacher effectiveness); (2) they retain the objectivity and reliability of the original items on which they are based - items too "minute" to be meaningful by themselves, but specific enough to be observed reliably. In addition, because the dimensions measured are defined in terms of specific behaviors, the teacher interested in changing his position on such a dimension can find out exactly what he needs to do by studying the specific behaviors on which it is based.

In retrospect, one can trace the progress of this innovation along two lines: from Anderson's work through that of Withall²⁶ with his "climate index" to the work of Flanders and the I-D ratio;²⁷ and from Jayne through the work of Cornell, Lindvall and Saupe²⁸ to the early forms of OScAR.²⁹

With this development the study of the teaching act came of age by making it possible to measure important dimensions of classroom behavior with sufficient objectivity for quantitative scientific analysis. Exploitation of this new methodology was greatly facilitated by the increased availability of federal funds, by the development of high speed computers and inexpensive videotape equipment, and by advances in statistical methodology, all happening at about the same time. Readers of this journal will find in it an account of the birth of what amounts to a new branch of science - a science of effective teacher behavior.
NOTES

1. H. E. KRATZ, "Characteristics of the Best Teachers as Recognized by Children", Pedagogical Seminary, 3, 1896, pp. 413-418.
2. D. M. MEDLEY and H. E. MITZEL, "Measuring Classroom Behavior by Systematic Observation", in N. L. Gage (ed.), Handbook of Research on Teaching, Chicago: Rand McNally, 1963, pp. 247-328.
3. A. SIMON and E. G. BOYER, Mirrors for Behavior: an Anthology of Observation Instruments. Philadelphia: Research for Better Schools Inc., Volumes 1-6, 1967; 7-14, 1970.
4. H. E. KRATZ, op. cit.
5. F. W. HART, Teachers and Teaching: by Ten Thousand High School Seniors. New York: MacMillan, 1936.
6. W. N. ANDERSON, "The Selection of Teachers", Educational Administration and Supervision, 3, 1917, pp. 83-90.
7. W. W. CHARTERS and D. WAPLES, The Commonwealth Teacher-Training Study, Chicago: University of Chicago Press, 1929.
8. E. C. ELLIOTT, "A Tentative Scale for the Measurement of Teaching Efficiency", in Teachers' Yearbook of Educational Investigations. New York: Department of Research, New York City Schools, 1915.
9. A. S. BARR and L. M. EMANS, "What Qualities are Prerequisite to Success in Teaching?", Nation's Schools, 6, 1930, pp. 60-64.
10. A. S. BARR et al., "The Validity of Certain Instruments Employed in the Measurement of Teaching Ability", in The Measurement of Teaching Efficiency, Helen M. Walker (ed.). New York: MacMillan, 1935, pp. 75-141.
11. D. M. MEDLEY and H. E. MITZEL, "Some Behavioral Correlates of Teacher Effectiveness", Journal of Educational Psychology, 50, 1959, p. 244.
12. A. S. BARR, An Introduction to the Scientific Study of Classroom Supervision. New York: D. Appleton, 1931.
13. R. STEVENS, The Question as a Measure of Efficiency in Instruction. Contributions to Education No. 48, New York: Teachers College, Columbia University, 1912.
14. A. S. BARR, Characteristic Differences in the Teaching Performance of Good and Poor Teachers of the Social Studies, Bloomington, Ill.: Public School Publishing Company, 1929.
15. R. E. ARRINGTON, "Time-Sampling Studies of Child Behavior", Psychological Monographs, 51, 1939, p. 4.
16. See, for example, E. V. BERNE, "An Experimental Investigation of Social Behavior Patterns in Young Children", University of Iowa Studies in Child Welfare, 4, 1930.
17. W. C. OLSON, "The Measurement of Nervous Habits in Normal Children", Institute of Child Welfare Monographs, 3, 1929.
18. D. S. THOMAS, "An Attempt to Develop Precise Measurements in the Social Behavior Field", Sociologus, 8, 1932, p. 456 (cited in Arrington, op. cit., p. 18).
19. ARRINGTON, op. cit., pp. 37-184.
20. A. T. JERSILD and M. F. MEIGS, "Direct Observation as a Research Method", Review of Educational Research, 9, 1939.
21. BARR, see Note 14.
22. A. S. BARR (ed.), "Wisconsin Studies of the Measurement and Prediction of Teacher Effectiveness", Journal of Experimental Education, 30, 1961, pp. 5-156.
23. C. D. JAYNE, "A Study of the Relationship Between Teaching Procedures and Educational Outcomes", Journal of Experimental Education, 14, 1945, pp. 101-134.
24. H. H. ANDERSON, "The Measurement of Dominative and of Socially Integrative Behavior in Teachers' Contacts with Children", Child Development, 10, 1939, pp. 73-89.
25. D. G. RYANS, Characteristics of Teachers. Washington, D.C.: American Council on Education, 1960.
26. J. WITHALL, "The Development of a Climate Index", Journal of Educational Research, 45, 1951, pp. 93-99.
27. N. A. FLANDERS, Teacher Influence on Pupil Attitudes and Achievement, Final Report, Cooperative Research Program Project No. 397. Minneapolis, Minn.: University of Minnesota, 1960.
28. F. G. CORNELL, C. M. LINDVALL, and J. L. SAUPE, An Exploratory Measurement of Individualities of Schools and Classrooms, Urbana, Ill.: Bureau of Educational Research, University of Illinois, 1952.
29. D. M. MEDLEY and H. E. MITZEL, "A Technique for Measuring Classroom Behavior", Journal of Educational Psychology, 49, 1958, pp. 86-92.
HISTORY OF RESEARCH ON TEACHER BEHAVIOR

by DONALD M. MEDLEY

It is only very recently that research on the effectiveness of teacher behavior has generally used measurement procedures that are as objective as possible for recording teacher behavior. Early investigations were based on statements by students, later ones on those of "experts"; since 1915 ratings by supervisors have commonly been used. Yet none of this research led to a solution of the question of what constitutes a teacher's effectiveness. In the 1930s researchers began to use objective "check-lists"; this attempt, however, was no more successful than the conventional ratings and never found wide acceptance. At about the same time others, working in child research, began to develop and refine objective procedures for recording events in the classroom. It may be assumed that these objective techniques were neglected because the measurements so obtained, despite their reliability (statistical reliability), showed no validity (statistical validity). They did not even give the appearance of being valid. Toward the middle of the century this objection was overcome by the simple technique of grouping several objectively measurable units of behavior; these were specific enough to be observed objectively and combined into composites, which yielded results for the variables concerned similar to those of the observation scales, but measured much more objectively and defined more clearly. It is therefore assumed that this procedural innovation can explain the sudden popularity of classroom observation systems, which also led to this special issue of the Internationale Zeitschrift für Erziehungswissenschaft.
HISTORY OF RESEARCH ON TEACHER BEHAVIOR

by DONALD M. MEDLEY

Research on the effectiveness of teacher behavior has only recently used objective measures to record teacher behavior. The first studies were based on the opinions of students, the following ones on the opinion of "experts", and since 1915 ratings have frequently been used by supervisors. However, none of this research made it possible to discover what makes a teacher effective. During the 1930s "check-lists" began to be used, but this attempt was no more satisfactory than the usual ratings and never became widespread. At about the same time, others engaged in child research began to develop and improve objective procedures for recording classroom processes. It must be supposed that these objective techniques were neglected because the results obtained, although reliable, showed no validity. They did not even have face validity; that is, they did not even appear to be valid. Around 1950 this objection was refuted by the simple procedure of grouping several units of behavior. These units were specific enough to be observed objectively and gathered into groups, which made it possible to obtain results for the variables similar to those produced by rating scales, but measured much more objectively and defined more explicitly. It is therefore supposed that this methodological innovation brought about the sudden popularity of classroom observation systems, a popularity that explains the publication of this special issue of the Revue internationale de Pédagogie.