Four obstacles to prediction


excel. It is the converse which has not been shown, in my opinion. Mr. Dowling went into an excellent discussion of necessary and sufficient conditions to explain why the SATs might not adequately predict grades in the first year of college, and I think his explanation is flawless. However, what he has demonstrated is that intelligence is a necessary but not a sufficient quality for success; he has not demonstrated that low grades on the SATs necessarily imply lower intelligence or ability, since as I have just argued a low grade might be due to insufficient intelligence on the part of the test makers or insufficient quickness of thought on the test taker's part or simple test anxiety, which is magnified by timed tests (especially ones which are designed not to let any but the fastest finish at all).

In short, I believe that the idea of the SATs is an excellent one, but I question whether these particular exams achieve what they are trying to achieve.

John C. Wenger
Department of Mathematics
Harold Washington College
City Colleges of Chicago

Four Obstacles to Prediction

To the editor:

Professor Dowling's magnificent analysis of verbal items from the College Board Scholastic Assessment Test I is a nearly unique contribution to this troubled, misunderstood subject. He is on solid ground statistically, also, because nowadays the three-hour SAT predicts college-freshman grades better than the grade-inflated four-year academic high-school record does. That is a remarkable achievement, even more so when one considers the following four predictive obstacles that the SAT encounters.

Academic Questions / Fall 2000

The first of these is that, in selective colleges, most of the predictive value of SAT-V and SAT-M scores is used up in the selection of students. Admissions committees try very hard to accept only those students they believe will persist to graduation in their institution. To the extent they succeed in this selection, the correlation of SAT scores with the GPAs of freshmen who complete the first one or two semesters is considerably diminished. A perhaps more important criterion is persistence to graduation. I and others have found the SAT a better predictor of that than of GPAs.

Second, selective colleges tend to reject non-varsity-athlete, non-"legacy," non-affirmative-action students who are low on both SAT scores and high-school GPAs (or class rank). They accept many or most of the high-highs, few if any of the low-lows, and some high-lows and low-highs. Thus, an accepted applicant who scored low on SATs is quite likely to have presented a stellar high-school academic record. An accepted applicant who had a mediocre high-school academic record is quite likely to have very high SAT scores. This lowers prediction of college GPAs, because the predictive value of low SAT scores or of low high-school GPAs becomes contingent on the level of the other predictor. Such problems do not plague open-door colleges. Prediction there also tends to be better because of the far greater range of talent admitted and the wider range of grades given.

Third, in all colleges it is difficult to predict GPAs across students taking various courses, because grading standards are by no means uniform from field to field.
Once I studied the relationship of SAT-V and SAT-M scores to cumulative GPAs of persons graduating from Johns Hopkins who had majored in chemistry. Within this extremely homogeneous group, SAT scores predicted GPAs better than they did for our freshmen.

Fourth, as alluded to earlier, grade inflation has lowered that prediction. Quite a few selective colleges award many A's and B's and few C's, D's, and F's, thereby restricting the range of grades. Consider an extreme example: if only A's were given, there could be no differential inter-individual prediction, even though "prediction" for the group would, without any predictor, be perfect; we'd know ahead of the grading that all students would earn only A's.

Given these four impediments, it seems to me striking that SAT scores are fairly effective predictors. Professor Dowling showed us why!

Julian C. Stanley, Professor Emeritus of Psychology
Johns Hopkins University
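The first two obstacles in the letter above (selection using up predictive value, and the high-low / low-high admission pattern) can be illustrated with a small simulation. What follows is only a sketch under invented assumptions, not Professor Stanley's data or method: each applicant gets a hypothetical standard-normal "ability," every measure is that ability plus independent noise of equal size, and the "selective college" admits anyone whose SAT plus high-school GPA exceeds an arbitrary cutoff of 2.0. The function names `pearson` and `simulate` are made up for this example.

```python
import random
import statistics


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def simulate(n=20000, cutoff=2.0, seed=0):
    """Compare SAT/GPA correlations with and without selective admission.

    All distributions and the cutoff are illustrative assumptions.
    """
    rng = random.Random(seed)
    pool = []
    for _ in range(n):
        ability = rng.gauss(0, 1)               # hypothetical latent trait
        sat = ability + rng.gauss(0, 1)         # SAT score (standardized units)
        hs_gpa = ability + rng.gauss(0, 1)      # high-school record
        college_gpa = ability + rng.gauss(0, 1) # freshman grades
        pool.append((sat, hs_gpa, college_gpa))

    # Selective college: a high value on one predictor can offset a low
    # value on the other, so low-lows are rejected and high-lows get in.
    admitted = [a for a in pool if a[0] + a[1] > cutoff]

    def column(rows, i):
        return [row[i] for row in rows]

    return {
        "n_admitted": len(admitted),
        # Open-door college: everyone in the pool enrolls.
        "open_door_sat_vs_college": pearson(column(pool, 0), column(pool, 2)),
        "selective_sat_vs_college": pearson(column(admitted, 0), column(admitted, 2)),
        "selective_sat_vs_hs": pearson(column(admitted, 0), column(admitted, 1)),
    }


if __name__ == "__main__":
    for name, value in simulate().items():
        print(name, round(value, 3) if isinstance(value, float) else value)
```

Under these assumptions the SAT-to-college-GPA correlation among admitted students comes out well below the open-door figure, and SAT and high-school GPA become negatively correlated among the admitted, which is the pattern the second obstacle describes. Different noise levels or cutoffs would change the sizes of these effects.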

William C. Dowling responds:

Dorothy Pugh's point that GPA can depend on "personality considerations" in small classes is well taken. I'm reminded of the case a few years ago where the valedictorian of a small-town New Hampshire high school turned out, upon arriving at college, to be actually illiterate, unable even to read a newspaper. There's a certain weirdness to this case--the facts came out because she sued the high school for having fraudulently awarded her a diploma--but one thing that did emerge is that she had "succeeded" by obvious ploys: smiling brightly and asking interested questions, nodding emphatically when the teacher made a point, never missing class, etc. This is an aspect of grade inflation that is too little discussed.

Milton Ezrati read "Enemies of Promise" a bit too hastily, I think. The SAT question to which he objects--the one about how Susan feels "intensely ____" after her hot afternoon working in the garden and goes to the refrigerator to get "a cold ____"--isn't, as it happens, an SAT question. It was an example I made up to give readers a preliminary idea of how the SAT sentence-completion format works, by specifying a semantic field. Nor are thirsty or drink the "answers," even in that example. I was careful to say that "an answer like" thirsty would be needed in the first instance, "an answer like" drink in the second. Mr. Ezrati's other objections to my analysis seem, while well-intended, similarly based on a less-than-careful reading of the essay.

John Wenger raises some difficult substantive points. One is whether the SATV merely tests convergent thinking--"what the test-makers wanted," in the usual language of its critics--thereby penalizing "imaginative" or "innovative" test-takers. Another is whether the time limit imposed on test-takers might give invalid results when a bright student happens to be a particularly meditative sort.

As it happens, the first objection authorizes a strategy often used by critics (David Owen, for example) in trying to debunk the SATV. They'll take a question from an actual SAT test, give you the answer that the SAT counts as "correct," then think up scenarios in which one or more of the "incorrect" answers are supposed to be equally plausible. ("Of course snow looks right at first glance.
But to a child kidnapped from his alcoholic grandparents at age 6 and put on a space ship traveling at the speed of light towards Alpha Centurion, either bourbon or nitrogen might seem better choices.") The reason that this sort of thing doesn't stand up very well against the way the SATV actually tests reading comprehension is that the laws of syntax and semantics are pretty inexorable at the level of primary or first-order meaning, which is all that the SAT is concerned with. If Mr. Wenger will go back and look at my discussion of the SATV sentence-completion item that specifies dual as the correct answer, he'll see that I do list reasons why the "incorrect" answers might seem plausible to some test takers. But he'll also see that the syntax of the sentence being tested unambiguously specifies dual as the answer. If you opt for any of the other answers, inventing reasons why they're just as plausible in this context, you'll find that one construction in the test-sentence ("not only . . . but as") is left hanging. Insert dual into the blank and the syntactic problem vanishes.

Beyond that, the appeal to "imaginative" thinking that sustains this particular objection seems to me not to apply at the very rudimentary level of linguistic comprehension measured by the SATV. It's true enough that geniuses like Borges or Nabokov see things that the rest of us don't see--that's why they are imaginative geniuses and the rest of us not--but they do so on a level stratospherically above the level at which the SATV measures verbal comprehension. The much-rehearsed claim that the SATV is somehow penalizing large numbers of such geniuses seems to me silly, romantic, and false. Moreover, I've discovered again and again that the students who have the easiest time with works of extraordinary imaginative complexity--Nabokov's Pale Fire, say--are almost always those who have the highest SATV scores.

The problem about time limits--whether or not the SATV penalizes bright students who don't work well under time constraints--is one over which I once agonized a good bit, but which I've more recently come to think is without much substance.
The reason lies in the relation between SATV scores and reading comprehension discussed in "Enemies of Promise." Imagine, for instance, a student who got only 450 on the SATV but who, given 20 hours to take the test rather than the 90 minutes allowed by ETS, might conceivably have scored 750. The question is whether the difficulties that produced the original 450 don't also say something about the student's prospects in college. Doing well at even a moderately demanding institution, after all, means having the ability to handle a lot of work within relatively tight time constraints. A freshman who needed five hours to read the first four pages of Walden, even if he or she could demonstrate perfect comprehension at the end of that time, would be in obvious trouble.

Nonetheless, I'd be all in favor of a study that set out to find out whether any significant number of verbally gifted students are being unfairly penalized by the SAT's time constraints. It would be simple enough to design.

Professor Stanley's good analysis of the SAT/GPA relation raises a point about grade inflation that ought to be more widely publicized. A lot of people have trouble grasping the point that very high SAT scores, such as you find in the Ivies and a few other selective colleges, by definition make nonsense of the SAT as a grade predictor. If Harvard were for some reason to accept a freshman class where everyone had an 800 SATV and 800 SATM, for instance, holding onto the idea of the SAT as a "predictor" would imply that everyone in the class should get all A's. But of course this is nonsense. (An analogy: fifty runners qualify for the Olympic marathon trial with identical world-class times of 2:12. So each of the fifty is now guaranteed to be the winner of the race?)
The point is worth mentioning because many faculty at selective institutions are deeply confused about it, as shown by the justification they often give for awarding indiscriminately high grades to their students. I remember talking to an old friend at Williams shortly after the Chronicle of Higher Education had run a piece showing that grade inflation had virtually turned GPA at Williams into a joke. Don't you, I asked, have the normal range of performance in your classes? A few students who write absolutely brilliant papers, some who are merely good, a large number of others who are just sort of, well, average for that particular class? When you and I were in college, I asked, didn't the symbols of those distinctions look like this: A, B, C+, C . . . ?

"You don't understand," my friend answered. "These students have the highest SATs in the country." A few weeks later, one of my colleagues at Rutgers used the same story to justify the uniformly high grades--virtually all A's, with a tiny scattering of B+'s--given in a large English course. "It's not me," she said when I inquired about the grades I had seen posted on her office door. "Everybody does it. At Williams. At Harvard. At Yale. I read about it in the Chronicle of Higher Education."

The Center for Education Reform's e-newsletter picked up this excerpt of a conversation with Monty Neill, executive director of FairTest, an organization opposed to standardized testing. The conversation had appeared on 31 May 2000 on the listserve of the Assessment Reform Network.

Question: "Do we defer on winning on tests until we break the corporate strange hold, or do you think this is all or nothing, we can't stop the testing and make real progress toward good education for all without stopping capitalism."

Monty Neill, of FairTest--"No, of course this isn't an all-or-nothing proposition. It may well be possible to stop high stakes testing and corporate-led reform, though the fact that the same corporate forces are imposing education reforms around the US and around the world suggests that they have a lot at stake . . . And there are limits on the depth of the transformation of the schools we can achieve while the corporations are the dominant force in society. The elite have a fundamental stake in maintaining social inequality, and the educational policies and practices we would most want to change are designed to justify social inequality by giving it an aura of 'meritocracy.'"
