Why (Some) Americans Believe in the Lie Detector While Others Believe in the Guilty Knowledge Test

David T. Lykken
University of Minnesota
Abstract--The accuracy of polygraphic lie detection in real life applications is very little better than chance. Yet, at least in the United States, many agencies and the polygraphers themselves have great faith in the technique. The reasons why polygraph examiners, and their clients, genuinely believe in the myth of the polygraph are explained and illustrated. A more plausible method of polygraphic interrogation, the Guilty Knowledge Test (GKT), is described and it is shown how the GKT, but not the lie test, might have resolved doubts about the case of Demjanjuk, the alleged "Ivan the Terrible."
AMERICANS HAVE BEEN EXPOSED to what I call "the myth of the lie detector" for more than 70 years, a diet of news stories and other forms of fiction in which guilt is uncovered or innocence revealed by a supposedly "scientific" process of polygraphic interrogation. Connoisseurs will recall Call Northside 777, a 50-year-old movie in which one of the elder statesmen of lie detection, the late Leonarde Keeler, is shown using the polygraph to demonstrate that the wrong man had been sent to prison.

Since World War II, the U.S. Government has been an enthusiastic user of polygraph testing. The U.S. military and all federal police and security agencies rely on the technique both for pre-employment screening and in the interrogation of government workers who are suspected of misconduct. Washington, D.C., reaching as far south as Langley, Virginia, where the CIA has its base, has the highest density of polygraph instruments of any place on earth even though, as cartoonists have suggested, a lie detector in Washington is about as useless as a compass at the North Pole. During the Reagan Administration, even the secretary of state, George Shultz, found it necessary to state publicly that he would resign before submitting to a polygraph examination. The secretary of defense, Cap Weinberger, on the other hand, had no such scruples.

State and local police in the United States are also enthusiastic users of the lie detector. The polygraph's chief value for them derives from the fact that a usefully high proportion of suspects who "fail" polygraph tests can be induced to make a formal confession, persuaded that all hope is lost once the lie detector has condemned them. Although hard to discover, for obvious reasons, there are cases on record of false confessions being obtained in this way from innocent persons. Although not routinely admissible into evidence in U.S. courts,
Address for correspondence: D.T. Lykken, Department of Psychology, Elliott Hall, University of Minnesota, Minneapolis, MN, 55455. Integrative Physiological and Behavioral Science, July-September, 1991, Vol. 26, No. 3, 214-222.
LIE DETECTOR VS GUILTY KNOWLEDGE TEST
polygraph results are, in fact, admitted in criminal trials every year, most often in cases where there is a paucity of conventional evidence. I have been personally involved in three separate cases in which men were convicted of murder, largely on the basis of testimony by a polygraph examiner saying that, in his expert opinion, they had lied in denying their guilt. In all three of these cases, subsequent to their conviction, the defendants were found to be innocent and were released (Cimerman, 1981).

Table 1 shows the questions used in one of these tests that sentenced an innocent man to prison for murder (Lykken, 1981b). Only the "Relevant" questions, referring to the incident under investigation, and the "Control" questions, referring to possible misdeeds in the suspect's prior life, play a role in the outcome. The basic assumption of this "Control Question Test" or CQT is that an innocent person, able to answer the "Relevant" questions truthfully, will be relatively more disturbed by the "Control" questions which, the examiner assumes, the suspect cannot truthfully or confidently deny. In other words, this man went to prison as a convicted murderer because the question: "Did you shoot Fred?" disturbed him more than the question: "Before age 26, did you ever think of hurting someone for revenge?"

Cases like these, in which polygraph results can be clearly refuted by subsequent evidence, are relatively rare and are ignored or explained away by polygraph proponents as due to the incompetence of some examiners. The police in Toronto insist, in a handout provided to criminal suspects (Toronto Police, undated), that "we know of no verified instance of a competent polygraphist reporting a truthful person as untruthful," that is, that the lie test has 100% specificity. The inventor of the Control Question Test, the late John Reid, was more modest; he claimed only 99% accuracy (Reid & Inbau, 1977).
A psychological test capable of near-perfect accuracy would truly be a wonder to contemplate. One can understand why one polygrapher, writing in the journal of the American Polygraph Association, concluded that "God gave us the polygraph!" (Lynch, 1975). There have been a number of attempts to test these remarkable claims experimentally. Many of these validity studies have been scientifically worthless and none could be said to be definitive. Table 2 provides a summary of the findings from four of the better studies. The aggregate validity with innocent suspects is 53%--not significantly better than chance. This bias against innocent suspects--who have nearly a 50:50
TABLE 1. Questions Used in a Lie-detector Test that Sent an Innocent Man to Prison for Murder.

The Control Question Lie Test

 1. Is today Monday? (Irrelevant)
 2. Regarding the shooting of Fred Ery, do you intend to answer truthfully about that? (Relevant)
 3. Do they call you "Buzz"? (Irrelevant)
 4. Before you were 26, did you ever intentionally injure any person with a weapon? (Control)
 5. Did you shoot Fred? (Relevant)
 6. Before you were 26, did you ever think of hurting someone to get revenge? (Control)
 7. Did you shoot Fred Ery on March 28th? (Relevant)
 8. Are you sitting down? (Irrelevant)
 9. Were you in Fred's carry-out on March 28th? (Relevant)
10. Between the ages of 16 and 26, did you ever intentionally injure any person with a weapon? (Control)
LYKKEN
TABLE 2. Findings from Four of the Better Studies of the Accuracy of the CQT in Real Life Interrogation of Criminal Suspects. The Data are Aggregated by Simply Averaging the Results.

The Four Credible Real-life Studies of CQT Accuracy

Study                         Sensitivity      Specificity        Mean of
                              (guilty hits)    (innocent hits)    above
Barland & Raskin (1976)           98%              45%              72%
Horvath (1977)                    79%              50%              65%
Kleinmuntz & Szucko (1984)        76%              64%              70%
Iacono & Patrick (1987)           98%              55%              77%
Totals                            88%              53%              71%

True Positive Rate = P(Deceptive if called "Deceptive") ≈ 65%¹
True Negative Rate = P(Truthful if called "Truthful") ≈ 82%¹

¹Assumes equal proportions of truthful and deceptive suspects.
chance of failing a lie detector test--is to be expected given the implausible assumption of the CQT mentioned earlier.

The aggregate accuracy in detecting deceptive suspects is shown in Table 2 as 88%. This level of sensitivity could be useful in some applications but, alas, it turns out to be an over-estimate. In three of these four studies, the criterion of "ground truth" consisted of confessions obtained from suspects who had "failed" the tests when scored by the original examiners. (The validity estimates come from later blind rescoring of the polygraph charts by different examiners who had no other knowledge of the suspects or of the evidence against them.) This reliance on confessions as a criterion means that the only suspects verified as guilty--that is, all the guilty subjects used in these studies--were persons whose tests had already been scored as "deceptive" by the original examiner. Guilty suspects scored as truthful or inconclusive by the original examiner did not confess and so were not included. It is not surprising, therefore, that most of the subsequent rescorings tended to agree with the scoring of the original examiners.
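The two predictive values quoted beneath Table 2 follow from the aggregate sensitivity and specificity by Bayes' rule once a base rate of deception is fixed. The Python sketch below shows that arithmetic under the table's stated assumption of equal proportions of truthful and deceptive suspects; the function name and its `base_rate` parameter are illustrative choices, not something taken from the studies.

```python
def predictive_values(sensitivity, specificity, base_rate=0.5):
    """Return (P(deceptive | called deceptive), P(truthful | called truthful)).

    base_rate is the assumed proportion of deceptive suspects; 0.5 mirrors
    the equal-proportions assumption in the footnote to Table 2.
    """
    true_pos = sensitivity * base_rate                  # deceptive, called deceptive
    false_pos = (1.0 - specificity) * (1.0 - base_rate) # truthful, called deceptive
    true_neg = specificity * (1.0 - base_rate)          # truthful, called truthful
    false_neg = (1.0 - sensitivity) * base_rate         # deceptive, called truthful
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Aggregate CQT figures from Table 2: 88% sensitivity, 53% specificity.
ppv, npv = predictive_values(0.88, 0.53)
print(f"P(Deceptive if called 'Deceptive') = {ppv:.0%}")  # 65%
print(f"P(Truthful if called 'Truthful')  = {npv:.0%}")   # 82%
```

Because the 88% sensitivity is itself inflated by the confession criterion, these predictive values are best read as an upper bound. Lowering `base_rate` shows how quickly the value of a "Deceptive" call erodes when most of the people tested are truthful.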
The Control Question Lie Test

As I have already suggested, polygraph examiners themselves have almost total faith in the accuracy of their technique. This genuine conviction on the part of the "experts" is important because it helps to account for the similar confidence shown by the polygrapher's clients, by the government agencies, police departments, and private businesses that rely on the polygraph in the United States. And yet the best research available shows that the lie test is deeply flawed, its accuracy in identifying innocent suspects about equal to the flip of a coin. This fact--that examiners who have personally administered thousands of these inaccurate tests can continue to believe that they are almost never wrong--therefore constitutes a paradox. In his work with the Canadian federal police, my colleague, Bill Iacono (Iacono & Patrick, 1988), traced the solution of this paradox to the polygraph-induced confession.
The Insidious Role of Confessions

It is standard practice for polygraphers to interrogate a suspect who has failed the lie test. They point out that the impartial, scientific polygraph has demonstrated his guilt, that no one now will believe his denials, and that his most sensible action at this point would be to confess and try to negotiate the best terms that he can. What the examiner says to the suspect is especially convincing and effective because the examiner genuinely believes it himself. Police experience in the U.S. suggests that as many as 50% of interrogated suspects do actually confess in this situation.

These confessions provide virtually the only feedback of "ground truth" or criterion data ever available to a polygraph examiner. Suspects who pass the polygraph are not interrogated because the examiner firmly believes that they are truthful. Suspects who are not interrogated do not confess, of course, so a confession always verifies the test that preceded it. This means that the examiner will seldom discover that a suspect he diagnosed as truthful was in fact deceptive because that bad news is almost always excluded by his dependence on confessions for verification. However, the examiner does periodically obtain these confession verifications of his diagnoses of deception, a steady diet of good news that confirms his belief that his procedure is nearly infallible. Note that the examiner's client or employer also hears about these same confessions and is also protected from learning about most of the polygrapher's mistakes.

Sometimes a confession can also verify a lie test that results in a diagnosis of "truthful". This can happen when there is more than one suspect in the same crime. Then the confession of one suspect reveals that the alternative suspect is innocent. Once again, however, the examiner is substantially protected from hearing anything but good news.
If the suspect who was tested first is diagnosed as "deceptive," then the alternative suspect is almost never tested at all because the examiner believes that he has solved the case. It is therefore rare that one suspect will confess and prove that someone who has already failed his test is actually innocent. If a confession clears a suspect previously tested, then the suspect tested first will almost certainly have passed his test--otherwise the person who confessed would not have been tested. Thus, when a confession allows us to evaluate the accuracy of the test given to a person cleared by that confession, once again the news will almost always be good news; that innocent suspect will be found to have passed his lie test.
A Real Life Illustration

Here is an example of how this process works in real life. In a recent issue of Polygraph, the trade journal of lie detection, a police examiner named Murray (1989) reports on a series of 552 lie tests that he administered to criminal suspects. He diagnosed 239 or about 43% of the total group to be deceptive. Murray proceeded to interrogate these 239 suspects and obtained confessions from 105, or nearly half of them. Murray assumed that these verified-deceptive lie tests were representative of all 239 failed tests, but that is a mistake. All of the 105 lie tests that produced these 105 confessions were of course necessarily verified as accurate.

These confessions also cleared 18 of the 313 suspects who had previously been tested and classified as truthful. Once again, Murray assumed that these 18 verified tests were representative of all 313 passed lie tests but, once again, this was an invalid inference. As we have seen, once a prior suspect has failed his lie test, alternative suspects in the same case are
seldom tested at all; therefore, these 18 successes were also the almost inevitable consequences of reliance on confessions as the only criterion. Mr. Murray reports three cases in which he did discover that he had made a false-positive error. These were cases in which there were two possible suspects; the first person tested had failed his test but Murray was suspicious of the result and broke his own rule by going on to test the other suspect. In each case, the second person tested also failed, was interrogated, and confessed. In one case, for example, the first suspect was a habitual criminal whom Murray had happened to test on two previous occasions for other crimes. Both prior times, this man had failed, been interrogated, and confessed. On this third occasion, he again failed the lie test but this time continued to maintain his innocence. For this reason, Murray tested the other suspect, obtained a confession, and discovered that the habitual criminal was innocent of this third crime and that his third lie test had been in error. Murray concluded, however, that he had made only three errors in 552 lie tests, an accuracy of 99.4%. The important point is that Murray would have obtained much the same apparent confirmation if, instead of the polygraph, he had relied for his lie test on just the flip of a coin! About half of the coins would have come up "heads," indicating deception. About half of these purportedly deceptive subjects would have been guilty and many of the guilty ones, perhaps 105 of them, would have confessed following interrogation. Murray would be protected from learning, however, that most of the 134 persons who also came up heads but refused to confess were in fact innocent. Once again, a few of these confessions, perhaps 18 of them, would have cleared some (viz., 18) of the 313 suspects whose tests came up "tails" for truthful. 
Murray would never learn, however, that about half of the 313 people who passed his coin-flip lie test were in fact guilty.

Polygraphers are not scientists or statisticians. When 20 to 25% of their tests are verified by confession and they stumble upon errors as seldom as three times in 552, who can blame them for thinking they are nearly infallible? They are victims of their own deceptive art.
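The coin-flip thought experiment can be run as a small simulation. The Python sketch below is a simplified illustration, not a reconstruction of Murray's data: it assumes half the suspects are guilty, that only suspects who "fail" are interrogated, that innocent suspects never confess, and that a guilty suspect who is interrogated confesses with probability 0.8. The function name and all parameter values are hypothetical, and the clearing of alternative suspects is left out.

```python
import random


def simulate_coin_flip_examiner(n_suspects=552, p_guilty=0.5,
                                p_confess=0.8, seed=1):
    """Score each suspect by coin flip and 'verify' only through confessions."""
    rng = random.Random(seed)
    n_failed = 0
    n_correct = 0
    verified = []  # (called_deceptive, guilty) for cases settled by a confession
    for _ in range(n_suspects):
        guilty = rng.random() < p_guilty
        called_deceptive = rng.random() < 0.5  # the "test" is a coin flip
        n_correct += int(called_deceptive == guilty)
        if not called_deceptive:
            continue  # passed the test: never interrogated, so never confesses
        n_failed += 1
        # Assumption: only guilty suspects confess under interrogation.
        if guilty and rng.random() < p_confess:
            verified.append((called_deceptive, guilty))
    n_verified_correct = sum(1 for called, g in verified if called == g)
    return {
        "failed": n_failed,
        "confessions": len(verified),
        "verified_accuracy": (n_verified_correct / len(verified)
                              if verified else float("nan")),
        "true_accuracy": n_correct / n_suspects,
    }


result = simulate_coin_flip_examiner()
print(result)
```

Every confession-verified case agrees with the coin, so the examiner's apparent accuracy is perfect, while the accuracy over all suspects stays near one half. The gap comes entirely from which cases are allowed to supply feedback.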
The Guilty Knowledge Test

There is a fundamentally different method of polygraphic interrogation that does not attempt to detect lying but, rather, to detect whether the suspect possesses guilty knowledge (Lykken, 1959; 1981; 1988). In some criminal cases, it is possible for the investigators to determine case facts that are almost certain to be known to the actual perpetrator but which would not be known to an innocent person. These facts can be presented to a suspect in the form of multiple-choice questions and the polygraph can be used to determine whether that suspect reacts differently to the case-relevant alternative than he does to the incorrect alternatives that are associated with the same question. For example:

    The bank robber wore a cloth cap. If you were the robber, you will remember what color that cap was. Was it: White? Red? Black? Green? Blue? Tan?

Rather than using the conventional field polygraph with this test, one can employ a single psychophysiological variable that is likely to accompany the cognitive event of recognizing the guilty knowledge alternative, a variable like the electrodermal response, the evoked pupillary response, or an electrocortical response variable like the P300. There have been a number of studies of the sensitivity and specificity of the Guilty
Knowledge Test as summarized in Table 3. The accuracy of the GKT increases with the number of test items and with the number of alternatives per item; an optimum test would include perhaps 10 questions with five scorable alternatives for each question. Moreover, one can actually predict what accuracy a given GKT should have. The eight studies aggregated in Table 3 used relatively short GKTs, averaging six questions and only four alternatives per question. The expected sensitivity of a six-item, four-alternative GKT is 89.1% correct detection of guilty suspects, compared to the 88.2% of the 161 guilty suspects in these eight studies who actually failed the GKT. The expected specificity of a six-item GKT is 96.2%, which can be compared to the 96.7% of innocent suspects, aggregated over the eight studies, who were classified as innocent by the GKT.
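The claim that GKT accuracy can be predicted in advance rests on simple probability: an innocent suspect has no reason to respond most strongly to the correct alternative, so each item is a chance event. The Python sketch below works this out for specificity only, under assumptions that are not stated in this passage: each item is scored as a plain hit or miss, an innocent suspect hits with probability 1/4 on a four-alternative item, and a suspect is classified guilty on 4 or more hits out of 6. The cutoff and the function names are hypothetical.

```python
from math import comb


def binom_pmf(k, n, p):
    """Probability of exactly k hits in n independent items."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def prob_at_least(k_min, n, p):
    """Probability of k_min or more hits in n independent items."""
    return sum(binom_pmf(k, n, p) for k in range(k_min, n + 1))


n_items = 6          # six questions, as in the studies behind Table 3
n_alternatives = 4   # four scored alternatives per question
cutoff = 4           # assumed: 4 or more hits counts as "Guilty"

p_chance_hit = 1 / n_alternatives
false_positive_rate = prob_at_least(cutoff, n_items, p_chance_hit)
specificity = 1 - false_positive_rate
print(f"Expected specificity: {specificity:.1%}")  # 96.2%
```

Sensitivity cannot be derived the same way without an estimate of how often a guilty suspect responds to the correct alternative on a single item, so it is left out here. Raising `n_items` or `n_alternatives` in this sketch shows the false-positive rate shrinking, which is the sense in which longer tests with more alternatives are more accurate.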
The Demjanjuk Case

Many people around the world have heard of the case of the Ukrainian, John Demjanjuk, who came to the United States after World War II, married, raised a family, and then, just a few years ago, was accused by the Israeli Government of having been the chief guard at the Treblinka concentration camp, a sadistic individual known as "Ivan the Terrible." Protesting that he had never set foot in Treblinka, Demjanjuk was nonetheless extradited to Israel for trial as a war criminal. After a lengthy proceeding, during which he continued to maintain his innocence, Demjanjuk was convicted in April of 1988 and sentenced to death. He is appealing that sentence and a considerable body of opinion in Israel remains unpersuaded that Demjanjuk is actually the notorious "Ivan."

Partly because of American influence, Israel and Japan are among the few countries outside of North America in which polygraphic interrogation is employed. Had Demjanjuk been given a lie detector test, the "Relevant" and "Control" questions might have been like those shown in Table 4. "Were you Ivan the Terrible?" would be expected to elicit a strong
TABLE 3. Aggregated Results of Eight Analog Studies of the Accuracy of the Guilty Knowledge Test (GKT).

                            GKT diagnosis
Status of subject       "Innocent"    "Guilty"      N      Accuracy
Guilty:                     19          142        161     88.2% (sensitivity)
Innocent:                  147            5        152     96.7% (specificity)
Totals:                    166          147        313     93%

True Positive Rate = P(Guilty if classified "Guilty") = 96.4%¹
True Negative Rate = P(Innocent if "Innocent") = 89.1%¹

¹Assumes equal proportions of guilty and innocent suspects.

Note: The studies summarized in this table are: Balloun & Holmes, 1979; Bradley & Warfield, 1984; Davidson, 1968; Giesen & Rollison, 1980; Iacono, Boisvenu, & Fleming, 1984; Lykken, 1959; Podlesny & Raskin, 1978; and Stern, Breen, Watanabe, & Perry, 1981.
TABLE 4. Questions like those that would have been used in a Control Question polygraph lie test had one been administered to John Demjanjuk.

Relevant questions
1. Were you a camp guard at Treblinka in 1944?
2. Were you the chief guard called "Ivan the Terrible"?

Control questions
1. Since 1946, have you ever felt dislike for any Jewish person?
2. Have you ever told a lie to get out of trouble?
response if Demjanjuk is truly guilty, stronger than a question like: "Have you ever told a lie to get out of trouble?" But, if Demjanjuk has been misidentified by those Treblinka survivors of 45 years ago, then the assumption of the CQT technique is that he would instead be more disturbed by the "Control" questions--although their truth or falsity cannot harm him--and less disturbed by the "Relevant" questions, although he knows that, innocent or guilty, these questions put him in terrible jeopardy. I find it hard to understand how a rational person, not himself a polygrapher, could put any faith in a test predicated on such unreasonable assumptions.

It seems unfortunate, however, that the Israeli authorities did not arrange for Demjanjuk to be given a Guilty Knowledge Test. The numerous Treblinka survivors could have provided enough pieces of information, facts certain to be known to the real Ivan, so that one or more 10-item tests could be constructed. The case-relevant facts could be prosaic details about the geographical layout of the camp, the names of other guards, the standard routine of the camp, and so on; anything Ivan would know but which someone who had not visited Treblinka would not know. The test items (see Table 5) could have been tried out on persons unfamiliar with Treblinka to make sure that the incorrect alternatives used would appear to an innocent person to be as
TABLE 5. Example of Guilty Knowledge Test Questions that Might have been Used in the Investigation of John Demjanjuk.

Sample GKT Items

1. If you were a guard at Treblinka, you would have known the commandant's given name. What was the camp commander's first name? Was it:
   (1) Siegmund? (2) Hans? (3) Fritz? (4) Max? (5) Otto? (6) Franz?

2. When Ivan went to breakfast each morning, he walked past one of the prison barracks on his right. Which barracks did Ivan pass each morning on his way to breakfast? Was it:
   (1) Block A? (2) Block B? (3) Block C? (4) Block D? (5) Block E? (6) Block F?

3. The chief of the women guards at Treblinka was Frau Schmidt. What was Frau Schmidt's given name? Was it:
   (1) Karen? (2) Frieda? (3) Ella? (4) Olga? (5) Margot? (6) Gretchen?
TABLE 6. The Probability of Each Possible Outcome of a 10-Item Guilty Knowledge Test with the Odds, Given that Outcome, that the Subject is Innocent (i.e., has No Guilty Knowledge) versus Guilty (i.e., Recognized the Correct Answers to the GKT Questions).

Guilty Knowledge Test Results: Odds of Innocence vs. Guilt For Each Possible Outcome

N of     Innocent    Guilty      Prob.       Prob.       Odds
Hits     P(N)        P(N)        Innocent    Guilty      Innocent:Guilty
  0      .107374     .000000     .999999     .000001     1,073,742:1
  1      .268436     .000004     .99998      .00002      65,472:1
  2      .301990     .000074     .9998       .0002       4,098:1
  3      .201327     .000786     .996        .004        256:1
  4      .088080     .005505     .94         .06         16:1
  5      .026424     .026424     .5          .5          1:1
  6      .005505     .088080     .06         .94         1:16
  7      .000786     .201327     .004        .996        1:256
  8      .000074     .301990     .0002       .9998       1:4,098
  9      .000004     .268436     .00002      .99998      1:65,472
 10      .000000     .107374     .000001     .999999     1:1,073,742

Assumes: (1) a 10-item GKT with 5 scored alternatives; and (2) p = .80 that Guilty subject will "hit" on any given item. If we agree to call N = 5 "Inconclusive", then the Sensitivity of this GKT as well as its Specificity will equal 99.9%.
plausible as the correct alternatives. The items could have been pretested also on Treblinka survivors to make sure that the correct alternatives would be identified as correct by anyone with knowledge based on having lived or worked in that camp. Finally, the tests could have been administered to Demjanjuk himself.

The estimated sensitivity and specificity of a 10-item, 5-alternative GKT are equal and better than 99.9%. About 3% of both guilty and innocent subjects will "hit" on exactly 5 items and have to be classified as Inconclusive. If Demjanjuk "hit" on more than 5 items, Israeli liberals could rest assured that the chances of Demjanjuk's being innocent were vanishingly small. On the other hand, if he gave his strongest response to, say, only one of the correct alternatives--a score that is about 65,000 times more likely among innocent than guilty persons--then the Israeli hard-liners might have been willing to accept that Demjanjuk could not have been the Terrible Ivan after all (see Table 6).
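The outcome probabilities and odds in Table 6 can be recomputed from the two assumptions stated in its footnote. The Python sketch below treats the number of hits as binomial, with a per-item hit probability of 1/5 for an innocent subject (chance among five scored alternatives) and .80 for a guilty one. The "Prob. Innocent" column is computed here with equal prior odds of guilt and innocence, which is an inference from the table's symmetry rather than something the footnote states; the variable names are illustrative.

```python
from math import comb

N_ITEMS = 10
P_INNOCENT_HIT = 1 / 5   # chance level with 5 scored alternatives
P_GUILTY_HIT = 0.80      # stated in the footnote to Table 6


def binom_pmf(k, n, p):
    """Probability of exactly k hits in n independent items."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


rows = []
for hits in range(N_ITEMS + 1):
    p_innocent = binom_pmf(hits, N_ITEMS, P_INNOCENT_HIT)
    p_guilty = binom_pmf(hits, N_ITEMS, P_GUILTY_HIT)
    # Posterior probability of innocence, assuming equal prior odds.
    prob_innocent = p_innocent / (p_innocent + p_guilty)
    odds = p_innocent / p_guilty  # likelihood ratio, innocent : guilty
    rows.append((hits, p_innocent, p_guilty, prob_innocent, odds))
    print(f"{hits:2d}  {p_innocent:.6f}  {p_guilty:.6f}  "
          f"{prob_innocent:.6f}  {odds:,.6g}:1")
```

Each extra hit multiplies the odds in favor of guilt by 16, because a hit is 4 times more likely for a guilty subject and a miss is 4 times more likely for an innocent one. Exact ratios computed this way differ slightly from odds formed by dividing the rounded six-decimal probabilities.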
References

Balloun, K.S. & Holmes, D.S. (1979). Effects of repeated examinations on the ability to detect guilt with a polygraphic examination: A laboratory experiment with a real crime. Journal of Applied Psychology, 64, 316-322.
Barland, G. & Raskin, D. (1976). Validity and reliability of polygraph examinations of criminal suspects. (Report 76-1, Contract 75 NI-99-0001). Washington, DC: U.S. Department of Justice.
Bradley, M.T. & Warfield, J.F. (1984). Innocence, information, and the guilty knowledge test in the detection of deception. Psychophysiology, 21, 683-689.
Cimerman, A. (1981). "They'll let me go tomorrow": The Fay case. Criminal Defense, 8, 7-10.
Davidson, P.O. (1968). Validity of the guilty-knowledge technique: The effects of motivation. Journal of Applied Psychology, 52, 62-65.
Giesen, M. & Rollison, M.A. (1980). Guilty knowledge versus innocent associations: Effects of trait anxiety and stimulus context on skin conductance. Journal of Research in Personality, 14, 1-11.
Horvath, F. (1977). The effect of selected variables on interpretation of polygraph records. Journal of Applied Psychology, 62, 127-136.
Iacono, W.G., Boisvenu, G.A., & Fleming, J.A. (1984). Effects of diazepam and methylphenidate on the electrodermal detection of guilty knowledge. Journal of Applied Psychology, 69, 289-299.
Iacono, W.G. & Patrick, C.J. (1987). What psychologists should know about lie detection. In A.K. Hess & I.B. Weiner (Eds.), Handbook of forensic psychology. New York: John Wiley.
Iacono, W.G. & Patrick, C.J. (1988). Polygraph techniques. In R. Rogers (Ed.), Clinical assessment of malingering and deception. New York: Guilford Press.
Kleinmuntz, B. & Szucko, J.J. (1984). A field study of the fallibility of polygraphic lie detection. Nature, 308, 449-450.
Lykken, D.T. (1959). The GSR in the detection of guilt. Journal of Applied Psychology, 43, 385-388.
Lykken, D.T. (1981a). A tremor in the blood: Uses and abuses of the lie detector. New York: McGraw-Hill.
Lykken, D.T. (1981b). The law and the lie detector. Criminal Defense, 8, 19-27.
Lykken, D.T. (1988). The case against polygraphy. In A. Gale (Ed.), The polygraph test: Lies, truth, and science. London: Sage.
Lynch, M.B. (1975). The American polygraph as the party affirming legal and social justice. Polygraph, 4, 154-164.
Murray, K.E. (1989). Movement recording chairs: A necessity? Polygraph, 18, 15-23.
Podlesny, J.A. & Raskin, D. (1978). Effectiveness of techniques and physiological measures in the detection of deception. Psychophysiology, 15, 344-359.
Reid, J.E. & Inbau, F.E. (1977). Truth and deception: The polygraph ("lie detector") technique, 2nd Ed. Baltimore: Williams & Wilkins.
Stern, R.M., Breen, J.P., Watanabe, T. & Perry, B.S. (1981). Effect of feedback of physiological information on responses to innocent associations and guilty knowledge. Journal of Applied Psychology, 66, 677-681.
Toronto Police (undated). Your Rights When Asked to Take a Polygraph Examination. Handout used by the Metropolitan Toronto Police Department during the 1980s.