Insurance Fraud: What makes a case look suspicious?
By Michael Theil, Vienna¹

Contents
1. Introduction
2. Study
3. Results
4. Discussion, conclusions and limitations
1. Introduction

Insurance fraud has received increasing attention from researchers in recent years (e.g. Derrig 2002 or Viaene / Dedene 2004 for an overview). Much effort has been put into establishing detection systems (for instance Brockett / Xia / Derrig 1998, Belhadji / Dionne / Tarkhani 2000, Tennyson / Salsas-Forn 2002, Major / Riedinger 2002, Artis / Ayuso / Guillén 2002, Brockett et al. 2002, Viaene et al. 2002). Quite naturally, such models have in common that they rely on ex post information. In other words, they analyse established (and in this sense unsuccessful) attempts at insurance fraud and extract characteristic variables for use in future control systems. Such analytical tools are certainly useful for some kinds of “standard” insurance fraud; however, they systematically overlook “non-standard” forms. Given that many fraudulent claims are said to remain undetected, and that insurance companies are interested in getting suspicious claimants to withdraw their claims, serious problems may arise from trusting court evidence only.

In the following, we choose a different approach. The basic idea is that everyone has a particular picture of the characteristics of insurance fraud, which this research attempts to portray. Thinking aloud is a widespread method to elicit judgments that cannot be observed otherwise. In applying this method, each of our subjects is provided with two cases of possible insurance fraud. Their task is then to consider the features of each case that do or do not indicate insurance fraud. These sessions are documented in verbal protocols, which are further analysed by means of content analysis.

The first question we wish to address is whether these pictures of insurance fraud differ from the characteristics derived from unsuccessful fraud attempts in earlier studies. Furthermore, we may expect an asymmetry in knowledge between laypersons and fraud control experts, in that experts have experience with many (potentially) fraudulent cases and with procedures to detect them (which fraudsters tend not to have); we therefore compare expert and non-expert judgments. Finally, because we control the cases presented to subjects, we may gain insight into the manner in which fraud-related information is processed.

The remainder of this paper is organised as follows: The next section introduces the study, its methods and procedures. The third part presents the outcomes, ranging from formal results and an examination of subjects' reasoning during the interviews to a comparison with previous studies. The final part of the paper is devoted to discussion and conclusions. The cases that formed the basis of this study are described in the appendix.

¹ ao. Univ.-Prof. Mag. Dr. Michael Theil, Institut für Versicherungswirtschaft, Wirtschaftsuniversität Wien.
2. Study

2.1 Material
Our aim was to use material that is as close to reality as possible. In co-operation with the Insurance Fraud Control Office, we had access to files documenting claims suspected of fraud. In general, these files contain a report by the person who suffered the loss. In many cases, additional material is available, such as memos, pictures and sketches. Such material is usually prepared by the insurance company, by the police or by experts involved in the process. For obvious reasons, any data which might allow identification of the persons involved have been omitted. Since much of the information in the files deals with exactly this matter of identification, the remainder, describing the situation in which the damage occurred, is often quite brief. The language in the files often appears laboured. Since the material presented to the subjects of our study is basically an excerpt of the original files, this characteristic is maintained. From the set of eight available files we selected five cases after closer examination and discussion. For the sake of comparability with earlier studies, which are generally limited to property damage in the automobile sector (e.g. Artis / Ayuso / Guillén 1999, 2002, Belhadji / Dionne / Tarkhani 2000, Edelbacher 1995 and Tennyson / Salsas-Forn 2002), we concentrated on cases that fit into this scheme. The cases selected for our study are reproduced in the appendix, retaining the original file numbers.
2.2 Subjects
Subjects consisted of 15 experts in insurance fraud control and 27 students. The experts were chosen from a list of consultants to the Insurance Fraud Control Office. The students were recruited and interviewed at the university, while the experts were interviewed in their offices.
2.3 Interviewing method
The thinking aloud method was chosen to gain access to the subjects' cognitive processes (Ericsson / Simon 1993, van Someren / Barnard / Sandberg 1994). In brief, it comprises four distinct elements: (1) a problem intended as a starting point for thinking aloud, in the present case pairs of claims suspected of insurance fraud; (2) the process of thinking aloud itself, in which subjects reveal their model of the underlying problem; this process is recorded on audio and / or video tape; (3) transcription of these records into written verbal protocols; (4) analysis and interpretation of the verbal protocols; in order to systematically collect and compare a large amount of textual data, content analysis (Weber 1990) was the method of choice.

The main benefit of the thinking aloud method is that it practically avoids any interviewer bias, a problem many other interviewing techniques face. On the other hand, as Biehal and Chakravarti (1989) point out, subjects generally have more time available to reflect on the problem than in a comparable situation outside the interview. While this may be disadvantageous for studying swift decisions, a more thorough analysis of the given problem appears useful for our purposes. Furthermore, as the process is recorded and as such often understood by subjects as being “in public”, they tend to carry out their task more carefully, for instance by reading the material more often and considering more details. Again, this speaks in favour of the thinking aloud method for our purposes.

The initial problem for the interview is provided by the cases suspected of insurance fraud. We organised these cases into pairs, asking subjects which of the two cases in such a set they considered to be fraudulent, and for which reasons. Please note that we are not interested in whether or not subjects correctly identify fraudulent cases. In fact, most cases were still unsettled at the time, leaving this question unanswered.
Rather, we are interested in the subjects' approach to reasoning and deciding. Cases were organised in six variants:

case #1 (crossing deer) and case #5 (stone)
case #1 (crossing deer) and case #7 (parked car)
case #1 (crossing deer) and case #8 (road roller)
case #2 (Lake Balaton) and case #5 (stone)
case #2 (Lake Balaton) and case #7 (parked car)
case #5 (stone) and case #8 (road roller)

Since the number of insurance fraud control specialists is quite limited, a full combination of all cases was infeasible. We therefore decided to combine cases that are extensively documented with cases that were only briefly described. The subjects' reasoning was recorded on audiotape during the interview. These recordings were then transcribed into verbal protocols.
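The pairing scheme can be checked with a short sketch. It enumerates all possible pairs of the five cases and compares them with the six variants listed above. Only the case numbers and pairings come from the text; the variable names are illustrative assumptions.

```python
from itertools import combinations
from collections import Counter

# Case numbers and the six pairings as listed in the text.
CASES = [1, 2, 5, 7, 8]
USED_PAIRS = [(1, 5), (1, 7), (1, 8), (2, 5), (2, 7), (5, 8)]

# All unordered pairs of five cases: C(5, 2) = 10.
all_pairs = list(combinations(CASES, 2))

# Pairs that were not presented to any subject.
unused_pairs = [pair for pair in all_pairs if pair not in USED_PAIRS]

# How often each case occurs across the six variants.
appearances = Counter(case for pair in USED_PAIRS for case in pair)

print(len(all_pairs))                 # 10
print(unused_pairs)                   # [(1, 2), (2, 8), (5, 7), (7, 8)]
print(sorted(appearances.items()))    # [(1, 3), (2, 2), (5, 3), (7, 2), (8, 2)]
```

Six of the ten possible pairs were used, and every case appears in at least two variants.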
3. Results

3.1 Formal Characteristics
This section gives an overview of what the verbal protocols in our research look like. Furthermore, it addresses the question whether there are significant differences in the treatment of the problem within the material. Our interviews produced 104 verbal protocols in total. Protocols varied in length, that is, some subjects expressed their thoughts very briefly, while others thoroughly reasoned out the given problem. Cognitive structures are referred to as thoughts and treated as one of the basic entities of content analysis (Ericsson / Simon 1993, 222; van Someren / Barnard / Sandberg 1994, 117). Another entity is the number of characters. Although the former appears more arbitrary than the latter, coder reliability assures consistency within the chosen framework. Two examples illustrate the different ways in which subjects treated the problem:²

Extract from protocol #2:
1. If it is raining and the surface is wet, the anti-lock system may not be the deciding factor, because for example wet leaves give special conditions that no driver wants
2. A sports chassis may indicate that the driver has a tendency to drive fast
3. First, he goes to the right to the roadside, then to the opposite side and then again back to the other side
4. These details appear quite strange to me; skidding three times back and forth happens rarely, after all, if it happens at all
5. [ . . . ]

² The original language of the case text and material, as well as of the protocols, is German.
Extract from protocol #59:
1. Um
2. It happens during wintertime
3. The daytime is not given
4. The road was surely slippery
5. or snow
6. and a deer
7. he probably startled
8. [ . . . ]

These examples also provide insight into the way the protocols were transcribed: every single thought (and even interjections) was singled out in a separate, numbered paragraph, thus providing the appropriate material for subsequent content analysis. Overall, more than 2,500 thoughts were recorded. Regarding the number of thoughts and the number of characters, we find that both distributions are positively skewed (i.e. many observations with low and moderate values versus few observations with very high values) with means of 25.45 and 1,613.24, respectively (Kolmogorov-Smirnov tests p < 0.001 for thoughts and characters per protocol). For both variables, variances appear homogeneous (Levene tests n.s. for thoughts and characters per protocol).³

Looking at differences between experts and non-experts, we find differences neither in the number of thoughts nor in the number of characters (χ²-tests for thoughts * expert / non-expert and characters * expert / non-expert n.s.). Thus, being an expert or a layperson is not a discriminator with respect to protocol length. Turning now to the combinations of cases, there are no differences to observe concerning the number of thoughts per protocol (χ²-tests for thoughts * combination of cases n.s.) and characters per protocol (χ²-tests for characters * combination of cases n.s.). Therefore, protocol length does not appear to depend on which cases subjects had to evaluate.

Since subjects were told that one of two cases is fraudulent, they might have come to a conclusion after having considered the first case, even though both cases were presented simultaneously. Contrary to that, detailed inspection of the protocols revealed that subjects reflected on both cases (z-test, p < 0.001).
Furthermore, their decision did not depend on which case they considered first (χ²-test for first / second case * alternative considered n.s.). Both facts can be taken as evidence that subjects were well motivated and fulfilled their task thoroughly. The results hold when we control for experts versus non-experts (Kruskal-Wallis for experts / non-experts p < 0.001) and for combination of cases (Kruskal-Wallis for combination of cases p < 0.001).

In summary, there are several results from the formal analysis: Firstly, subjects appear well motivated and willing to treat the presented problem in depth. We do not observe significant differences between experts and non-experts in this respect. Secondly, considering the inclusion of alternatives, case sequence and case combinations, there are no structural differences to note. It follows that there appears to be sufficient evidence that all protocols are equally suitable for further analysis.

³ The above examples show differences in verbalization skills, which are, however, unavoidable (van Someren / Barnard / Sandberg 1994, 35).

3.2 Qualitative content analysis
3.2.1 General

It is quite obvious that subjects use different language in treating the problem in question, that is, the terms and expressions they use and their lines of argument often vary considerably. Some of the aspects that appear in verbal protocols may be grouped together, sometimes allowing for statistical treatment of these data. Such grouping, however, may be prone to overlooking peculiar characteristics of meaning and context. Furthermore, even in a large sample as in the present research, specific statements may appear only rarely. Under these circumstances, statistical tests lose their otherwise important value and are replaced by other techniques. Therefore, in the following sections, tests are only run when allowed by contextual and technical factors.

3.2.2 Categories

The core of content analysis lies in the extraction of aspect categories from the recordings. That is, what subjects mention during the experiment (“aspects”, sometimes also named “themes”) is grouped into categories. Please note that “aspects” differ from the above-mentioned “thoughts”: While the term “thought” refers to the structure of the interview, “aspect” refers to its contents. Of course, subjects rarely use exactly the same words or expressions when discussing the given problem. Therefore, building categories might appear somewhat arbitrary. To counter this objection, the concepts of inter-coder reliability and intra-coder reliability have been developed, measuring the degree of agreement between two coders and between codings by the same coder at two different points in time, respectively. A coding scheme is usually developed in several
runs, where the coding scheme for the same text is varied until it fits all aspects and until categories are unambiguous. Mismatches were traced and the errors so revealed, mostly omitted encodings, were eliminated. At the end of this process, coder reliability exceeded 90% for all cases, which is deemed very satisfactory (Früh 2001). The following table gives an overview and the basis for further analysis.

Case | Aspect categories, sorted by frequency (in brackets)
1 | A deer crossed the road (28), the accident happens in late November (26), the road may have been wet (23), the car skids three times (20), the car is only two years old (17), the car is equipped with ABS (17), the driver probably went too fast (17), the accident has been reported to the police (13), damage is € 5,000 (9), the incident appears normal (8), which kind of insurance is involved? (6), the car is an Alfa (4), sketch of the scene (4), the accident happens on a country road (2), the car is strong (2)
2 | Second car key and documents are missing (31), the previous car owner has been involved in insurance fraud (18), the car is 4 years old (16), the damage is € 7,000 (14), the car has already been damaged (13), the car was stolen in Hungary (11), the car is a Volvo (8)
5 | The stone hits the side door (26), the damage is € 500 (15), the incident appears normal (13), the driver probably did not keep safety distance (12), how can a stone fall off? (11), the driver probably jerked the steering wheel around (10), the driver probably went too fast (6), the driver probably overtook the lorry (3), which kind of insurance is involved? (1)
7 | The picture shows a car door (18), the damage is € 3,000 (17), scratches on the car door (17), the car is four years old (8), the car is a Mazda (6), the damage has been photographed (2), which kind of insurance is involved? (1)
8 | The driver has not received payment for a longer period of time (24), how fast can a road roller go? (21), a road roller is a very heavy vehicle (18), the driver probably wants to take revenge on his employer (16), a road roller is a very large vehicle (10), the damage is € 10,000 (10)
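The text reports coder reliability above 90% without stating which coefficient was used. As a minimal sketch, assuming a simple overlap measure of the Holsti type, CR = 2M / (N1 + N2), where M is the number of coding decisions on which both coders agree and N1 and N2 are the numbers of codings made by each coder, the calculation could look as follows. The function name and the codings are hypothetical and serve only as an illustration.

```python
def overlap_reliability(coder_a, coder_b):
    """Holsti-type overlap: 2 * agreements / (codings by A + codings by B).

    Both arguments map a thought number to the category a coder assigned.
    A thought coded by only one coder (an omitted encoding) lowers the score.
    """
    agreements = sum(
        1 for thought, category in coder_a.items()
        if coder_b.get(thought) == category
    )
    return 2 * agreements / (len(coder_a) + len(coder_b))


# Hypothetical codings of ten thoughts; coder B omitted thought 10.
coder_a = {1: "deer", 2: "november", 3: "wet road", 4: "skids", 5: "ABS",
           6: "speed", 7: "police", 8: "damage", 9: "car age", 10: "sketch"}
coder_b = {1: "deer", 2: "november", 3: "wet road", 4: "skids", 5: "ABS",
           6: "speed", 7: "police", 8: "damage", 9: "car age"}

print(round(overlap_reliability(coder_a, coder_b), 3))  # 0.947
```

In this made-up example a single omitted encoding out of ten thoughts still leaves the coefficient above 0.9, which shows how quickly omissions affect such a measure.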
3.2.3 Number of aspects considered

The number of aspects may be taken as an indicator of case complexity. There are several dimensions to consider: (1) richness of material: if a case is described in much detail, it offers more aspects to evaluate; (2) association of ideas: even if a case is described briefly, subjects may consider aspects that are not directly taken from the case description; (3) the solution of the problem may appear straightforward, without leaving doubts, so that a limited number of arguments seems sufficient.

case              | 1   | 2   | 5   | 7   | 8
max. # of aspects | 10  | 7   | 5   | 5   | 5
average           | 4.7 | 3.3 | 2.3 | 2.3 | 2.6
For all cases, the distribution of aspects considered is unimodal and positively skewed (Kolmogorov-Smirnov p < 0.001 for both experts and non-experts), i.e. the majority of subjects take relatively few aspects into account. Non-experts consider somewhat more aspects than experts. However, this difference is not statistically significant (χ²-test, n.s.). Cases #1 and #2 differ from the others with respect to the number of aspects taken into account by subjects (Kruskal-Wallis controlling for cases p < 0.001). As noted above, this does not come as a surprise since these cases provide more material and tell a more contrived story.

The message from these results is mixed. On the one hand, a larger amount of available information appears to make subjects evaluate more aspects of the case. Requesting more details from the claimant may thus make fraud detection easier. On the other hand, as the distribution of aspects taken into consideration shows, subjects are far from exhausting all available information. Notably, in fewer than ten percent of cases did subjects cautiously express regret that case descriptions offered only limited information, and in only four cases (two experts, two laymen) did they explicitly demand additional details, without specifying which. Potential fraudsters may thus have a good chance to conceal details that might raise suspicion.

3.2.4 Aspect ranks

The importance of single aspects varies between experts and non-experts and also depends on the case. Spearman's rank correlation coefficient rs shows a high degree of correspondence for cases #2 and #5 (rs = 0.935, p < 0.01 and rs = 0.930, p < 0.001, respectively), that is, experts and non-experts consider aspects similarly. There are, however, substantial differences for the other cases (rs < 0.59; n.s.). With respect to aspects appearing in more than just one case description, the results show the following: The question of insurance is more important for experts.
This effect is most pronounced for case #1, where insurance is the aspect most often mentioned by experts and least often by non-experts. For other cases, this effect is somewhat weaker. While the amount of damage is largely ignored by non-experts, it appears very important for experts, particularly for case #8 and less distinctly for other cases. Interestingly, the higher the sum of money involved, the higher experts rank the amount of damage relative to non-experts. That is, differences in rank are highest for case #8 (where damage is € 10,000), diminishing until ranks correspond for case #5 (€ 500). While the car type is more often taken into account by experts in cases #1 and #7, there is no difference for case #2 (and it is ignored by both groups in case #5).
Other aspects appearing in two or more case descriptions, such as type and age of the car, do not show a particular pattern. Turning to variation within individual cases, and omitting the highly corresponding cases #2 and #5, we observe the following: The most pronounced differences appear in case #1, with the aspects “the road may have been wet” and “the driver probably went too fast” being more important to non-experts, and “the car is equipped with ABS” to experts. For case #7, experts tend to look more closely at the number, direction and size of scratches on the car door, while non-experts confine themselves to merely stating that the accompanying picture shows damage to a car door. With respect to case #8, non-experts speculate a lot about collaboration between driver and employer in staging the accident. All other differences in ranks are only marginal. Overall, while there is a great deal of correspondence between experts and non-experts concerning the aspects they take into consideration, experts appear to have an eye for details and stick more closely to the case material, a point discussed in more depth in the following section.

3.2.5 Original and derivative aspects

Aspects may be taken directly from case text or material (original aspects), or they may stem from associative thinking, thus being only indirectly related to case material (derivative aspects). Both variants appear in the present work. By comparing the presented text and material with the interview transcriptions, original aspects are easily determined. All others are counted in the derivative category.

Case | Original | Derivative
1 | A deer crossed the road (28), the accident happens in late November (26), the car skids three times (20), the car is only two years old (17), the car is equipped with ABS (17), the accident has been reported to the police (13), damage is € 5,000 (9), the car is an Alfa (4), sketch of the scene (4), the accident happens on a country road (2), the car is strong (2) | The road may have been wet (23), the driver probably went too fast (17), the incident appears normal (8), which kind of insurance is involved? (6)
2 | Second car key and documents are missing (31), the previous car owner has been involved in insurance fraud (18), the car is 4 years old (16), the damage is € 7,000 (14), the car has already been damaged (13), the car was stolen in Hungary (11), the car is a Volvo (8) | (none)
5 | The stone hits the side door (26), the damage is € 500 (15), the driver probably jerked the steering wheel around (10) | The incident appears normal (13), the driver probably did not keep safety distance (12), how can a stone fall off? (11), the driver probably went too fast (6), the driver probably overtook the lorry (3), which kind of insurance is involved? (1)
7 | The picture shows a car door (18), the damage is € 3,000 (17), scratches on the car door (17), the car is four years old (8), the car is a Mazda (6), the damage has been photographed (2) | Which kind of insurance is involved? (1)
8 | The driver has not received payment for a longer period of time (24), the damage is € 10,000 (10) | How fast can a road roller go? (21), a road roller is a very heavy vehicle (18), the driver probably wants to take revenge on his employer (16), a road roller is a very large vehicle (10)
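The frequencies in the table above allow the share of original aspects to be recomputed per case. The following sketch assumes that shares are based on mention counts (the numbers in brackets), which the text does not state explicitly; the function and variable names are illustrative.

```python
# Mention counts per case, copied from the table above: (original, derivative).
MENTIONS = {
    1: ([28, 26, 20, 17, 17, 13, 9, 4, 4, 2, 2], [23, 17, 8, 6]),
    2: ([31, 18, 16, 14, 13, 11, 8], []),
    5: ([26, 15, 10], [13, 12, 11, 6, 3, 1]),
    7: ([18, 17, 17, 8, 6, 2], [1]),
    8: ([24, 10], [21, 18, 16, 10]),
}


def original_share(original, derivative):
    """Percentage of all mentions that refer to original aspects."""
    total = sum(original) + sum(derivative)
    return 100 * sum(original) / total


# Share per case.
shares = {case: original_share(o, d) for case, (o, d) in MENTIONS.items()}
for case, share in shares.items():
    print(f"case #{case}: {share:.2f} % original")

# Share over all mentions pooled across the five cases.
pooled = original_share(
    [n for o, _ in MENTIONS.values() for n in o],
    [n for _, d in MENTIONS.values() for n in d],
)
print(f"pooled: {pooled:.2f} %")
```

Under this counting, case #8 comes out at 34.34 % and case #2 at 100 %. The pooled share is about 71 %, so an overall figure that deviates from this may rest on a slightly different aggregation, which the source does not specify.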
Overall, the majority of aspects (69.23 %) is taken from case text or material. For experts, this share is significantly higher (χ²-test, p < 0.001). That is, experts rely more on readily available information, while non-experts tend to bring in more ideas of their own. The occurrence of original versus derivative aspects, however, varies even more between cases, with case #2 (100 % original aspects) and case #8 (34.34 % original aspects) at the extremes. Thus, the overall preference for original aspects is due to case #2. If we exclude this particular case from the analysis, both aspect classes are equal.

Within cases (except #2, which has no derivative aspects), original aspects receive more attention than others in case #5, while the results are more varied for other cases. As for case #1, subjects consider some original aspects, such as “a deer crossed the road” or “the accident happens in late November”, very often, while others, for instance the very prominent sketch of the scene, are hardly noticed. On the other hand, some derivative aspects like “the road may have been wet” or “the driver probably went too fast” appear very often in the protocols. Similar results are obtained for the remaining cases. Overall, there is no particular pattern in favour of or against one of the classes, original or derivative.

While experts exhibit a general tendency to consider original rather than derivative aspects, insurance is the only noteworthy exception. In the particular context of insurance fraud, however, this does no harm to the overall conclusion that experts rather count on the information available. Except for the car type in case #5, all information presented in the accompanying text is considered at least once. This, of course, holds with respect to
the aggregate. On the individual level, subjects always treated only a limited number of aspects.

3.2.6 First idea

The reason for analysing the first idea that comes to one's mind is that, while it may not be the very best argument, it appears important to the decision-maker.

Case | First ideas, sorted by frequency (in brackets)
1 | A deer crossed the road (10), the car is only two years old (7), the accident happens in late November (7), the car skids three times (4), the incident appears normal (4), the car is equipped with ABS (4), which kind of insurance is involved? (3), sketch of the scene (2), damage is € 5,000 (2), the road may have been wet (1), the car is an Alfa (1), the accident happens on a country road (1), the car is strong (1), the driver probably went too fast (0), the accident has been reported to the police (0)
2 | The car was stolen in Hungary (8), second car key and documents are missing (8), the car is a Volvo (6), the damage is € 7,000 (4), the car is 4 years old (3), the previous car owner has been involved in insurance fraud (3), the car has already been damaged (2)
5 | The incident appears normal (13), the stone hits the side door (12), how can a stone fall off? (5), the damage is € 500 (5), the driver probably did not keep safety distance (4), the driver probably went too fast (1), the driver probably jerked the steering wheel around (1), the driver probably overtook the lorry (0), which kind of insurance is involved? (0)
7 | The picture shows a car door (14), scratches on the car door (8), the damage is € 3,000 (5), the car is a Mazda (3), the car is four years old (2), the damage has been photographed (1), which kind of insurance is involved? (1)
8 | A road roller is a very heavy vehicle (10), how fast can a road roller go? (9), a road roller is a very large vehicle (8), the driver has not received payment for a longer period of time (6), the damage is € 10,000 (5), the driver probably wants to take revenge on his employer (0)
There are considerable differences between the rankings of aspects mentioned as first idea and as a whole (rs < 0.67; n.s.). As is apparent from the above table (indicated by frequency 0), subjects consider some aspects only later during their interview. First ideas are often either summarizing (“the incident appears normal”) or deal with aspects taken directly from the text or material of a case. The exception is case #8, where subjects generally start by reasoning about the characteristics of a road roller. At the other end of the spectrum, ideas coming up in later stages of the interview are often speculative in nature, for instance “the road may have been wet”, “the driver probably went too fast” (both case #1), “the driver probably overtook the lorry”, “the driver probably jerked the steering wheel around” (both case #5), “the driver probably wants to take revenge on his employer” (case #8).
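To illustrate how such a rank comparison works, the following sketch computes Spearman's rs for case #8 from the two frequency tables (overall mentions versus first ideas). It is not meant to reproduce the tests reported in the text, whose exact setup is not given; ties are handled by average ranks, and the function names are illustrative.

```python
def average_ranks(values):
    """Rank values from 1 (smallest) upward; ties share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of rank positions start+1 .. end+1
        for position in range(start, end + 1):
            ranks[order[position]] = shared
        start = end + 1
    return ranks


def spearman(x, y):
    """Spearman's rs as the Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mean_x, mean_y = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Case #8, aspects in the same order in both lists (counts from the two tables):
# no payment, roller speed, heavy vehicle, revenge, large vehicle, damage amount
overall = [24, 21, 18, 16, 10, 10]
first_idea = [6, 9, 10, 0, 8, 5]

print(round(spearman(overall, first_idea), 2))  # 0.32
```

For this case, the aspect mentioned most often overall (no payment) is only the fourth most frequent first idea, and the revenge motive never comes first, which is why the coefficient is low.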
Finally, first ideas by experts and non-experts differ considerably (rs < 0.57, n.s.). The most remarkable differences can again be found concerning insurance and the amount of loss, both more often mentioned by experts as first ideas. Aspects to which non-experts tend to turn first are more varied, with no particular pattern to discern.

3.3 Comparison of ex ante and ex post characteristics
Quite naturally, insurance companies are interested in identifying indicators of insurance fraud. Approaches for detecting something abnormal about a claim cover a wide range of measures, from relatively simple checklists for claims adjusters (some examples are provided by Edelbacher 1995) to automated algorithms for screening claims (for instance Brockett / Xia / Derrig 1998 or Major / Riedinger 2002). As we have noted earlier, detection systems are based on claims successfully identified as fraudulent. Since this may constitute an inherent bias, the question arises whether we find agreement or differences using the present approach.

As with the cases used for this work, many reports on fraudulent claims are in the area of automobile insurance. In what is probably the most extensive overview, Belhadji / Dionne / Tarkhani (2000) collect more than 50 fraud indicators, 23 of which later prove to be significant predictors. Edelbacher (1995), Artis / Ayuso / Guillén (1999, 2002) and Tennyson / Salsas-Forn (2002) present some additional signs of automobile insurance fraud. Comparison is further limited to those indicators that correspond with information presented in the case material. The following table lists items suitable for comparison.

loss exceeding € 2,000
minor collision produced large damage
damage implausible
no police report
accident in nonurban area or on a quiet road
accident involving a single vehicle
accident involving a parked car
accident involving an ordinary old car
accident happened while car backed up
accident involved collision with a wall
the car has many extras
the insured is having financial difficulties
accident involving foreign citizens
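As an illustration of how a single checklist item of this kind might be applied, the sketch below tests the loss threshold of 2,000 against the loss amounts stated for the five cases in the aspect tables. The dictionary layout and function name are assumptions made for this example; real screening systems combine many indicators.

```python
# Loss amounts as stated in the case material (same currency unit as the threshold).
LOSS_BY_CASE = {1: 5_000, 2: 7_000, 5: 500, 7: 3_000, 8: 10_000}

# Threshold taken from the indicator "loss exceeding 2,000".
LOSS_THRESHOLD = 2_000


def cases_exceeding(losses, threshold):
    """Return the case numbers whose stated loss is above the threshold."""
    return sorted(case for case, loss in losses.items() if loss > threshold)


print(cases_exceeding(LOSS_BY_CASE, LOSS_THRESHOLD))  # [1, 2, 7, 8]
```

Only case #5 stays below the threshold, so this indicator alone would flag four of the five cases and hardly discriminates between them.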
For all five cases, the amount of loss was explicitly stated. Except for case #5, the loss amount well exceeded the € 2,000 limit. Despite this prominent status, only a third of subjects remark on this matter. This result does not depend on the case. In particular, case #5, where the loss is very low, does not differ from other cases in this respect. Further, there are no significant differences to observe between experts and non-experts. A small number (4) of subjects say that fraud would not pay for a small amount of € 500, and one subject thinks that it would pay for € 7,000. Overall, there is little evidence that subjects consider the mere size of loss as an indicator of insurance fraud.

The vast majority of remarks involving the amount of loss are about the plausibility or causality of loss size. Examples are considerations such as “is a four year old Volvo worth € 7,000?” or “can this accident cause a damage of € 500?”, thus corresponding with the search items “minor collision produced large damage” and “damage implausible”. Conclusions on these questions were very controversial: one and the same loss size was judged plausible and implausible alike, regardless of whether the judgment came from an expert or from a non-expert. Plausibility of loss size therefore offers no straightforward clue to fraud detection.

Material for cases #1 and #2 contained reference to a police report and details of a record by the police, respectively. Well under a third of subjects mentioned the police report in case #1, almost all of them in the sense that “you cannot cheat the police”, i.e. presence of the police on the spot made the claim more credible. In a similar manner, a police record concerning the previous owner of the stolen car raised speculations about collusion between him and the owner at the time of the theft. Police reports in cases #1 and #2, therefore, led to considerations of plausibility and causality, much as information about loss size did. Experts and non-experts did not differ in this respect. Since a missing police report is regarded as an indicator of insurance fraud, and since most of the cases did not have one, we expected this to be echoed in the verbal protocols. Surprisingly, only two subjects (non-experts) mildly demanded one, stating that it would be helpful for evaluating the case.

Two accidents, cases #1 and #5 specifically, happened on a country road, as indicated in the case material. According to previous research and investigation manuals, accidents happening in nonurban areas are regarded as suspicious.
Nonetheless, only four non-experts (two for each case) and – notably – none of the experts mentioned this detail during the interview.
Case #1 was the only accident with a single vehicle involved. This circumstance, while regarded as suspicious of fraud in other sources, was not discussed further in the present study, neither by experts nor by non-experts. Instead, subjects tried to imagine how accidents involving more than one car could have happened, considering aspects such as distance, relative speed and driving manoeuvres. As with the judgments concerning loss size discussed above, these are considerations of plausibility or causality rather than of the number of vehicles involved. Again, as with plausibility of loss size, these considerations do not differ between experts and non-experts.
Subjects raised even more doubts concerning a claim's legitimacy when incidents involved a parked vehicle. This was most pronounced in case #7, where the majority of respondents questioned whether the pictured damage is plausible for damage to a parked car. The importance of this issue is further underlined by the fact that it is usually mentioned as the first idea. Experts treat this aspect more thoroughly than non-experts, discussing details of the place, direction and size of the scratches.
The cars contained in the cases of this research, as far as information is available, are very much everyday examples: they do not come from an exceptional manufacturer and they are at most four years old. Age appears to be a controversial characteristic: some considered four years old, others relatively new for a car. Nevertheless, only about ten percent of subjects, irrespective of being an expert or not, think of car age when evaluating the cases, and if they do so, they have no common notion of whether car age points towards fraud or against it.
Case #8 was the only one in which the accident happened while backing up. Slightly more than half of the subjects (with no difference between experts and non-experts) discussed this issue, all considering whether the speed would be enough to turn over the road roller.
None of the cases had information concerning collision with a wall. However, the scratches on the car door in the picture of case #7 raised speculations (three subjects, all of them non-experts) that the car might have had contact with a wall.
Case #1 stated that the car had a number of extras. Little more than a third of subjects considered this characteristic, with no difference between experts and non-experts. Generally, the presence of car extras leads to the speculation that the accident was caused by excessive speed.
The description of case #8 mentioned that the truck's driver had not received payment for some time.
About two thirds of subjects (experts and non-experts alike) interpreted this to mean that he might have had financial difficulties and thus – in collaboration with his employer – might try to cheat the insurance company. This is not directly comparable to the fraud indicator, where the insured himself is experiencing financial difficulties, a discrepancy which subjects tended to explain by collaboration between employee and employer.
Although there was no information that foreign citizens were involved in any of the cases, the car in case #2 was reported stolen in Hungary. About a third of subjects (experts and non-experts alike) took this fact to make the incident appear more suspicious.
Overall, subjects consider these aspects only to a very limited extent. That is, while the majority of subjects discusses three characteristics (financial
difficulties, parked vehicle and backing up), all others remain practically disregarded. Bearing in mind that previous work has found these aspects to be good indicators for insurance fraud, this result comes as a surprise. Also astonishing is the fact that experts do not appear to have an advantage over non-experts in this respect. None of the subjects considered aspects that do not appear in case text or material, but are regarded as significant predictors elsewhere. In other words, none of the derivative aspects appearing in the interviews corresponds with ex post characteristics of insurance fraud found in other studies.
4. Discussion, conclusions and limitations
The above analysis consists of two major parts: First, it gives a novel and detailed account of thinking about insurance fraud; second, it contrasts present findings with characteristics that have been significant for insurance fraud ex post.
Turning to the first area of interest, formal analysis shows that protocols by experts and those by non-experts appear indistinguishable, at least at first glance. In particular, we do not find differences between these groups concerning formal aspects, such as protocol length, as measured by number of thoughts and number of characters. In a similar manner, formal measures did not vary depending on the combination of cases, that is, protocol length does not appear to be influenced by the problem sets presented to subjects. Furthermore, we established that subjects did indeed consider both cases in the presented set and that their decision did not depend on which case they discussed first. These results support the assumption that time and effort to fulfil the task are similar between subject and task groups.
Analysis of protocol contents produces a detailed report of aspects. Consistent with formal results, overall, most subjects take relatively few aspects into account, while some consider many aspects.
Two cases, #1 and #2, differ markedly from the others, in that more aspects are treated. Notably, these two cases also contain more material. Given that we do not observe differences between cases with respect to formal measures, the interpretation is twofold: First, a larger quantity of material is reflected in a greater variety of aspects considered by subjects; second, limiting factors are workload (expressed by formal measures), which does not increase when more material is presented, and inefficient use of information (in that subjects by no means exhaust all details given). Practically speaking, while it may sound useful for fraud control to gather as many facts as possible, the effect may be only marginal. These results hold for experts and non-experts alike.
Aspect ranks capture the overall importance to subjects. While analysis until this point shows similar results for experts and non-experts, there are
some differences to note concerning aspect ranks. In particular, experts exhibit an overall tendency to look closer at the kind of insurance involved and at the extent of damage, two facets that appear important in the given context. Furthermore, when we distinguish between aspects that are taken directly from case material (original aspects) and others that go beyond it (derivative aspects), it turns out that experts rely more on readily available information. This result may indicate that experts are less prone to speculation, since many derivative aspects appear to be based on guessing rather than facts. On the other hand, "going beyond the obvious" may open the path to alternative explanations that – in actual fraud cases – could be further investigated. This view, however, is countered by the fact that only very few subjects demanded additional information, and they did so very mildly, largely leaving consideration of derivative aspects in the area of speculation. First ideas are often taken from case material or they are summarizing, for instance stating that an incident appears normal.
Overall, experts and non-experts appear similar in their treatment of cases at a general level; differences only appear when protocols are analysed in more detail. The results are quite mixed: Experts do not lead concerning the intensity of analysis. Rather, they appear to be subject to the same limitations as non-experts, in that they content themselves with analysing only small portions of available evidence. Experts are, however, somewhat different in picking clues. Some of these clues seem very reasonable for the problem in question, for instance considering loss amounts and insurance particulars. On the whole, these expert advantages remain relatively small.
Comparison of ex post characteristics of insurance fraud identified in earlier work with aspects considered in the present research constitutes the second area of interest.
Thirteen of these fraud indicators were linked to case text or material. The results show that overall, subjects perform weakly in considering these indicators of insurance fraud. There would be reason to assume that, since these characteristics are widely known to experts, they receive particular attention and that experts are superior to non-experts in picking the suspicious pieces of information. However, neither of these assumptions holds on the basis of this analysis.
Many argue that there is a significant number of undetected cases of insurance fraud. In fact, the present results suggest that fraudsters may have quite a good chance of escaping discovery. In particular, much of the information available is ignored. Even the subjects performing best do not come close to discussing a case in full, let alone those performing at or below average. Furthermore, known indicators of insurance fraud do not receive proper attention: If such hints are available in the case descriptions, they are often neglected, and if they are not, subjects tend not to request additional information. Finally, performance of fraud control experts is only marginally better than that of laypeople.
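The comparison of how often each fraud indicator is mentioned by experts and by non-experts can be sketched in a few lines of Python. This is a minimal illustration only: the protocol records, the indicator labels used as keys and the function name `mention_shares` are hypothetical placeholders and do not reproduce data, coding schemes or tooling from this study.

```python
from collections import defaultdict

# Hypothetical coded protocols: (group, set of indicators mentioned).
# Placeholder values for illustration, NOT results of the study.
protocols = [
    ("expert", {"parked vehicle", "financial difficulties"}),
    ("expert", {"parked vehicle"}),
    ("non-expert", {"financial difficulties", "country road"}),
    ("non-expert", set()),
]


def mention_shares(protocols):
    """Return, per indicator, the share of protocols in each group mentioning it."""
    counts = defaultdict(lambda: defaultdict(int))  # indicator -> group -> count
    group_sizes = defaultdict(int)                  # group -> number of protocols
    for group, mentioned in protocols:
        group_sizes[group] += 1
        for indicator in mentioned:
            counts[indicator][group] += 1
    # Groups that never mention an indicator get a share of 0.0.
    return {
        indicator: {g: counts[indicator][g] / n for g, n in group_sizes.items()}
        for indicator in counts
    }


shares = mention_shares(protocols)
for indicator in sorted(shares):
    print(indicator, shares[indicator])
```

With real coded protocols in place of the placeholders, a table of this kind would make visible which indicators are discussed by a majority of either group and which remain practically disregarded.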
Certainly, there are some limitations to this study worth noting. With respect to cases and accompanying material, we think that we did the utmost to work close to reality. Each of the cases could actually have landed on each of the subjects' desks for evaluation (and, in fact, the cases were subject to investigation, which was revealed to the subjects).
As a method, verbal protocols are well-founded, but of course cannot give an absolutely full account of what people think during analysis. There is little, however, to improve. While we concede that in our interviews, subjects are not responsible for the quality of their analysis, which probably reduces their effectiveness, they appeared well motivated to fulfil their task. Participation is probably an alternative method for future work, although it may face other limitations.
Critics may argue that the number of participants is limited. This may, however, overlook that verbal protocol studies often work with a far smaller number of subjects, in particular when analysing expert groups. Instead, material richness, which is very high in the present work, is commonly regarded as the key issue. Yet, as with many other studies involving verbal protocols, increasing detail is accompanied by limited opportunities for statistical tests. Since interest in such studies usually lies in the subtleties of problem treatment, this disadvantage is inherent to aim and method. It should be noted, however, that ex post studies raise methodological concerns of their own.
In light of this, the present results point to a serious gap between what instructions suggest and how people actually treat suspicious cases. Room for improvement appears to lie in making the most of what is available for assessment and in integrating aspects judged important in this research into checklists and handbooks.
5. Appendix
Case #1 "crossing deer"
Text
As a result of a deer crossing the road, a vehicle (see material A) goes into a skid, crashing into several obstacles (see material B and C). The car has many extras, among them an anti-lock braking system, air conditioning, and a sports chassis. It is little over two years old at the time of the accident, which happens on a country road in late November. Further details about visibility and road conditions are unknown. The accident has been recorded by the police. Damage repair costs about € 5,000.
Material A
Material A represents basic vehicle data, such as type of car, colour, chassis number, mileometer reading, engine and registration details.
Material B
In material B, the claimant gives a handwritten account of the incident as part of the procedure to file a claim.
Material C
In addition to his verbal report, the claimant produces a detailed sketch of the scene in material C.
Case #2 "Lake Balaton"
Text
While on a trip to Lake Balaton (Hungary), a Volvo 440 is reported stolen. The car is four years old. Estimated loss is € 7,000. During investigation of the case it turns out that the vehicle had already been damaged when it was sold to the present owner (see police report, material D). The records also contain a memo about the previous proprietor (material E). The present owner is unable to find the second car key, car documents and the contract for sale.
Material D
Material D contains a police report about an accident in which the previous car owner was involved. The report describes the scene and the course of events in detail; however, it remains vague regarding car damage.
Material E
The insurance company questioned the previous car owner in the matter of the above accident. The memo mentions that this person is accused of other offences, among them insurance fraud.
Case #5 "stone"
Text
On a country road, a Renault Mégane drives behind a lorry, when a stone falls off the loading space. In spite of an evasive manoeuvre, the stone hits the Renault's side door, causing € 500 damage.
Case #7 "parked car"
Text
A parked Mazda 626 is damaged by an unknown vehicle (see material F). Damage repair cost is about € 3,000. The Mazda is four years old at the time of damage.
Material F
The picture shows the right front car door which has been removed from the car and stripped of door handle and glazing. Several scratches run across the door.
Case #8 "road roller"
Text
While turning his truck around, the driver overlooks a road roller (see material G) which is parked at the roadside. Subsequently, the roller falls over. Damage is about € 10,000. Closer investigation reveals that the employer has not paid the driver's wages for quite a while.
Material G
The picture shows the road roller, recovered after the incident.
References
Artis, Manuel / Ayuso, Mercedes / Guillén, Montserrat (1999): Modelling Different Types of Automobile Fraud Behaviour in the Spanish Market; Insurance: Mathematics and Economics 24: 67–81.
Artis, Manuel / Ayuso, Mercedes / Guillén, Montserrat (2002): Detection of Automobile Insurance Fraud with Discrete Choice Models and Misclassified Claims; Journal of Risk and Insurance 69 (3): 325–340.
Belhadji, El Bachir / Dionne, Georges / Tarkhani, Faouzi (2000): A Model for the Detection of Insurance Fraud; Geneva Papers on Risk and Insurance 25 (4): 517–538.
Biehal, Gabriel / Chakravarti, Dipankar (1989): The Effects of Concurrent Verbalization on Choice Processing; Journal of Marketing Research 26 (1): 84–96.
Biggs, Stanley / Rosman, Andrew / Sergenian, Gail (1993): Methodological Issues in Judgment and Decision-making Research: Concurrent Verbal Protocol Validity and Simultaneous Traces of Process; Journal of Behavioral Decision Making 6: 187–206.
Brockett, Patrick / Xia, Xiaohua / Derrig, Richard (1998): Using Kohonen's Self Organizing Feature Map to Uncover Automobile Bodily Injury Claims Fraud; Journal of Risk and Insurance 65 (2): 245–274.
Brockett, Patrick / Derrig, Richard / Golden, Linda / Levine, Arnold / Alpert, Mark (2002): Fraud Classification using Principal Component Analysis of RDITs; Journal of Risk and Insurance 69 (3): 341–371.
Derrig, Richard (2002): Insurance Fraud; Journal of Risk and Insurance 69 (3): 271–287.
Edelbacher, Max (1995): Versicherungsbetrug kennt keine Grenzen, Aspang.
Ericsson, Anders / Simon, Herbert (1993): Protocol Analysis. Verbal Reports as Data; Cambridge / London.
Früh, Werner (2001): Inhaltsanalyse. Theorie und Praxis, 5th ed., Konstanz.
Krippendorff, Klaus (2004): Content Analysis. An Introduction to its Methodology, 2nd ed., Thousand Oaks / London / New Delhi.
Major, John / Riedinger, Dan (2002): EFD: A Hybrid Knowledge / Statistical-Based System for the Detection of Fraud; Journal of Risk and Insurance 69 (3): 309–324.
Tennyson, Sharon / Salsas-Forn, Pau (2002): Claims Auditing in Automobile Insurance: Fraud Detection and Deterrence Objectives; Journal of Risk and Insurance 69 (3): 289–308.
van Someren, Maarten / Barnard, Yvonne / Sandberg, Jacobijn (1994): The Think Aloud Method. A Practical Guide to Modelling Cognitive Processes; London et al.
Viaene, Stijn / Dedene, Guido (2004): Insurance Fraud: Issues and Challenges; Geneva Papers on Risk and Insurance 29 (2): 313–333.
Viaene, Stijn / Derrig, Richard / Baesens, Bart / Dedene, Guido (2002): A Comparison of State-of-the-Art Classification Techniques for Expert Automobile Insurance Claim Fraud Detection; Journal of Risk and Insurance 69 (3): 373–421.
Weber, Robert (1990): Basic Content Analysis, 2nd ed., Newbury Park / London / New Delhi.
Abstract
Insurance fraud is widely believed to be common, but little is known about how to detect it. In recent years, some attempts have been made to find indicators for fraud. They are, however, probably hampered when relying on characteristics of established fraud, since the majority of fraudulent cases then remains excluded, leaving many white spots on the map. In choosing a different approach, we let subjects reason freely about insurance fraud, recording and analysing their clues, and comparing them to indicators found in previous research. Our findings show that not only is much of the available information ignored, but subjects also tend to concentrate on aspects other than supposedly reliable fraud characteristics, and experts fail to fare better than laypeople.
Summary
It is widely assumed that insurance fraud has reached considerable dimensions. Indicator systems intended to support the fight against fraud often suffer from the problem that they are based on cases in which fraud was actually proven, which raises further problems in view of the presumed number of undetected cases. In the present study, subjects analyse possible fraud situations; their lines of thought are aggregated and compared with known indicators. It turns out that only little of the available information is actually included in the analysis and that subjects widely concentrate on indicators other than those proposed. A control group of students does not show substantially worse results than the group of fraud control experts.