{{ExpPsy}}

The '''base rate fallacy''', also called '''base rate neglect''' or '''base rate bias''', is an error that occurs when the [[conditional probability]] of some hypothesis H given some evidence E is assessed without taking into account the "[[base rate]]" or "[[prior probability]]" of H and the total probability of evidence E.<ref>http://www.fallacyfiles.org/baserate.html</ref>
==Example==
In a city of 1 million inhabitants there are 100 known terrorists and 999,900 non-terrorists. The [[base rate]] probability of one random inhabitant of the city being a terrorist is thus 0.0001, and the base rate probability of a random inhabitant being a non-terrorist is 0.9999. In an attempt to catch the terrorists, the city installs a surveillance camera with automatic [[facial recognition system|facial recognition software]]. The software has two failure rates of 1%:

# If the camera sees a terrorist, it will ring a bell 99% of the time and mistakenly fail to ring it 1% of the time (in other words, the false-negative rate is 1%).
# If the camera sees a non-terrorist, it will not ring the bell 99% of the time, but it will mistakenly ring it 1% of the time (the false-positive rate is 1%).

So, the failure rate of the camera is always 1%.

Suppose somebody triggers the alarm. What is the chance they are a terrorist?

Someone committing the base rate fallacy would incorrectly claim that there is a 99% chance the person is a terrorist, because 'the' failure rate of the camera is always 1%. Although this ''seems'' to make sense, it is bad reasoning: the calculation below shows that the chance the person is a terrorist is actually near 1%, not near 99%.

The fallacy arises from confusing two different failure rates. The 'number of non-terrorists per 100 bells' and the 'number of non-bells per 100 terrorists' are unrelated quantities; there is no reason one should equal the other, and they need not even be roughly equal.

To show that they need not be equal, consider a camera that, when it sees a terrorist, rings the bell 20% of the time and fails to do so 80% of the time, while when it sees a non-terrorist, it works perfectly and never rings the bell. If this second camera rings, the chance that it rang at a non-terrorist is 0%; if it sees a terrorist, the chance that it fails to ring is 80%. So here 'non-terrorists per bell' is 0%, but 'non-bells per terrorist' is 80%.
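
To put the two quantities side by side, here is a minimal sketch in Python (the variable names are illustrative, not from any source) computing both rates for this hypothetical second camera:

<syntaxhighlight lang="python">
# Hypothetical second camera: rings for 20% of terrorists,
# never rings for non-terrorists.
terrorists = 100
non_terrorists = 999_900

bells_from_terrorists = terrorists * 0.20          # about 20 bells
bells_from_non_terrorists = non_terrorists * 0.0   # 0 bells

# 'Non-terrorists per bell': of all bells, the fraction that are false alarms.
non_terrorists_per_bell = bells_from_non_terrorists / (
    bells_from_terrorists + bells_from_non_terrorists)

# 'Non-bells per terrorist': the fraction of terrorists that go undetected.
non_bells_per_terrorist = 1 - 0.20

print(non_terrorists_per_bell)   # 0.0 -> 0%
print(non_bells_per_terrorist)   # 0.8 -> 80%
</syntaxhighlight>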

Now let's go back to our original camera, the one with 'bells per non-terrorist' of 1% and 'non-bells per terrorist' of 1%, and let's compute the 'non-terrorists per bell' rate.

Imagine that the city's entire population of one million people passes in front of the camera. About 99 of the 100 terrorists will trigger the alarm, and so will about 9,999 of the 999,900 non-terrorists. Therefore, about 10,098 people will trigger the alarm, of whom only about 99 will be terrorists. So the probability that a person triggering the alarm is actually a terrorist is only about 99 in 10,098, which is less than 1% and very far below the initial guess of 99%.
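
This head count is easy to check directly. A minimal sketch in Python, using only the figures given in the example (variable names are illustrative):

<syntaxhighlight lang="python">
# Population figures and error rates taken from the example above.
population = 1_000_000
terrorists = 100
non_terrorists = population - terrorists    # 999,900

false_negative_rate = 0.01   # terrorist seen, bell fails to ring
false_positive_rate = 0.01   # non-terrorist seen, bell rings anyway

# Expected number of alarms contributed by each group.
true_alarms = terrorists * (1 - false_negative_rate)   # about 99
false_alarms = non_terrorists * false_positive_rate    # about 9,999
total_alarms = true_alarms + false_alarms              # about 10,098

print(true_alarms / total_alarms)   # about 0.0098, i.e. just under 1%
</syntaxhighlight>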

The base rate fallacy leads to such a large error in this example only because there are far more non-terrorists than terrorists. If the city had about as many terrorists as non-terrorists, and the false-positive and false-negative rates were nearly equal, then the probability of misidentification would be about the same as the false-positive rate of the device. These special conditions do sometimes hold: for instance, about half the women undergoing a pregnancy test are actually pregnant, and some pregnancy tests give about the same rates of false positives and false negatives. In that case, the rate of false positives per positive test is nearly equal to the rate of false positives per non-pregnant woman. This is why the fallacy is so easy to fall into: it gives the correct answer in many common situations.
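
One can verify numerically that under these special conditions the naive answer nearly coincides with the correct one. A minimal sketch, assuming for illustration a prior of 0.5 and matching 5% error rates (both figures are assumptions, not data about any particular test):

<syntaxhighlight lang="python">
# Illustrative figures only: assume half of the women tested are pregnant
# and the test's false-positive and false-negative rates are both 5%.
prior = 0.5
hit_rate = 0.95              # P(positive | pregnant) = 1 - false-negative rate
false_positive_rate = 0.05   # P(positive | not pregnant)

p_positive = hit_rate * prior + false_positive_rate * (1 - prior)
p_pregnant_given_positive = hit_rate * prior / p_positive

print(p_pregnant_given_positive)   # 0.95 -- the naive '1 - error rate' answer
</syntaxhighlight>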

In many real-world situations, though, particularly problems like detecting criminals in a largely law-abiding population, the small proportion of targets in the large population makes the base rate fallacy very applicable. Even a very low false-positive rate will result in so many false alarms as to make such a system useless in practice.

==Mathematical formalism==
In the above example, where P(A|B) means the probability of A given B, the base rate fallacy is the incorrect assumption that:

<math>P(\mathrm{terrorist}|\mathrm{bell}) \overset{?}{=} P(\mathrm{bell}|\mathrm{terrorist}) = 99\%</math>

However, the correct expression uses [[Bayes' theorem]] to take into account the probabilities of both A and B, and is written as:

<math>P(\mathrm{terrorist}|\mathrm{bell}) = \frac{P(\mathrm{bell}|\mathrm{terrorist})\,P(\mathrm{terrorist})}{P(\mathrm{bell})}</math>

<math>= \frac{0.99 \cdot (100/1000000)}{(0.99 \cdot 100 + 0.01 \cdot 999900)/1000000} = \frac{99}{10098} = \frac{1}{102} \approx 1\%</math>

Thus, in the example the probability is overestimated by more than 100 times, owing to the failure to take into account the fact that there are about 10,000 times more non-terrorists than terrorists (in other words, the failure to take into account the prior probability of being a terrorist).
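
The same calculation can be phrased as a small reusable function. A minimal Python sketch (the function and argument names are illustrative) that reproduces the 1/102 result:

<syntaxhighlight lang="python">
def posterior(prior, p_evidence_given_h, p_evidence_given_not_h):
    """P(H|E) via Bayes' theorem, from the prior P(H) and both likelihoods."""
    p_evidence = (p_evidence_given_h * prior
                  + p_evidence_given_not_h * (1 - prior))
    return p_evidence_given_h * prior / p_evidence

# Terrorist example: prior 100/1,000,000, hit rate 99%, false-positive rate 1%.
print(posterior(0.0001, 0.99, 0.01))   # about 0.0098, i.e. 1/102
</syntaxhighlight>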

==Findings in psychology==
In experiments, people have been found to prefer individuating information over general information when the former is available.<ref>{{cite journal|last=Bar-Hillel|first=Maya|title=The base-rate fallacy in probability judgments|journal=Acta Psychologica|year=1980|volume=44|pages=211–233}}</ref><ref name="kv1"/><ref>{{cite book|last=Kahneman|first=Daniel|title=Judgment under uncertainty: Heuristics and biases|year=1985|pages=153–160|coauthors=Amos Tversky|editor=Daniel Kahneman, Paul Slovic & Amos Tversky (Eds.)|chapter=Evidential impact of base rates}}</ref>
 
In some experiments, students were asked to estimate the [[grade point average]]s (GPAs) of hypothetical students. When given relevant statistics about GPA distribution, students tended to ignore them if given descriptive information about the particular student, even if the new descriptive information was obviously of little or no relevance to school performance.<ref name="kv1"/> This finding has been used to argue that interviews are an unnecessary part of the [[college admissions]] process because interviewers are unable to pick successful candidates better than basic statistics.{{Who|date=March 2009}}

[[Psychologist]]s [[Daniel Kahneman]] and [[Amos Tversky]] attempted to explain this finding in terms of a [[heuristics in judgment and decision making|simple rule or "heuristic"]] called [[representativeness heuristic|representativeness]]. They argued that many judgements relating to likelihood, or to cause and effect, are based on how representative one thing is of another, or of a category.<ref name="kv1">{{cite journal|last=Kahneman|first=Daniel|coauthors=Amos Tversky|title=On the psychology of prediction|journal=Psychological Review|year=1973|volume=80|pages=237–251|doi=10.1037/h0034747}}</ref> [[Richard Nisbett]] has argued that some [[attributional bias]]es like the [[fundamental attribution error]] are instances of the base rate fallacy: people underutilize "consensus information" (the "base rate") about how others behaved in similar situations and instead prefer simpler [[dispositional attribution]]s.<ref>{{cite book|last=Nisbett|first=Richard E.|title=Cognition and social behavior|year=1976|coauthors=E. Borgida, R. Crandall & H. Reed|editor=J. S. Carroll & J. W. Payne (Eds.)|chapter=Popular induction: Information is not always informative|pages=227–236|volume=2}}</ref>

Kahneman considers base rate neglect to be a specific form of [[extension neglect]].<ref>{{cite book|last=Kahneman|first=Daniel|title=Choices, Values and Frames|year=2000|editor=Daniel Kahneman and Amos Tversky (Eds.)|chapter=Evaluation by moments, past and future}}</ref>

==See also==
* [[Bayesian probability]]
* [[Data dredging]]
* [[False positive paradox]]
* [[Inductive argument]]
* [[Misleading vividness]]
* [[Prosecutor's fallacy]]
* [[Representativeness heuristic]]

==Notes==
{{Reflist}}

==References==
* [http://dx.doi.org/10.1016/0001-6918(80)90046-3 Bar-Hillel, M. (1980). The base-rate fallacy in probability judgments. ''Acta Psychologica'', 44, 211–233.]
* Kahneman, D., & Tversky, A. (1973). On the psychology of prediction. ''Psychological Review'', 80, 237–251. ([http://faculty.babson.edu/krollag/org_site/soc_psych/kan_tver_pred.html summary here])
* Nisbett, R. E., Borgida, E., Crandall, R., & Reed, H. (1976). Popular induction: Information is not always informative. In J. S. Carroll & J. W. Payne (Eds.), ''Cognition and social behavior'', 2, 227–236.
   
 
==External links==
* [http://www.fallacyfiles.org/baserate.html The Base Rate Fallacy], The Fallacy Files
* [https://www.cia.gov/library/center-for-the-study-of-intelligence/csi-publications/books-and-monographs/psychology-of-intelligence-analysis/art15.html#ft145 Psychology of Intelligence Analysis: Base Rate Fallacy]

{{Relevance fallacies}}
 
   
[[Category:Behavioral finance]]
[[Category:Cognitive biases]]
[[Category:Probability]]
[[Category:Relevance fallacies]]
   
<!--
[[de:Prävalenzfehler]]
[[fr:Oubli de la fréquence de base]]
[[he:כשל הסתברות קודמת]]
[[pl:Zaniedbywanie miarodajności]]
-->
{{enWP|Base rate fallacy}}
 