Corrections and clarifications
There is an error in Figure 3.3, the tree diagram representing Eddy's (1982) medical problem. For the branch representing P(Negative | Cancer) the probability should be 0.208, NOT 0.028. Accordingly, the number at the rightmost end of the branch should be 0.00208 (that is, 0.01 x 0.208), NOT 0.00028.
This does not affect the answer to the problem stated in the text, which remains 0.077.
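To see why the answer is unaffected, here is a minimal sketch of the calculation. The value 0.208 comes from the correction above; the prior of 0.01 and the false-positive rate of 0.096 are the figures usually quoted for Eddy's problem and are assumptions here, since this post does not state them.

```python
# Assumed inputs: only p_neg_given_cancer is taken from the correction above.
p_cancer = 0.01               # assumed prior probability of cancer
p_neg_given_cancer = 0.208    # corrected branch probability
p_pos_given_cancer = 1 - p_neg_given_cancer
p_pos_given_healthy = 0.096   # assumed false-positive rate

# Joint probabilities at the ends of the two "positive" branches.
p_cancer_and_pos = p_cancer * p_pos_given_cancer
p_healthy_and_pos = (1 - p_cancer) * p_pos_given_healthy

# The corrected branch: it never enters the calculation below.
p_cancer_and_neg = p_cancer * p_neg_given_cancer

# Bayes' rule: P(Cancer | Positive).
posterior = p_cancer_and_pos / (p_cancer_and_pos + p_healthy_and_pos)
print(round(posterior, 3))
```

Under these assumptions the posterior depends only on the two positive-test branches, which is why an error on the negative branch leaves it unchanged.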
My thanks to Chloe Turner (one of my students) for pointing this out.
Sunday, 7 June 2009
To illustrate their argument they analyse the likelihood of encountering particular sequences of coin tosses. With a sequence of four coin tosses, there are 16 possible sequences that could occur (e.g. HHHH, HHHT, THHT). Suppose, though, that we are interested in the occurrences of two equally likely subsequences, HHH and HHT, within all the possible four-toss outcomes. We can analyse this by representing all 16 sequences in a probability tree diagram, such that HHHH is one branch, HHHT another branch, and so on.
Notice that the subsequence HHH occurs twice within one branch (HHHH), whereas this never happens for HHT. This result generalizes to much longer sequences of coin tosses. Hahn and Warren describe this as being like waiting for a bus: "For a long and frustrating period, there is no bus in sight, and then, all of a sudden, several arrive in immediate succession" (p.455). The upshot of this analysis is that you would need to wait longer (on average) to encounter HHH in a sequence of coin tosses than you would HHT. In a sequence of coin tosses, the expected wait time for HHH is 14 coin tosses, whereas for HHT it is just 8.
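The counting argument above can be checked with a short sketch. It enumerates the 16 four-toss sequences and then computes the expected wait times with the standard overlap rule for a fair coin (add 2^k for every k at which the first k symbols of the pattern match its last k symbols). The function names are my own, and this is an illustration rather than Hahn and Warren's own procedure.

```python
from itertools import product

def count_occurrences(sequence, pattern):
    """Count occurrences of pattern in sequence, overlaps included."""
    return sum(
        sequence.startswith(pattern, i)
        for i in range(len(sequence) - len(pattern) + 1)
    )

def expected_wait(pattern):
    """Expected number of fair-coin tosses until pattern first appears."""
    return sum(
        2 ** k
        for k in range(1, len(pattern) + 1)
        if pattern[:k] == pattern[-k:]
    )

all_sequences = ["".join(s) for s in product("HT", repeat=4)]

for pattern in ("HHH", "HHT"):
    total = sum(count_occurrences(s, pattern) for s in all_sequences)
    containing = sum(pattern in s for s in all_sequences)
    print(pattern, total, containing, expected_wait(pattern))
```

Both subsequences turn up four times across the 16 sequences, but the occurrences of HHH are packed into three sequences while those of HHT are spread over four. The overlap rule gives wait times of 14 and 8 tosses respectively.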
The authors also argue that this theoretical analysis maps quite well onto human experience. Limited lifespan and resources mean that random events we encounter are likely to be relatively short sequences, as compared to the long runs discussed in explanations of probability. Furthermore, short-term memory has a very limited capacity, so people can only hold a few events in mind. Hahn and Warren extend their analysis by reporting the results of a computer simulation designed to assess the probability that certain substrings will not occur within a given sequence of coin tosses. They specifically looked at the likelihood of the following substrings: HHHH, HTHH, HHHT, HHTT, and HTHT.
The substring with the highest probability of non-occurrence was HHHH, and the next most likely not to occur was HTHT. Likewise, when Hahn and Warren did the same analysis for substrings of length 6, the substrings most likely not to occur were HHHHHH and HTHTHT. Significantly, previous research has shown that naive participants tend to regard these sequences as less likely to have been generated by a random process than more varied or less regular strings.
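For short sequences the non-occurrence probabilities can be worked out exactly rather than simulated. The sketch below does this by exhaustive enumeration, which is a substitute for the authors' simulation and not a reproduction of it. The sequence length of 12 tosses is a hypothetical value chosen only to keep the enumeration small.

```python
from fractions import Fraction
from itertools import product

def prob_absent(pattern, n):
    """Exact probability that pattern never appears in n fair tosses."""
    absent = sum(
        pattern not in "".join(tosses)
        for tosses in product("HT", repeat=n)
    )
    return Fraction(absent, 2 ** n)

N_TOSSES = 12  # hypothetical sequence length, for illustration only

for pattern in ("HHHH", "HTHH", "HHHT", "HHTT", "HTHT"):
    print(pattern, float(prob_absent(pattern, N_TOSSES)))
```

Enumeration grows as 2^n, so this approach is only practical for sequences of a couple of dozen tosses at most; longer sequences would need simulation or a recurrence.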
In short, Hahn and Warren argue that people's misperceptions of chance are a rational response to the environments that they encounter (although they are still errors). The gambler's fallacy can be viewed in this light also. This fallacy occurs if (say) a gambler believes that a sequence of HHH means that T is more likely to occur on the next toss of the coin. However, Hahn and Warren's analysis shows that a gambler is likely to encounter the sequence HHHT before he or she encounters the sequence HHHH, even though if the gambler has just experienced HHH then the next outcome is equally likely to be H or T.
Hahn, U., and Warren, P.A. (2009). Perceptions of randomness: Why three heads are better than four. Psychological Review, 116 (2), 454-461.
Thursday, 4 June 2009
Hertwig et al. also noted that participants in probabilistic reasoning studies are typically university students. They conducted a telephone survey of 1000 adults in Switzerland, whose numbers had been randomly selected by computer. Each person was presented with four "pure" probability problems and two "everyday" probability problems. As expected, they found a higher level of performance on the pure problems. Moreover, on these problems higher levels of educational achievement were associated with higher levels of reasoning performance.
On the everyday problems, there was actually a tendency towards poorer performance by the most highly educated respondents. This appears to be inconsistent with previous research in which the SAT scores of the American participants were recorded.
Finally, people with some gambling experience also showed a slight performance advantage, as did male participants relative to women.
The study does appear to assume that education is a causal factor in producing correct probabilistic responses on the pure problems. However, the study does not actually distinguish between the cognitive ability of the respondents and their education; or to put it another way, those with higher cognitive ability are more likely to go on to higher levels of education in the first place. Likewise, does gambling experience confer a better understanding of probabilistic problems, or is it just that people with better understanding are more likely to gamble?
Hertwig, R., Zangerl, M.A., Biedert, E., and Margraf, J. (2008). The public's probabilistic numeracy: How tasks, education and exposure to games of chance shape it. Journal of Behavioral Decision Making, 21 (4), 457-470.
Tuesday, 2 June 2009
Equation 1: P(B ∧ A) = P(A) × P(B | A);
or, if A and B are independent, then
Equation 2: P(B ∧ A) = P(A) × P(B)
Chapter 3 focuses on one of the best-known explanations of the conjunction fallacy, which is the representativeness heuristic. Other explanations have been proposed, but the one thing that the various accounts have in common is the assumption that the existence of the fallacy means that people are not using probability theory to reason about conjunctions.
A paper by Costello (2009) now suggests that it is premature to rule out probability theory as a component of human thinking. Consider the famous Linda problem (described in Chapter 3) whereby people read a description of "Linda" when she was a student, and then rank the likelihood of statements about what Linda might be doing now. The three key statements are:
1. Linda is a bank teller.
2. Linda is a feminist.
3. Linda is a feminist and a bank teller.
People typically rank (3) above (1). However, according to Costello's argument the values that are entered into Equations 1 or 2 are affected by random variation, or noise. This is not surprising: people's beliefs about "Linda" are likely to be vague rather than precise, hence at the moment that people are asked to think about Linda the values of (1) and (2) are likely to be drawn from a range of values.
The existence of this noise in the probabilities that people hold about (1) and (2) means that people could use a process of thinking that implements the conjunction rule, yet produce a conclusion that would appear to be inconsistent with it. The conjunction fallacy is most likely to occur when the probability of one constituent is low and the probability of the other constituent is high, a prediction that is borne out by previous findings in the literature.
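The general idea can be illustrated with a small simulation. This is a sketch under my own assumptions and not Costello's actual model: each judgement is taken to be the true probability plus Gaussian noise, clipped to the range 0 to 1, the constituents are treated as independent so that Equation 2 applies, and the noise level of 0.1 is arbitrary. A "fallacy" is counted whenever the noisy conjunction comes out above a separate noisy reading of the first constituent.

```python
import random

def noisy(p, noise_sd, rng):
    """One noisy reading of probability p, clipped to the range [0, 1]."""
    return min(1.0, max(0.0, rng.gauss(p, noise_sd)))

def fallacy_rate(p_a, p_b, noise_sd=0.1, trials=20000, seed=1):
    """Share of trials in which the conjunction is rated above constituent A."""
    rng = random.Random(seed)
    fallacies = 0
    for _ in range(trials):
        # Separate noisy reading of A, used for the single-event judgement.
        single_a = noisy(p_a, noise_sd, rng)
        # Conjunction built with Equation 2 from fresh noisy readings.
        conjunction = noisy(p_a, noise_sd, rng) * noisy(p_b, noise_sd, rng)
        if conjunction > single_a:
            fallacies += 1
    return fallacies / trials

print(fallacy_rate(0.1, 0.9))  # one unlikely and one likely constituent
print(fallacy_rate(0.5, 0.5))  # two middling constituents
```

In this sketch the conjunction rule is applied on every trial, so any apparent violations come entirely from the noise: with noise_sd set to 0 the rate is zero. The rate is noticeably higher for the low/high pairing than for two middling constituents.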
Costello also suggests that the variation around the probability values for uncertain events will be reduced when people are explicitly asked to estimate probabilities for those events, as compared to the procedure in the classic conjunction studies where people are asked to rank order events. Thus, asking for probability estimates should reduce the conjunction fallacy, a result that has also been found in previous research.
Another prediction is that the conjunction fallacy should be reduced when people are asked to estimate the probabilities of the constituents before estimating the probability of the conjunction of those constituents (as opposed to estimating the probability of the conjunction first). This is because first estimating the probabilities of A and B is likely to reduce the extent to which they vary when the probability of the conjunction is being estimated. On the other hand, variability is less constrained when estimating the probability of the conjunction first.
Costello, F.J. (2009). How probability theory explains the conjunction fallacy. Journal of Behavioral Decision Making, 22, 213-234.
Saturday, 14 March 2009
Chapman, G.B., and Liu, J. (2009). Numeracy, frequency, and Bayesian reasoning. Judgment and Decision Making, 4 (1), 34-40.
Lipkus, I.M., Samsa, G., and Rimer, B.K. (2001). General performance on a numeracy scale among highly educated samples. Medical Decision Making, 21, 37-44.