A colleague brought this intriguing probability question to my attention. As the comments following the article show, its answer is controversial. I believe the reason is that there is more than one way to model the question in a probabilistic framework.
“I have two children. (At least) one is a boy born on a Tuesday. What is the probability I have two boys?”
The puzzle teller then asks, “What has Tuesday got to do with it?”
(We can assume that the probability of a baby being born a male is 1/2.)
Being pragmatic, to answer the above question, we need to think of a way of (at least conceptually) repeating the “experiment” many times, so that we can deduce the long-run proportion of repetitions in which there are two boys.
Model A: Look at all families in this world who have two children, (at least) one of them a boy born on a Tuesday. How many of these families have two boys?
Model B: Look at all families in this world who have two children. Pick one of the children uniformly at random and state the sex and weekday of birth of that child. How many of these families will have both children of the same sex?
Model C: Look at all families in this world who have two children. Pick one of the children uniformly at random and state the sex and weekday of birth of that child. How many of these families will have two boys?
Model D: Look at all families in this world who have two children. Pick one of the children uniformly at random and state the sex and weekday of birth of that child. Out of only those families who stated they have a boy, how many will have two boys?
Model E: Look at all families in this world who have two children, (at least) one a boy. State the weekday of birth of the boy (or if there are two boys, pick one at random and state his weekday of birth). How many of these families will have two boys?
In Models B to D, the term “uniformly at random” is intended to mean that there is equal chance of picking either child. It should be compared with other choices, such as “pick the younger child” or “pick the boy if there is a boy”.
The puzzle teller is actually interested in Model A, and indeed, the answer is an interesting one in this case, as elaborated on below. The controversy over this puzzle arises because people, wittingly or otherwise, consider other models, especially Model E above. Indeed, we need to ask for clarification from the puzzle teller on how and why he chose to tell us that one of his children was a son born on a Tuesday before we can solve this puzzle without making any additional assumptions. (That said, the puzzle teller would presumably argue that Model A is the most sensible and therefore the default interpretation of the question. Note though that by being asked what the relevance of Tuesday is, the audience will start to think along the lines of Model E, thus fueling the controversy.)
In Model B, whether or not we learn the sex and weekday of birth of one of the children is irrelevant to the question being asked; one half of all two-children families will have both children of the same sex. Similarly, in Model C, the answer is 1/4 because we are considering the proportion of all families. In Model D, all families with two girls are discarded and there is 50% chance that families with one boy and one girl are discarded, while all families with two boys remain. Therefore, 1/2 of all remaining families will have two boys; it is irrelevant to the question being asked that we are told on which day of the week the boy was born. The answer to Model E is 1/3; learning the weekday of birth is again irrelevant to the question we have been asked.
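These four answers can be checked by exact enumeration. What follows is only a minimal sketch: it assumes sex and weekday of birth are uniform and independent for each child, and the names `random_child`, `random_boy` and `conditional` are made up for illustration.

```python
from fractions import Fraction
from itertools import product

TUESDAY = 1  # arbitrary label; weekdays are just the numbers 0..6 here
CHILDREN = list(product(("boy", "girl"), range(7)))  # 14 equally likely children
FAMILIES = list(product(CHILDREN, repeat=2))         # 196 equally likely families

# Models B to D: either child is announced with probability 1/2.
# Each entry is (weight, (family, announced_child)).
random_child = [(Fraction(1, 2), (fam, child)) for fam in FAMILIES for child in fam]

# Model E: only families with a boy; one of the boys is announced uniformly at random.
random_boy = []
for fam in FAMILIES:
    boys = [child for child in fam if child[0] == "boy"]
    random_boy += [(Fraction(1, len(boys)), (fam, boy)) for boy in boys]


def conditional(outcomes, event, given=lambda o: True):
    """P(event | given) for a list of (weight, outcome) pairs."""
    total = sum(w for w, o in outcomes if given(o))
    hits = sum(w for w, o in outcomes if given(o) and event(o))
    return hits / total


def two_boys(o):
    return all(sex == "boy" for sex, _ in o[0])


def same_sex(o):
    return o[0][0][0] == o[0][1][0]


def boy_announced(o):
    return o[1][0] == "boy"


def tuesday_boy_announced(o):
    return o[1] == ("boy", TUESDAY)


model_b = conditional(random_child, same_sex)
model_c = conditional(random_child, two_boys)
model_d = conditional(random_child, two_boys, boy_announced)
model_d_tuesday = conditional(random_child, two_boys, tuesday_boy_announced)
model_e = conditional(random_boy, two_boys)
model_e_tuesday = conditional(random_boy, two_boys, tuesday_boy_announced)
print(model_b, model_c, model_d, model_d_tuesday, model_e, model_e_tuesday)
```

Under these assumptions the script prints 1/2 1/4 1/2 1/2 1/3 1/3. The two `_tuesday` variants condition on the announced child being a Tuesday-born boy, and the answers for Models D and E do not move.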
The difference between Models A and E is that in Model A, we have discarded all families who do not have a son born on a Tuesday, whereas in Model E, we merely learn of the weekday of birth of the son. The distinction is crucial! Would we always be told the same information, or would the information we are told change from experiment to experiment?
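One way to feel the difference is to repeat the experiment many times on a computer. The sketch below is an illustration under assumed conditions (each child is independently a boy or a girl with probability 1/2 and born on a uniformly random weekday); the function name `simulate` and the weekday encoding are hypothetical choices, not part of the puzzle.

```python
import random

TUESDAY = 1  # arbitrary label among weekdays 0..6


def simulate(trials=200_000, seed=0):
    """Return the estimated share of two-boy families under Model A and Model E."""
    rng = random.Random(seed)
    a_kept = a_two = 0  # Model A: keep only families that have a Tuesday-born boy
    e_kept = e_two = 0  # Model E: a random boy's weekday is announced; keep the
                        # repetitions in which the announcement happens to be Tuesday
    for _ in range(trials):
        family = [(rng.choice("BG"), rng.randrange(7)) for _ in range(2)]
        boy_days = [day for sex, day in family if sex == "B"]
        if not boy_days:
            continue  # both models only concern families with a boy
        both = len(boy_days) == 2
        if TUESDAY in boy_days:
            a_kept += 1
            a_two += both
        announced = rng.choice(boy_days)
        if announced == TUESDAY:
            e_kept += 1
            e_two += both
    return a_two / a_kept, e_two / e_kept


model_a_estimate, model_e_estimate = simulate()
print(model_a_estimate, model_e_estimate)
```

With enough trials the first estimate should settle near 13/27 (about 0.48) and the second near 1/3, even though in both cases the audience hears "a boy born on a Tuesday".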
The Answer to Model A
The fail-safe method for answering any probability question concerning only a finite number of outcomes is to enumerate all the outcomes. It may not be illuminating on its own but it is rigorous. The first child could be a boy or a girl (2 possibilities) and it could have been born on any day of the week (7 possibilities). There are therefore 14 possible outcomes. Same for the second child, leading to a total of 14 times 14 equally likely outcomes (but we don’t need to know this number). What information does knowing that “one of the children is a boy born on Tuesday” tell us? Well, it fixes precisely the outcome of either the first child or the second child. To avoid double counting, we need a little care; we consider the three non-overlapping cases:
- First child is a boy born on a Tuesday, second child is not a boy born on a Tuesday
- First child is not a boy born on a Tuesday, second child is a boy born on a Tuesday
- Both children are boys born on a Tuesday
There are 13 possibilities for the first case with 6 being that the second child is a boy; 13 possibilities for the second case with 6 being that the first child is a boy; and only 1 possibility for the third case. So the probability that both children are boys is (6+6+1) / (13+13+1) = 13/27.
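The enumeration is small enough to hand to a computer. This is a minimal sketch assuming all 196 (sex, weekday) combinations for the two children are equally likely; the variable names are illustrative only.

```python
from fractions import Fraction
from itertools import product

DAYS = ("Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun")
TUESDAY_BOY = ("boy", "Tue")

children = list(product(("boy", "girl"), DAYS))  # 14 outcomes per child
families = list(product(children, repeat=2))     # 14 * 14 = 196 outcomes

# Model A: keep only the families with at least one boy born on a Tuesday.
kept = [fam for fam in families if TUESDAY_BOY in fam]
kept_two_boys = [fam for fam in kept if fam[0][0] == "boy" and fam[1][0] == "boy"]

answer = Fraction(len(kept_two_boys), len(kept))
print(len(kept_two_boys), len(kept), answer)
```

This prints 13 27 13/27, matching the count by cases.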
In Model A, we are filtering out families that do not meet the criteria of i) at least one boy; and ii) at least one boy born on a Tuesday. What is the difference between filtering out just on (i) versus filtering out on both (i) and (ii)? The key point is that (ii) will filter out more families with just one boy than it will filter out families with two boys. For every family with one and only one boy, we will discard 6 out of 7 of them (equivalently, 42 out of 49) because the boy will fail to be born on a Tuesday. For families with two boys though, they have a greater chance at passing the Tuesday test because, roughly speaking, they get two goes at it! We will discard only 36 out of 49 such families, namely those in which both boys miss Tuesday (6/7 times 6/7 = 36/49).
Consider a pool of families having at least one boy. Two thirds of them will have exactly one boy, so suppose that 98 families have exactly one boy and 49 have two boys. From the 98, we discard 42 out of every 49, that is, we discard 84. So by discarding families not having a son born on a Tuesday, we are left with 14 families having exactly one boy. From the two-boy families, we discard 36 out of 49, so we are left with 13 two-boy families. Therefore, 13 out of the 27 families having at least one boy born on a Tuesday will have two boys.
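The same pool arithmetic can be written as a small function of the chance that a single boy passes the filter. This is a sketch under the assumption that the two boys in a family pass independently (so twins are ignored); `two_boy_share` is a name invented here.

```python
from fractions import Fraction


def two_boy_share(p):
    """Share of two-boy families among the survivors of a filter that each boy
    passes independently with probability p, starting from a pool of 98
    one-boy families and 49 two-boy families."""
    one_boy_left = 98 * p                     # the only boy must pass
    two_boy_left = 49 * (1 - (1 - p) ** 2)    # at least one of the two boys passes
    return two_boy_left / (one_boy_left + two_boy_left)


tuesday = two_boy_share(Fraction(1, 7))   # filter on (i) and (ii)
no_filter = two_boy_share(Fraction(1))    # filter on (i) only
print(tuesday, no_filter)
```

This prints 13/27 1/3: with no weekday filter the share of two-boy families is 1/3, and the Tuesday filter raises it to 13/27 because two-boy families survive it more often.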
Probability (and mathematics in general) sometimes gives counter-intuitive results. We choose to accept these results because they are founded on axioms and logic which experience has shown to be extremely useful and conceptually very intuitive. The trick then is to improve our intuition based on such examples so we are better prepared in the future.
Another moral is that English is imprecise. The puzzle is controversial because it can be interpreted in different ways. In particular, I speculate that although we feel that we understand the question, our subconscious mind doesn’t quite know which way is best to understand it, so it sporadically jumps from one interpretation to another the more we think about it, just as it does in the young woman / old woman optical illusion, where the mind sporadically jumps between seeing a young woman and seeing an old woman. (Rather than say “I have two children…”, the puzzle teller should have said, “Pick at random a two-child family that has a son born on a Tuesday.” It is not as catchy this way, though…)