Introduction:
In the popular press, much attention has been given to a class of “conditional probability” problems that appear in various forms but are essentially of the same nature. Examples are the “Game Show” problem, the “Warden and the Prisoner” problem, the “Woman with 2 children” problem, etc. In this paper we attempt to explain the “Woman with 2 children” problem and its solution in layman’s terms.
Although a small amount of
mathematics is required, every effort has been made to keep it simple.
The Problem and A Solution
A woman has two children; at least one is a boy. What is the probability that she has two boys? This last sentence shall be referred to as question, “Q”.
In
order to answer this question, let us first discuss what we mean by
“Probability of an event, A”. A
definition which is appealing to those who wish to apply it to practical
problems is the so-called “relative frequency” definition. It is as follows:
If an experiment is repeated
a large number of times n, and the event A occurs m of these times, then we consider the ratio m/n as an approximation to the probability of A occurring, and we define
the probability of A, denoted P[A], as the number approached by m/n as n gets large without limit. Mathematically
P[A] = the limit as n goes to
infinity of (m/n).
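As a rough illustration of the relative frequency definition, the following Python sketch repeats a simulated experiment n times and prints the ratio m/n for increasing n. The function name `relative_frequency`, the fixed seed, and the choice of a fair coin (event probability 0.5) as the experiment are assumptions made only for this illustration; they are not part of the definition itself.

```python
import random

def relative_frequency(n, p_event=0.5, seed=0):
    """Return m/n, where m counts how often the event occurs in n simulated trials.

    p_event is the (assumed) true probability of the event; 0.5 models a fair coin.
    """
    rng = random.Random(seed)  # fixed seed so that runs are repeatable
    m = sum(1 for _ in range(n) if rng.random() < p_event)
    return m / n

if __name__ == "__main__":
    for n in (10, 100, 1_000, 10_000, 100_000):
        print(n, relative_frequency(n))
```

For small n the ratio can sit well away from the assumed probability; it is only as n grows that m/n tends to settle near it.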
Thus, for a large number of repetitions of the experiment, the event A occurs in approximately the proportion m/n of the trials. For example, most would agree
that when we say that the probability of a head on a fair coin is ½, we mean
that in a large number of identically repeated trials, the proportion of heads
should be around ½. Mathematically this
definition is inadequate. However, it
is important for us to realize that this definition furnishes us a guide for
modeling physical problems with probability models. Thus, in the question above, we need to know what experiment we
are conducting, i.e., what is the population of possible outcomes and how does
the random selection within the population take place? Are we considering the population of women
with 2 children where at least one is a boy, and selecting one purely
randomly? For example, if each of the
women in this population were given a number and then one of those numbers was
selected at random, then assuming BB, BG,
GB (BG, GB indicating order of birth) are all equally likely, it is obvious
that P[BB]= 1/3. On the other hand, if the population we wish
to consider is the population of women with 2 children and we are told at
least one is a boy, then the answer to the question “What is the probability
she has 2 boys?” depends totally on the process by which one is “told”. That is, within the population of women with
2 children, some have 2 girls. Since
our population is all women with 2 children, and our experiment is randomly
drawing one of these women from this population, then there must be some
possibility that we would draw a woman with 2 girls and hence would be told “At
least one is a girl. What is the
probability she has two girls?”
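The 1/3 figure for the restricted population can be checked by direct counting. The sketch below assumes, as the text does, that the birth orders are equally likely; the names `OUTCOMES` and `prob` are illustrative only.

```python
from fractions import Fraction

# The four birth orders, assumed equally likely (first letter = first born).
OUTCOMES = ["BB", "BG", "GB", "GG"]

def prob(event, given=lambda o: True):
    """P[event | given], obtained by counting equally likely outcomes."""
    pool = [o for o in OUTCOMES if given(o)]
    hits = [o for o in pool if event(o)]
    return Fraction(len(hits), len(pool))

# Population restricted to women with at least one boy, one drawn at random.
p_two_boys = prob(lambda o: o == "BB", given=lambda o: "B" in o)
print(p_two_boys)  # 1/3
```

Note that this counting applies only when the woman is drawn from the restricted population; it says nothing yet about the case where one is merely "told" that at least one child is a boy.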
Now
what is the scenario surrounding question “Q”?
Note that the question is preceded by two statements, an initial
fact, “A woman has 2 children” and a
conditional statement “at least one is a boy”.
The first statement, without added qualification, must refer to the
population of women with 2 children.
That is, we must assume the statement means that the woman is simply
randomly chosen from the population of women with 2 children. The conditional statement “at least one is
a boy” must be the result of some action.
To
clarify the scenario, let’s represent each woman in the population by a chip
that has written on it the sex of her children, i.e., BB, BG, GB, or GG. Assume all outcomes are equally likely. Now suppose a chip
is drawn at random from the population and one of the following actions is
taken.
Action 1) Person looks at the chip and reports at least one is an “X”, where “X” is a boy if both are boys, “X” is a girl if both are girls, and “X” is equally likely to be a boy as a girl if there is one of each.
Action 2) Person looks at the chip and says at least one is a boy, unless both are girls. If both are girls, “at least one is a girl” is reported.
Action 3) Person looks at the chip. If both are girls, the chip is returned and another draw is taken. This is continued until a chip with at least one boy is drawn. When such a chip is drawn, the person reports at least one is a boy.
The
third action is inconsistent with question “Q”. That is, the third action is equivalent
to
requiring the initial statement to refer to the collection of women with
2 children, at least one of which is a boy.
Therefore, we reject action three as inconsistent with question “Q”.
In the case of action two, the population is indeed women with 2 children. However, there is a clear prejudice toward reporting “at least one is a boy”. In general, one could modify this action to the following action: if one gets a BG or a GB chip, one reports “at least one is a boy” or “at least one is a girl” by some randomization method.
In scenario two, the randomization is the case p = 1, q = 0, where p is the probability you’re told “at least one is a boy” when GB or BG occurs and q is the probability you’re told “at least one is a girl” when GB or BG occurs. However, any p and any q are possible so long as 0 ≤ p ≤ 1, 0 ≤ q ≤ 1 and p + q = 1.
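To see how the answer to “Q” depends on the randomization, the following sketch computes P[2 boys | told at least 1 is a boy] exactly for a chosen p. It assumes equally likely chips and a teller who never lies; the function name `p_two_boys_given_told_boy` is hypothetical.

```python
from fractions import Fraction

def p_two_boys_given_told_boy(p):
    """Exact P[BB | told "at least one is a boy"].

    p is the probability of reporting "boy" when the chip is BG or GB
    (q = 1 - p is then the probability of reporting "girl").
    """
    p = Fraction(p)
    if not 0 <= p <= 1:
        raise ValueError("p must lie between 0 and 1")
    chip = Fraction(1, 4)                             # each chip equally likely
    told_boy = {"BB": 1, "BG": p, "GB": p, "GG": 0}   # P[told boy | chip]
    total = sum(chip * t for t in told_boy.values())  # P[told boy]
    return chip * told_boy["BB"] / total

for p in (Fraction(1), Fraction(1, 2), Fraction(1, 4)):
    print(p, p_two_boys_given_told_boy(p))
```

With p = 1 the result is 1/3, with p = 1/2 it is 1/2, and with p = 1/4 it is 2/3, so the answer varies with the reporting rule rather than being fixed by the chips alone.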
Therefore,
“Q” cannot be answered unless the additional information defining the
randomization procedure is given. Since
no such information is forthcoming, we necessarily apply the principle of “equi-ignorance”. That is, given no
information to the contrary, we assume we are equally likely to be told “at
least one is a boy” as we are to be told that “at least one is a girl” in the
outcomes BG and GB, i.e., p = q =1/2 in
the randomization. Thus, question “Q”
is, without additional information, necessarily conditioned by action one.
Let us now briefly discuss some basic properties of the
probability function, P. Earlier we discussed P[A],
the probability of an event, A, as
relative frequency, and we mentioned that although such a definition is
problematic mathematically, it is physically appealing. Fortunately P can be defined in such a way as to give it the mathematical
structure it needs and yet remain consistent with our relative frequency definition. Actually, we will only
need a few basic properties of the probability function. The first is the relationship between the
probability of an event A and the conditional probability of an
event A given an event B. That is, if P[B] > 0,
P[A | B] = P[A and B] / P[B] = P[B | A] P[A] / P[B].     (1)
Note
that P[B | A] is read “The
probability of B given A”.
The
clear temptation is to interpret question “Q” as “what is the probability of 2
boys, given at least one boy?” i.e.,
P[ 2 boys | at least 1 boy] = ? (2)
In
fact, most beginning students of probability would probably view this as the
question. However, interpreting this as
the question is equivalent to interpreting question “Q” as relating to a random
selection from the population of women with 2 children, at least one of which is a
boy. We have already argued that if
this is the question, it should have been stated clearly from the start. Thus, without additional information, we
reject (2) as the proper interpretation of question “Q”.
Let us now return to what we believe is a proper interpretation of question “Q”, i.e.,
P[2 boys | told at least 1 is a boy] = ?
From (1),
P[2 boys | told at least 1 is a boy] = P[told at least 1 is a boy | 2 boys] P[2 boys] / P[told at least 1 is a boy].
Clearly
P[told at least 1 is a boy | 2 boys] = 1
and
P[2 boys] = (1/2)(1/2) = 1/4,
where
we have assumed that girls and boys are equally likely and that births are
independent events. Under the assumptions that the probability of a boy is
½ and that births are independent
events there is no question that P[ 2
boys ] = 1/4 regardless of how we interpret the conditional statement. Therefore any ambiguity in the answer
results from the evaluation of P[ told at least 1 is a boy].
Now, continuing the notation BB, BG, GB to mean, respectively, that both are boys, that the first born is a boy and the 2nd born is a girl, and that the first born is a girl and the 2nd born is a boy, we have
P[told at least 1 is a boy] = P[BB and told at least 1 is a boy] + P[BG or GB and told at least 1 is a boy] + P[GG and told at least 1 is a boy].
Assuming that the “teller” does not lie, P[GG and told at least 1 is a boy] = 0.
So
P[told at least 1 is a boy] = P[BB and told at least 1 is a boy] + P[BG or GB and told at least 1 is a boy].
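But
P[BB and told at least 1 is a boy] = P[told at least 1 is a boy | BB] P[BB] = 1/4
and
P[BG or GB and told at least 1 is a boy] = P[told at least 1 is a boy | BG or GB] P[BG or GB] = (1/2) P[told at least 1 is a boy | BG or GB],
since P[told at least 1 is a boy | BB] = 1 and P[BG or GB] = 1/2.
Therefore
P[told at least 1 is a boy] = 1/4 + (1/2) P[told at least 1 is a boy | BG or GB].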
So
the only question remaining is, “What is P[
told at least 1 is a boy| BG or GB]?”.
Action 2 says that this probability is 1. Actually, as we have argued earlier, Action 2 could have an uncountable number of solutions depending on the randomization procedure. It should now be clear that if we modified Action 2 to this more general scenario, i.e., randomizing the response when the outcome is BG or GB, then Action 1 is the special case where the response is equally likely to be “at least one is a boy” as it is “at least one is a girl”.
Moreover, it should also now be clear that, without being given
additional information, the only reasonable approach is to apply the
equi-ignorance principle. That is,
assume the responses are equally likely.
Now, under Action 1,
P[told at least 1 is a boy |
BG or GB] = 1/2 .
So,
P[told at least 1 is a boy]
= (1/4) + (1/2)(1/2) = 1/2 .
Therefore
P[ 2 boys | told at least 1
is a boy] = (1/4)/(1/2) = 1/2 is the correct answer to Q.
Now recall that we stated earlier that the “naïve solution” to the problem is
P[2 boys | at least 1 boy] = 1/3.
However, there is no reasonable action where this is the correct answer unless the problem is restated with additional information.
The question under Action 1 can be simulated on the computer to obtain repeated trials. The table below shows the relative frequency estimate of P[2 boys | told at least one is a boy] for n = 100, 200, 500, 1000.
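A simulation of this kind might be sketched in Python as follows. The sketch assumes the chip model described earlier (BB, BG, GB, GG equally likely) and the Action 1 reporting rule. It further assumes that n counts every chip drawn, with the estimate formed only from those draws on which the report was “at least one is a boy”. The function name `simulate_action_1` and the seed are illustrative, and the printed estimates will vary with the seed.

```python
import random

def simulate_action_1(n, seed=0):
    """Estimate P[2 boys | told at least one is a boy] from n random chips.

    Assumption: n counts all chips drawn; the estimate uses only those
    draws on which the report was "at least one is a boy".
    """
    rng = random.Random(seed)
    told_boy = 0
    two_boys_and_told_boy = 0
    for _ in range(n):
        chip = rng.choice(["BB", "BG", "GB", "GG"])  # equally likely chips
        if chip == "BB":
            report = "B"
        elif chip == "GG":
            report = "G"
        else:
            report = rng.choice("BG")  # Action 1: boy or girl, equally likely
        if report == "B":
            told_boy += 1
            if chip == "BB":
                two_boys_and_told_boy += 1
    if told_boy == 0:
        return None  # no "boy" reports occurred in this run
    return two_boys_and_told_boy / told_boy

for n in (100, 200, 500, 1000):
    print(n, simulate_action_1(n))
```

Because only about half of the draws produce a “boy” report, the estimates for small n rest on few trials and can fluctuate noticeably around 1/2.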
In this paper we have attempted to explain what seems to be the only reasonable solution to the “Woman has two children” problem. The confusion appears to come from the way that the problem is stated, i.e., “A woman has two children; at least one is a boy. What is the probability that she has two boys?” That is, should the statement “at least one is a boy” be interpreted as the mathematical statement “given at least one is a boy”, or should we interpret it to mean that we are “told” at least one is a boy? If it is the former, then it behooves us to explain what physical problem we are solving. In any case, the opening statement “A woman has two children” appears to establish the original sample space in such a manner that, if the problem really is “given at least one is a boy”, then further information must be included in the statement of the problem. In short, it appears that the only reasonable interpretation of the problem as stated is to accept the statement as “told” at least one is a boy. If that’s the case then, again given no further information, we must assume that one was equally likely to be told “at least one is a girl” if there was one girl and one boy, and that one is told “at least one is a girl” if there were two girls.