CONDITIONAL PROBABILITY

Definition: For events A and B in some probability space with P(B) > 0, we define P(A|B) as P(A cap B)/P(B). [In Nosal's book, the notation is not standard; there it is P(A/B).] This is read as "the conditional probability of A given B."
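To make the definition concrete, here is a minimal Python sketch on a finite uniform sample space (one roll of a fair die); the particular events A, B, and C are illustrative choices, not taken from the text. It also previews the rescaling properties discussed next.

```python
from fractions import Fraction

S = {1, 2, 3, 4, 5, 6}   # sample space: one roll of a fair die
B = {2, 4, 6}            # B: the roll is even
A = {4, 5, 6}            # A: the roll is at least 4
C = {1, 3}               # C: an event disjoint from B

def prob(event):
    """P(E) under the uniform measure on S."""
    return Fraction(len(event), len(S))

def cond_prob(a, b):
    """P(a|b) = P(a cap b)/P(b); requires P(b) > 0."""
    assert prob(b) > 0
    return prob(a & b) / prob(b)

print(cond_prob(A, B))   # (2/6)/(3/6) = 2/3
print(cond_prob(B, B))   # 1: conditioning on B makes B certain
print(cond_prob(C, B))   # 0: C is disjoint from B
print(cond_prob(A, S))   # 1/2 = P(A): conditioning on S adds nothing
```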
The idea behind this definition is that one knows that B has happened, and thus the sample space has shrunk to B. All probabilities are rescaled by dividing by P(B), so that P(B|B)=1. Note that P(C|B)=0 for events C which are disjoint from B (in the same sample space). If B is the sample space S, no new knowledge has been introduced: P(A|S)=P(A) for all events A. The idea behind this conditional probability definition can be expressed as a theorem:
Theorem 1: Let B be an event for a probability space with P(B) > 0. For all events A in this probability space, let P'(A)=P(A|B). Then, for the same sample space and the same set of events, P' forms a probability space.

An important statement follows immediately from the definition of conditional probability: when P(B) > 0, P(A cap B)=P(A|B)P(B). [Just multiply both sides of the defining equation by P(B).]
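As a sanity check (an illustration, not a proof of Theorem 1), the following Python sketch enumerates every event of a small finite space and verifies that P' satisfies the probability axioms, together with the multiplication rule just stated; the die-roll space and the choice of B are assumptions made for the example.

```python
from fractions import Fraction
from itertools import combinations

S = frozenset({1, 2, 3, 4, 5, 6})
B = frozenset({2, 4, 6})

def prob(event):
    return Fraction(len(event), len(S))

def p_prime(event):
    """P'(A) = P(A|B) = P(A cap B)/P(B)."""
    return prob(event & B) / prob(B)

# Enumerate every event (every subset) of S.
events = [frozenset(c) for r in range(len(S) + 1)
                       for c in combinations(sorted(S), r)]

assert p_prime(S) == 1                          # P'(S) = 1
assert all(p_prime(A) >= 0 for A in events)     # nonnegativity
for A1 in events:                               # finite additivity
    for A2 in events:
        if not (A1 & A2):
            assert p_prime(A1 | A2) == p_prime(A1) + p_prime(A2)

# The multiplication rule P(A cap B) = P(A|B)P(B) for a sample event A:
A = frozenset({4, 5, 6})
assert prob(A & B) == p_prime(A) * prob(B)
print("Theorem 1 and the multiplication rule check out on this space")
```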
This can be generalized to multiple intersections:

Theorem 2: Let A_1, A_2, ..., A_n be events, where n is a positive integer. If the intersection of the first n-1 of the A_i's has positive probability, then

P(A_1 cap A_2 cap ... cap A_n) = P(A_1) P(A_2|A_1) P(A_3|A_1 cap A_2) ... P(A_n|A_1 cap ... cap A_{n-1}),

that is, the product over j from 1 to n of P(A_j | intersection of the A_i's for i < j). Here the empty intersection (the case j=1) is interpreted as the sample space S, so that the first factor is P(A_1|S)=P(A_1); the intersection of a single set (the case j=2) is interpreted as the set itself, so that the second factor is P(A_2|A_1).
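For instance (the card-drawing setup is an assumed illustration, not from the text), take n=3 and let A_j be the event that the j-th card drawn without replacement from a standard 52-card deck is an ace. Each factor P(A_j | A_1 cap ... cap A_{j-1}) can be written down directly, and the chain-rule product agrees with a direct count:

```python
from fractions import Fraction
from math import comb

# Chain-rule factors P(A_1), P(A_2|A_1), P(A_3|A_1 cap A_2):
# after j-1 aces are drawn, 4-(j-1) aces remain among 52-(j-1) cards.
factors = [Fraction(4 - k, 52 - k) for k in range(3)]
chain_rule = factors[0] * factors[1] * factors[2]

# Direct count for comparison: hands consisting of 3 of the 4 aces,
# over all 3-card hands.
direct = Fraction(comb(4, 3), comb(52, 3))

assert chain_rule == direct
print(chain_rule)   # 1/5525
```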
Conditional probabilities are central to two well-known and much-used formulas: the total probability formula and Bayes' formula.