Supplementary Exercises 4.107 and 4.108 of IPS7e
------------------------------------------------

Data on earned degrees in the US in the 2005-2006 academic year, divided
by sex and type of degree. We will work with data represented by such
tables in more detail in Session 8.

4.107:
------

(a) The sample space is the set of all 8 combinations of sex and degree
type:

    {(f,bach), (f,mast), (f,prof), (f,doct), (m,bach), ..., (m,doct)}.

We can also represent the sex and degree type by two random variables,
say S and D, and express the outcomes in terms of these, e.g.
(S=f,D=bach) and (S=m,D=doct) for the first and last outcomes.

We estimate the probabilities from the data, that is, we work with the
empirical distribution, and compute:

    P(S=f) = 1119/1944 = 0.576, or 57.6%.

(b) When conditioning on a professional degree (i.e., D=prof), we only
consider the data in the column for professional degrees. Hence we
compute:

    P(S=f|D=prof) = 39/83 = 0.470, or 47.0%.

(c) We have two ways of assessing independence, using either the
multiplication rule (slide 3L-6) or the conditional probabilities
(slide 3L-9). Because we have already computed the conditional
probability of S=f given D=prof, the latter approach is more direct
here.

The definition of independence involving conditional probability says
that for two events to be independent, the conditional probability of
one given the other must equal the unconditional probability. We just
saw that the conditional probability P(S=f|D=prof) = 0.470 is *not*
equal to the unconditional probability P(S=f) = 0.576, and they don't
even seem close. Therefore we can conclude that the two events are not
independent.

In order to use the multiplication rule we would further need to
compute:

    P(D=prof) = 83/1944 = 0.043, or 4.3%
    P(S=f and D=prof) = 39/1944 = 0.020, or 2.0%.

We now need to check whether the latter probability is equal to the
product of P(S=f) and P(D=prof).
We have 0.576*0.043 = 0.025, and this value is *not* equal to 0.020, so
again we conclude that the events are not independent.

4.108:
------

(a) In a similar way as above, we get

    P(S=m) = 825/1944 = 0.424, or 42.4%.

Alternatively, we could have used the complement rule to get this value
(try to do this calculation if you're not sure how!).

(b) Here too we use the same approach as above:

    P(D=bach|S=m) = 559/825 = 0.678, or 67.8%.

(c) The question mentions the multiplication rule, but in the way we've
discussed it so far that rule is for independent events, and we don't
know whether the two events are independent. We could use a similar
calculation as above to establish that they are in fact not.

The intention in the question is to use the multiplication rule for
dependent events, which comes from the definition of conditional
probability (slide 3L-9):

    P(A and B) = P(A|B)*P(B) = P(B|A)*P(A)

Using this rule we calculate:

    P(S=m and D=bach) = P(D=bach|S=m)*P(S=m)
                      = (559/825)*(825/1944)
                      = 559/1944 = 0.288, or 28.8%.

This is also how we would calculate the probability directly from the
table, as expected!
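
If you want to check the arithmetic in both exercises, the short Python
sketch below redoes the calculations with exact fractions. It is only an
illustration, not part of the exercises: it uses just the six counts
quoted in the solutions above (the remaining cells of the table are not
needed), and the variable names and the helper function prob are our
own choices.

```python
from fractions import Fraction

# Counts quoted in the solutions above; names are illustrative.
TOTAL = 1944        # all degrees
FEMALE = 1119       # degrees earned by women
MALE = 825          # degrees earned by men
PROF = 83           # professional degrees
FEMALE_PROF = 39    # professional degrees earned by women
MALE_BACH = 559     # bachelor's degrees earned by men


def prob(count, out_of=TOTAL):
    """Empirical probability as an exact fraction: count / out_of."""
    return Fraction(count, out_of)


# 4.107 (a), (b)
p_f = prob(FEMALE)                        # P(S=f)
p_f_given_prof = prob(FEMALE_PROF, PROF)  # P(S=f | D=prof)

# 4.107 (c): two equivalent checks for independence
p_prof = prob(PROF)                       # P(D=prof)
p_f_and_prof = prob(FEMALE_PROF)          # P(S=f and D=prof)
independent_by_conditional = (p_f_given_prof == p_f)
independent_by_product = (p_f_and_prof == p_f * p_prof)

# 4.108 (a), (b)
p_m = prob(MALE)                          # P(S=m)
p_m_by_complement = 1 - p_f               # complement rule
p_bach_given_m = prob(MALE_BACH, MALE)    # P(D=bach | S=m)

# 4.108 (c): general multiplication rule P(A and B) = P(B|A) * P(A)
p_m_and_bach = p_bach_given_m * p_m

print(f"P(S=f)            = {float(p_f):.3f}")
print(f"P(S=f|D=prof)     = {float(p_f_given_prof):.3f}")
print(f"P(S=f)*P(D=prof)  = {float(p_f * p_prof):.3f}")
print(f"P(S=f and D=prof) = {float(p_f_and_prof):.3f}")
print(f"independent?      = {independent_by_conditional}, {independent_by_product}")
print(f"P(S=m)            = {float(p_m):.3f}")
print(f"P(D=bach|S=m)     = {float(p_bach_given_m):.3f}")
print(f"P(S=m and D=bach) = {float(p_m_and_bach):.3f}")
```

Working with Fraction instead of rounded decimals means the comparisons
in part (c) of 4.107 are exact, so the conclusion does not depend on how
many decimals we keep.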