Supplementary Exercises 5.49, 5.51 and 5.53 of IPS7e
----------------------------------------------------

5.49
----

Let X = number of word errors caught by the proofreader out of the 20 in the essay, and let Y = 20-X be the number of word errors missed.

(a) The distribution for X should be binomial B(20,0.7), and the distribution for Y should be binomial B(20,0.3) because the probability of missing a word is 1-0.7 = 0.3. We write X ~ B(20,0.7) and Y ~ B(20,0.3).

To facilitate the understanding of the following, we display the probability functions of both distributions using Minitab (where pcaught corresponds to the distribution for X and pmissed to the distribution for Y):

MTB > Name c1 "x"
MTB > Set 'x'
DATA> 1( 0 : 20 / 1 )1
DATA> End.
MTB > Name c2 "pcaught"
MTB > PDF 'x' 'pcaught';
SUBC> Binomial 20 .7.
MTB > Name c3 "pmissed"
MTB > PDF 'x' 'pmissed';
SUBC> Binomial 20 .3.
MTB > Print 'x' 'pcaught' 'pmissed'.

Data Display

Row   x   pcaught   pmissed
  1   0  0.000000  0.000798
  2   1  0.000000  0.006839
  3   2  0.000000  0.027846
  4   3  0.000001  0.071604
  5   4  0.000005  0.130421
  6   5  0.000037  0.178863
  7   6  0.000218  0.191639
  8   7  0.001018  0.164262
  9   8  0.003859  0.114397
 10   9  0.012007  0.065370
 11  10  0.030817  0.030817
 12  11  0.065370  0.012007
 13  12  0.114397  0.003859
 14  13  0.164262  0.001018
 15  14  0.191639  0.000218
 16  15  0.178863  0.000037
 17  16  0.130421  0.000005
 18  17  0.071604  0.000001
 19  18  0.027846  0.000000
 20  19  0.006839  0.000000
 21  20  0.000798  0.000000

The distributions can also be displayed using the Graph-Probability Distribution Plot menu. You can get both distributions in the same display by choosing the submenu Vary Parameters.

MTB > DPlot;
SUBC> Distribution;
SUBC> Binomial 20 0.3;
SUBC> Distribution;
SUBC> Binomial 20 0.7;
SUBC> Panel;
SUBC> Same 2 1.

[Figure: Distribution Plot]

(b) The event "missing 9 or more words" is expressed as Y>=9.
The Minitab listing above gives

P(Y>=9) = 0.06537 + 0.03082 + 0.01201 + 0.00386 + 0.00102 + 0.00022 + 0.00004 = 0.1133

Alternatively, Table 1 from Stephens gives

P(Y>=9) = 0.065 + 0.031 + 0.012 + 0.004 + 0.001 = 0.113

Another possibility is to use a cumulative probability calculation (in Minitab). We can get

P(Y>=9) = 1-P(Y<=8) = 1-0.8867 = 0.1133

or directly, since Y>=9 is the same event as X = 20-Y <= 11,

P(Y>=9) = P(X<=11) = 0.1133.

Minitab code and listing for these calculations:

MTB > CDF 8;
SUBC> Binomial 20 0.3.

Cumulative Distribution Function

Binomial with n = 20 and p = 0.3

x  P( X <= x )
8     0.886669

MTB > CDF 11;
SUBC> Binomial 20 0.7.

Cumulative Distribution Function

Binomial with n = 20 and p = 0.7

 x  P( X <= x )
11     0.113331

5.51
----

(a) By the formula for the mean of a binomial distribution

EX = 20*0.7 = 14
EY = 20*0.3 = 6

(b) By the formula for the standard deviation of a binomial distribution

sdX = sqrt(20*0.7*0.3) = sqrt(4.2) = 2.049

Note that this is also the standard deviation of Y, because the product p*(1-p) is the same for p=0.7 and p=0.3.

(c) p=0.9:  sd(X) = sqrt(20*0.9*0.1) = sqrt(1.8) = 1.342
    p=0.99: sd(X) = sqrt(20*0.99*0.01) = sqrt(0.198) = 0.445

The standard deviation decreases (towards zero) as the probability gets close to 1 (or close to zero).

5.53
----

In our notation, Y is the number of words missed, and Y ~ B(20,0.3). We are looking for the smallest number m so that P(Y>=m) is at most 0.05. From Exercise 5.49 we know that for m=9, that probability is 0.1133, which exceeds 0.05. We need to increase m to get a smaller probability. For m=10, the term 0.06537 will drop out of the summation (see item (b) of Exercise 5.49), so that

P(Y>=10) = 0.03082 + 0.01201 + 0.00386 + 0.00102 + 0.00022 + 0.00004 = 0.0480.

Since 0.0480 is at most 0.05, the desired value is m=10.

The interpretation in the question of the event Y>=10 as evidence that the proofreader catches fewer than 70% of word errors comes from statistical testing of the null hypothesis H0: p=0.7 against the alternative Ha: p<0.7.
This will be covered later in the course (and book), and the inclusion of such an interpretation here is an attempt to gradually introduce these ideas.
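
The numerical answers to Exercises 5.49, 5.51 and 5.53 can also be cross-checked outside Minitab. The following is a minimal sketch in Python, using only the standard library. It is not part of the original Minitab session, and the helper names binom_pmf, binom_upper_tail and smallest_cutoff are hypothetical names chosen for this illustration.

```python
from math import comb, sqrt


def binom_pmf(k, n, p):
    """P(Y = k) for Y ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def binom_upper_tail(m, n, p):
    """P(Y >= m) for Y ~ B(n, p), summed term by term as in 5.49(b)."""
    return sum(binom_pmf(k, n, p) for k in range(m, n + 1))


def smallest_cutoff(n, p, alpha):
    """Smallest m with P(Y >= m) <= alpha, as required in 5.53."""
    # The tail for m = n + 1 is an empty sum (0), so the loop always returns.
    for m in range(n + 2):
        if binom_upper_tail(m, n, p) <= alpha:
            return m


n, p_miss = 20, 0.3  # Y ~ B(20, 0.3): number of word errors missed

tail_9 = binom_upper_tail(9, n, p_miss)        # 5.49(b): P(Y >= 9)
mean_missed = n * p_miss                       # 5.51(a): EY
sd_missed = sqrt(n * p_miss * (1 - p_miss))    # 5.51(b): sd of Y (and of X)
cutoff = smallest_cutoff(n, p_miss, 0.05)      # 5.53: smallest m
tail_cutoff = binom_upper_tail(cutoff, n, p_miss)

print(f"P(Y>=9) = {tail_9:.4f}")
print(f"EY = {mean_missed:.1f}, sd = {sd_missed:.3f}")
print(f"m = {cutoff}, P(Y>=m) = {tail_cutoff:.4f}")
```

The printed values should agree with the Minitab results above up to rounding: P(Y>=9) near 0.1133, a mean of 6, a standard deviation near 2.049, and m=10. Summing the probability function directly mirrors the hand calculation in 5.49(b). The search in smallest_cutoff simply repeats the reasoning of 5.53 for each candidate m in turn.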