Supplementary exercises 7.127 and 7.129 of IPS7e
------------------------------------------------

Data:
Birth weight (in g) of babies born to women who tested positive for
cocaine use, compared with women who did not test positive.
Note: this must be an observational study, not an experiment.
Assume 75 women in each group.

Model:
Two-sample inference based on normal distributions N(mu1,sigma) and
N(mu2,sigma); note that we assume the standard deviation to be the
same in the two groups (a convenient assumption for planning purposes,
but if in reality they differed, it would be good to take more
subjects in the group with the higher standard deviation).
The standard deviation is considered unknown and is to be estimated
from the data; our guess for the (common) value is sigma=650.

Hypotheses of interest are:
  H0: mu1=mu2   (no difference between groups)
  Ha: mu1<>mu2  (different birth weights in the cocaine group;
                 a one-sided alternative can be argued as well, but
                 there might be so much doubt about the results from
                 the previous study that the researcher opts for a
                 more cautious approach)

7.127:
-------
Power calculation assuming a true population mean difference of
mu1-mu2=350.

Two-sided alternative:

MTB > Power;
SUBC>   TTwo;
SUBC>   Sample 75;
SUBC>   Difference 350;
SUBC>   Sigma 650;
SUBC>   GPCurve.

Power and Sample Size

2-Sample t Test

Testing mean 1 = mean 2 (versus not =)
Calculating power for mean 1 = mean 2 + difference
Alpha = 0.05  Assumed standard deviation = 650

            Sample
Difference    Size     Power
       350      75  0.905905

The sample size is for each group.

One-sided alternative:

MTB > Power;
SUBC>   TTwo;
SUBC>   Sample 75;
SUBC>   Difference 350;
SUBC>   Sigma 650;
SUBC>   Alternative 1;
SUBC>   GPCurve.

Power and Sample Size

2-Sample t Test

Testing mean 1 = mean 2 (versus >)
Calculating power for mean 1 = mean 2 + difference
Alpha = 0.05  Assumed standard deviation = 650

            Sample
Difference    Size     Power
       350      75  0.949227

The sample size is for each group.
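
Cross-check (not part of the Minitab session):
The two powers above can be roughly reproduced without Minitab. The
sketch below is only a normal approximation: it treats sigma=650 as
known and uses standard normal critical values, whereas Minitab works
with the noncentral t distribution, so the results should agree to
about two decimals rather than exactly. The function name
approx_power is hypothetical and chosen for illustration.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(diff, sigma, n, alpha=0.05, two_sided=True):
    """Normal-approximation power of a two-sample test, n per group.

    Assumption: sigma is treated as known, so this approximates
    (but does not equal) the noncentral-t power reported by Minitab.
    """
    z = NormalDist()                 # standard normal distribution
    se = sigma * sqrt(2.0 / n)       # std. error of the mean difference
    shift = diff / se                # true difference in SE units
    if two_sided:
        crit = z.inv_cdf(1 - alpha / 2)
        # Reject if the test statistic falls in either tail.
        return (1 - z.cdf(crit - shift)) + z.cdf(-crit - shift)
    crit = z.inv_cdf(1 - alpha)
    # Reject only in the upper tail (alternative mu1 > mu2).
    return 1 - z.cdf(crit - shift)


power_two_sided = approx_power(350, 650, 75, two_sided=True)
power_one_sided = approx_power(350, 650, 75, two_sided=False)
print(f"two-sided: {power_two_sided:.2f}")
print(f"one-sided: {power_one_sided:.2f}")
```

Both values should round to the 0.91 and 0.95 reported by Minitab; the
small remaining discrepancy comes from ignoring the estimation of
sigma.
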

Notes:
------
The computed powers are 0.91 and 0.95 against a two-sided and a
one-sided alternative, respectively. A study of this size has a better
than 90% chance of finding a significant result if the true magnitude
of the difference is as presumed.

7.129:
-------
Solution for n=75 only.

The 95% confidence interval for mu1-mu2 is given by (when using the
pooled standard deviation):

  Xmean1-Xmean2 +- tstar*s*sqrt(1/n1 + 1/n2)

Here n1=n2=75, df = (n1-1)+(n2-1) = 148, and

  tstar = t_0.975(148) ~= t_0.975(100) = 1.984

(table value; Minitab gives the exact value as 1.976), so the margin
of error is

  1.984*650*sqrt(2/75) = 210.6

With the exact tstar this becomes 209.8, so the margin of error is
about 210 either way.

It is seen that an observed difference of 350 would give a CI of
roughly (140, 560), which is very far from including zero and would
therefore correspond to a strongly significant result. This may seem
surprising in view of the power in the range 0.9-0.95 obtained with
n=75. However, a power calculation involves the true population
difference, not the observed sample difference. The true population
difference being 350 does not at all guarantee that the sample
difference is also close to 350. It is therefore a *stronger*
requirement to have fairly high power at a certain population
difference than to have a CI margin of error of that same size.

Another way of saying this: controlling the margin of error leads to a
statement about what an observed mean difference of a certain size
(here 210 g) between the two groups tells us, namely that it is just
at the significance cut-off. A power calculation, in contrast, gives
us the probability of getting a significant result from the two
samples when the true population difference is 350 g.

Minitab command (to compute tstar) and listing:

MTB > InvCDF .975;
SUBC>   T 148.

Inverse Cumulative Distribution Function

Student's t distribution with 148 DF

P( X <= x )        x
      0.975  1.97612

Minitab command to compute a hypothetical confidence interval (because
we don't know the actual means) from which we can read off the margin
of error:

MTB > TwoT 75 0 650 75 0 650;
SUBC>   Confidence 95.0;
SUBC>   Test 0.0;
SUBC>   Alternative 0;
SUBC>   Pooled.

Two-Sample T-Test and CI

Sample   N  Mean  StDev  SE Mean
1       75     0    650       75
2       75     0    650       75

Difference = mu (1) - mu (2)
Estimate for difference:  0
95% CI for difference:  (-210, 210)
T-Test of difference = 0 (vs not =): T-Value = 0.00  P-Value = 1.000  DF = 148
Both use Pooled StDev = 650.0000

Comments:
---------
It is seen that the margin of error equals 210.
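
Cross-check (not part of the Minitab session):
The margin of error can also be recomputed by hand from the formula
tstar*s*sqrt(1/n1 + 1/n2). The short sketch below does this for both
the table value tstar=1.984 (df=100) and the Minitab value
tstar=1.97612 (df=148); the function name margin_of_error is
hypothetical and chosen for illustration.

```python
from math import sqrt


def margin_of_error(tstar, s, n1, n2):
    """Margin of error of a pooled two-sample t confidence interval."""
    return tstar * s * sqrt(1.0 / n1 + 1.0 / n2)


# Planning values from the exercise: s = 650, n1 = n2 = 75.
moe_table = margin_of_error(1.984, 650, 75, 75)    # t table, df = 100
moe_exact = margin_of_error(1.97612, 650, 75, 75)  # Minitab, df = 148

print(f"table tstar: {moe_table:.1f}")   # prints "table tstar: 210.6"
print(f"exact tstar: {moe_exact:.1f}")   # prints "exact tstar: 209.8"
```

Both values round to the margin of 210 shown in the Minitab interval
(-210, 210).
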