Exercise 27.8 of PSLS 3e
------------------------

Data: 2 samples of corn yields (bushels per acre) for 4 plots without weeds and 4 plots with 9 lamb's quarter plants per meter of row.

Model: the 2 samples are independent, and each is a simple random sample (i.i.d. sample) from a distribution with unknown mean, median and standard deviation.

(a)
When using the Wilcoxon-Mann-Whitney test we may make no further assumptions about the distributions, and test
  H0: P1=P2, versus Ha: P1 is systematically larger than P2,
where P1 and P2 are the distributions of corn yields in the two populations (weed-free and weed-filled plots).

Alternatively, we may make the "delta-assumption" that the two distributions have the same shape (they differ only in location), and test the hypotheses
  H0: median1=median2, versus Ha: median1>median2
(the alternative expressing higher yields in weed-free plots).

The motivation for the one-sided alternatives lies in the wording of the question ("evidence that 9 weeds per meter reduces corn yields"), but in practice one could also justify a two-sided alternative. Given the small sample sizes it is perhaps tempting to increase the power of the statistical analysis by using a one-sided alternative.

Minitab commands and output for the test:

MTB > WOpen "H:\VHM\VHM801\Datasets\Minitab\Chapter 27\ex27_008.mtw".
Retrieving worksheet from file: 'H:\VHM\VHM801\Datasets\Minitab\Chapter 27\ex27_008.mtw'
Worksheet was saved on 17/10/2014
MTB > Unstack ('yield');
SUBC>   Subscripts 'weeds';
SUBC>   After;
SUBC>   VarNames.
MTB > Mann-Whitney 95.0 'yield_0' 'yield_9';
SUBC>   Alternative 1.
Mann-Whitney Test and CI: yield_0, yield_9

         N  Median
yield_0  4  169.45
yield_9  4  162.55

Point estimate for eta1-eta2 is 9.65
97.0 Percent CI for eta1-eta2 is (2.20,34.49)
W = 26.0
Test of eta1 = eta2 vs eta1 > eta2 is significant at 0.0152

Comments:
---------
The Wilcoxon rank sum test (or Wilcoxon-Mann-Whitney test) gives an approximate P-value of 0.0152 for H0 against a one-sided alternative. This means that there is some evidence that 9 weeds per meter systematically reduce the corn yield. (Note: the exact P-value is 0.014, obtained with other software.)

(b)
MTB > TwoSample 'yield_0' 'yield_9';
SUBC>   Confidence 95.0;
SUBC>   Test 0.0;
SUBC>   Alternative 1.

Two-Sample T-Test and CI: yield_0, yield_9

Two-sample T for yield_0 vs yield_9

         N    Mean  StDev  SE Mean
yield_0  4  170.20   5.42      2.7
yield_9  4   157.6   10.1      5.1

Difference = mu (yield_0) - mu (yield_9)
Estimate for difference:  12.62
95% lower bound for difference:  0.39
T-Test of difference = 0 (vs >): T-Value = 2.20  P-Value = 0.046  DF = 4

Comments:
---------
The t-test gives a P-value for the one-sided test of 0.046, just below the 5% significance limit. It is really on the border of 5% significance, which we might express as weak evidence. Note that we do not assume the variances to be equal because of the larger variation in the second sample (due to the outlier).

(c)
MTB > Copy 'yield_9' c7;
SUBC>   Varnames.
MTB > name c7 'yield_9_excl'
MTB > let c7(2)='*'
MTB > Mann-Whitney 95.0 'yield_0' 'yield_9_excl';
SUBC>   Alternative 1.

Mann-Whitney Test and CI: yield_0, yield_9_excl

              N  Median
yield_0       4  169.45
yield_9_excl  3  162.70

Point estimate for eta1-eta2 is 6.85
94.8 Percent CI for eta1-eta2 is (2.20,14.50)
W = 22.0
Test of eta1 = eta2 vs eta1 > eta2 is significant at 0.0259

MTB > TwoSample 'yield_0' 'yield_9_excl';
SUBC>   Confidence 95.0;
SUBC>   Test 0.0;
SUBC>   Alternative 1.
Two-Sample T-Test and CI: yield_0, yield_9_excl

Two-sample T for yield_0 vs yield_9_excl

              N     Mean  StDev  SE Mean
yield_0       4   170.20   5.42      2.7
yield_9_excl  3  162.633  0.208     0.12

Difference = mu (yield_0) - mu (yield_9_excl)
Estimate for difference:  7.57
95% lower bound for difference:  1.18
T-Test of difference = 0 (vs >): T-Value = 2.79  P-Value = 0.034  DF = 3

Comments:
---------
The outlier reduced the mean yield in the group with 9 weeds per meter by 5 units (bushels per acre); the value is obtained as 162.6-157.6=5.0. It increased the standard deviation by a factor of approximately 50 (10.1/0.208). However, the results of the analyses with and without the outlier are surprisingly similar, although removing the outlier increases the P-value in the non-parametric analysis and decreases it in the t-test (why?).

Additional comment:
-------------------
From this exercise one may get the idea that the Wilcoxon rank sum test is more powerful than the t-test in small samples. That is not true in general, and the P-value obtained by the Wilcoxon rank sum test is the smallest possible with these sample sizes (why? See below for the answer).

Additional question:
--------------------
The rank sums for the two groups are the most extreme possible for a dataset with 4 observations in each group. That is because every observation in the weed 0 sample is larger than every observation in the weed 9 sample. In this case, the ranks for the weed 9 sample become 1, 2, 3 and 4 (sum=10), and the ranks for the weed 0 sample become 5, 6, 7 and 8 (sum=26). No matter the actual values, the ranks in the two groups can never be more different, and therefore the P-value is the smallest possible with two samples of size 4.
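
As a rough check of the counting argument above, the exact one-sided P-value can be obtained by listing every way the ranks could be assigned to the weed-free sample, all of which are equally likely under H0. The sketch below is written in Python rather than Minitab, the function name exact_upper_tail is hypothetical, and it assumes there are no ties between the two groups. It uses only the sample sizes and the rank sums W reported in the Minitab output (26 and 22), not the raw yields.

```python
from fractions import Fraction
from itertools import combinations


def exact_upper_tail(n1, n2, w_obs):
    """Exact P(W >= w_obs) under H0, where W is the rank sum of sample 1.

    Assumes no ties, so the ranks are simply 1, 2, ..., n1 + n2 and every
    choice of n1 ranks for sample 1 is equally likely under H0.
    """
    ranks = range(1, n1 + n2 + 1)
    rank_sums = [sum(chosen) for chosen in combinations(ranks, n1)]
    at_least_as_extreme = sum(1 for s in rank_sums if s >= w_obs)
    return Fraction(at_least_as_extreme, len(rank_sums))


# Part (a): 4 weed-free plots versus 4 plots with weeds, W = 26
p_full = exact_upper_tail(4, 4, 26)

# Part (c): 4 weed-free plots versus 3 plots (outlier removed), W = 22
p_excl = exact_upper_tail(4, 3, 22)

print("n = 4 and 4:", p_full, round(float(p_full), 4))
print("n = 4 and 3:", p_excl, round(float(p_excl), 4))
```

With 4 and 4 observations there are 70 possible rank assignments and only one of them gives W = 26, so the exact P-value is 1/70 (about 0.014), in agreement with the exact value quoted in part (a). With 4 and 3 observations the same reasoning gives 1/35 (about 0.029), close to Minitab's approximate value of 0.0259.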