Solution file for additional exercise 6.1
-----------------------------------------

Data: measurements of iron in the livers of white rats. Rats were randomly
allocated to 5 diets (A-E), with 10 rats per diet. The data constitute 5
independent samples with continuous outcome, and the model immediately
suggested is a one-way ANOVA.

MTB > WOpen "H:\VHM\VHM802\Data_csv\hs06_1.csv";
SUBC>   FType;
SUBC>   CSV;
SUBC>   DecSep;
SUBC>   Period;
SUBC>   Field;
SUBC>   Comma;
SUBC>   TDelimiter;
SUBC>   DoubleQuote.
Retrieving worksheet from file: ‘H:\VHM\VHM802\Data_csv\hs06_1.csv’
Worksheet was saved on 12/02/2011

MTB > OneWay;
SUBC>   Response 'iron';
SUBC>   Categorical 'diet';
SUBC>   IType 0;
SUBC>   GMCI;
SUBC>   GIntPlot;
SUBC>   GBoxPlot;
SUBC>   TMethod;
SUBC>   TFactor;
SUBC>   TANOVA;
SUBC>   TSummary;
SUBC>   TMeans;
SUBC>   Nodefault.

One-way ANOVA: iron versus diet

Method

Null hypothesis         All means are equal
Alternative hypothesis  At least one mean is different
Significance level      α = 0.05

Equal variances were assumed for the analysis.

Factor Information

Factor  Levels  Values
diet         5  A, B, C, D, E

Analysis of Variance

Source  DF  Adj SS  Adj MS  F-Value  P-Value
diet     4   127.8  31.955    12.44    0.000
Error   45   115.6   2.569
Total   49   243.4

Model Summary

      S    R-sq  R-sq(adj)  R-sq(pred)
1.60276  52.51%     48.29%      41.37%

Means

diet   N   Mean  StDev      95% CI
A     10  5.693  2.406  (4.672, 6.714)
B     10  2.906  1.983  (1.885, 3.927)
C     10  2.823  1.621  (1.802, 3.844)
D     10  1.584  0.562  (0.563, 2.605)
E     10  1.090  0.428  (0.069, 2.111)

Pooled StDev = 1.60276

Interval Plot of iron vs diet
Boxplot of iron

Comments:
---------
The standard deviations are clearly not equal in the 5 groups, but seem to
increase almost linearly with the mean. We therefore try a log transformation.

MTB > Name C3 'lniron'
MTB > Let 'lniron' = ln('iron')
MTB > OneWay;
SUBC>   Response 'lniron';
SUBC>   Categorical 'diet';
SUBC>   IType 0;
SUBC>   GMCI;
SUBC>   GIntPlot;
SUBC>   GBoxPlot;
SUBC>   GFourpack;
SUBC>   TMethod;
SUBC>   TFactor;
SUBC>   TANOVA;
SUBC>   TSummary;
SUBC>   TMeans;
SUBC>   Nodefault.
One-way ANOVA: lniron versus diet

...

Analysis of Variance

Source  DF  Adj SS  Adj MS  F-Value  P-Value
diet     4   14.90  3.7255    15.72    0.000
Error   45   10.66  0.2370
Total   49   25.57

Model Summary

       S    R-sq  R-sq(adj)  R-sq(pred)
0.486819  58.29%     54.58%      48.50%

Means

diet   N   Mean  StDev       95% CI
A     10  1.652  0.458  ( 1.342, 1.962)
B     10  0.874  0.647  ( 0.564, 1.184)
C     10  0.894  0.558  ( 0.584, 1.204)
D     10  0.406  0.345  ( 0.096, 0.716)
E     10  0.026  0.356  (-0.284, 0.336)

Pooled StDev = 0.486819

Interval Plot of lniron vs diet
Boxplot of lniron
Residual Plots for lniron

MTB > PPlot 'lniron';
SUBC>   Normal;
SUBC>   Symbol;
SUBC>   FitD;
SUBC>   Grid 2;
SUBC>   Grid 1;
SUBC>   MGrid 1;
SUBC>   Panel 'diet'.

Probability Plot of lniron

MTB > GLM;
SUBC>   Response 'iron';
SUBC>   Nodefault;
SUBC>   Categorical 'diet';
SUBC>   Terms diet;
SUBC>   Boxcox;
SUBC>   TMethod;
SUBC>   TAnova;
SUBC>   TSummary;
SUBC>   TCoefficients;
SUBC>   TEquation;
SUBC>   TFactor;
SUBC>   TDiagnostics 0.

General Linear Model: iron versus diet

...

Box-Cox transformation

Rounded lambda      -0.5
Estimated lambda    -0.302808
95% CI for lambda   (-0.689308, 0.0756918)

...

Comments:
---------
The 1-way ANOVA model:

    ln(iron_i) = mu_diet(i) + eps_i,   i=1,...,50,

where the errors eps_i are N(0,sigma^2), seems to describe the data well. The
standard deviations are roughly the same in the 5 groups, the values in each
group do not show strong deviations from normality (difficult to assess with
only 10 obs.), and the residual plot and normal plot look fine.

A Box-Cox analysis points to an optimal power of lambda=-0.3, and we may also
consider the rounded "nice" value of -0.5 (as shown above, this is Minitab's
choice in the GLM menu). The 95% CI for lambda includes 0, which corresponds
to the log transformation. On both scales the analysis of residuals does not
detect any major problems with model assumptions. For simplicity, it might be
preferable to work with the log transformation (and that's what we'll do for
this solution).
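As a rough numerical check of the comments above, the following Python sketch compares the spread of the group standard deviations on the two scales, using only the values printed in the two Means tables. The variable names, and the choice of the max/min StDev ratio and the coefficient of variation as summaries, are assumptions of this sketch and not part of the Minitab analysis.

```python
# Group summaries copied from the two "Means" tables (N = 10 per diet).
# Each entry is diet: (mean, StDev).
raw = {"A": (5.693, 2.406), "B": (2.906, 1.983), "C": (2.823, 1.621),
       "D": (1.584, 0.562), "E": (1.090, 0.428)}      # iron
log = {"A": (1.652, 0.458), "B": (0.874, 0.647), "C": (0.894, 0.558),
       "D": (0.406, 0.345), "E": (0.026, 0.356)}      # ln(iron)


def sd_ratio(table):
    """Largest group StDev divided by the smallest group StDev."""
    sds = [sd for _, sd in table.values()]
    return max(sds) / min(sds)


# Coefficient of variation (StDev / mean) on the raw scale; if the StDev
# grows in proportion to the mean, these values should be of similar size.
cv_raw = {diet: sd / mean for diet, (mean, sd) in raw.items()}

print(f"max/min StDev, iron:     {sd_ratio(raw):.2f}")
print(f"max/min StDev, ln(iron): {sd_ratio(log):.2f}")
for diet, cv in cv_raw.items():
    print(f"CV of iron, diet {diet}: {cv:.2f}")
```

A much smaller max/min ratio on the log scale than on the raw scale is in line with the comment that the log transformation makes the group standard deviations roughly equal.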
The ANOVA table is given above, and the F-statistic for testing all groups
having equal means is F=15.72 and clearly significant in F(4,45).

The text defines 4 contrasts by their coefficients. These are seen to be
pairwise orthogonal, because for any pair of them the sum of products of
coefficients is zero. For example, for contrasts 1 and 2:
    1*1 + (-1)*1 + 0*(-2) + 0*0 + 0*0 = 0
or for contrasts 2 and 3:
    1*2 + 1*2 + (-2)*2 + 0*(-3) + 0*(-3) = 0.

Contrast            Estimate      SE     SS  SS(%)     t    P(t)  F(Schef)  P(Schef)
------------------------------------------------------------------------------------
beef vs pork           0.778  0.2177   3.03   20.3  3.57  0.0009      3.19     0.022
mammals vs poultry     0.738  0.3771   0.91    6.1  1.96   0.057      0.96      0.44
animal vs vegetab.     5.545  0.8432  10.25   68.8  6.58  0.0000      10.8    0.0000
beans vs oats          0.380  0.2177   0.72    4.8  1.75   0.088      0.76      0.56
------------------------------------------------------------------------------------
formulae:
  SE         = sqrt(MSE*(w_1^2+...+w_5^2)/10)        (MSE=0.237)
  SS         = (estimate^2)/[(w_1^2+...+w_5^2)/10]
  SS(%)      = SS / SSTrt                            (SSTrt=14.902)
  t          = Est/SE = sqrt(SS/MSE)
  P(t)       = two-sided tail probability for t in t(45)
  F(Scheffe) = SS/4/MSE (or t^2/4)
  P(Scheffe) = tail probability for F in F(4,45)

Conclusions:
------------
Just looking at the SS-values for the contrasts, it is clear that the third
one (animal vs. vegetable) accounts for a large proportion (69%) of the
variation between diets.

Using ordinary t-tests, all 4 contrasts are of interest: two of them are
clearly significant and the other two are borderline significant
(0.05 < P < 0.10). This method assumes that all contrasts were preplanned,
and works with individual error levels of 5%.

It is possible to do a Bonferroni correction for carrying out 4 tests by
multiplying all P-values by 4 (not shown). This approach would make the 2
borderline contrasts non-significant. Note however that the method still
assumes that all contrasts were preplanned. We could also do a Holm
correction, in which the P-values, sorted from smallest to largest, are
multiplied by 4, 3, 2 and 1 respectively; the conclusions are the same.
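The Bonferroni and Holm corrections described above (and "not shown" in the solution) can be sketched in Python as follows. The sketch starts from the t statistics in the contrast table and computes two-sided t(45) tail probabilities by numerical integration, so the results are approximate. All function and variable names are hypothetical, and Simpson's rule is used here only to stay within the standard library.

```python
import math

DF_ERROR = 45  # error degrees of freedom from the ANOVA table

# t statistics copied from the contrast table.
t_stats = {"beef vs pork": 3.57, "mammals vs poultry": 1.96,
           "animal vs vegetable": 6.58, "beans vs oats": 1.75}


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def two_sided_p(t, df, n=2000):
    """Approximate P(|T| > |t|) using Simpson's rule on [0, |t|] (n even)."""
    b = abs(t)
    h = b / n
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * t_pdf(k * h, df)
    central = total * h / 3          # approximates P(0 < T < |t|)
    return max(0.0, 1.0 - 2.0 * central)


def bonferroni(pvals):
    """Multiply every P-value by the number of tests, capped at 1."""
    m = len(pvals)
    return [min(1.0, m * p) for p in pvals]


def holm(pvals):
    """Multiply sorted P-values by m, m-1, ..., 1 and keep them monotone."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running = 0.0
    for rank, i in enumerate(order):
        running = max(running, min(1.0, (m - rank) * pvals[i]))
        adjusted[i] = running
    return adjusted


names = list(t_stats)
p_raw = [two_sided_p(t_stats[k], DF_ERROR) for k in names]
p_bonf = bonferroni(p_raw)
p_holm = holm(p_raw)

print(f"{'Contrast':<20} {'P(t)':>8} {'Bonf':>8} {'Holm':>8}")
for k, p, pb, ph in zip(names, p_raw, p_bonf, p_holm):
    print(f"{k:<20} {p:8.4f} {pb:8.4f} {ph:8.4f}")
```

Comparing the adjusted values with 0.05 shows which contrasts remain significant under each correction.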
Finally, the Scheffe method takes into account both the multiple tests and
the possibility that contrasts were suggested by the data. It is the most
conservative of the 3 methods but also the "safest". The animal vs vegetable
contrast remains highly significant, and the beef vs pork contrast is still
significant (P=0.022). There is thus evidence that the beef and pork diets
differ, and strong evidence that the animal and vegetable diets differ.
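The orthogonality check and the Estimate, SE, SS, t and F(Scheffe) columns of the contrast table can be reproduced from the log-scale group means with a short Python sketch. The contrast coefficients and the assignment of diets A-E to beef, pork, poultry, beans and oats are not listed in this solution file; they are inferred here from the tabled estimates and should be treated as assumptions. P-values are left out because they need t and F tail probabilities.

```python
import math
from itertools import combinations

# ln(iron) group means for diets A-E, from the Means table (10 rats per diet).
means = [1.652, 0.874, 0.894, 0.406, 0.026]
N_PER_GROUP = 10
MSE = 0.2370      # error mean square from the ANOVA on ln(iron)
DF_TRT = 4        # degrees of freedom for diet

# Contrast coefficients w_1..w_5 (assumed, see the note above).
contrasts = {
    "beef vs pork":        (1, -1, 0, 0, 0),
    "mammals vs poultry":  (1, 1, -2, 0, 0),
    "animal vs vegetable": (2, 2, 2, -3, -3),
    "beans vs oats":       (0, 0, 0, 1, -1),
}


def dot(u, v):
    """Sum of products of two coefficient vectors."""
    return sum(a * b for a, b in zip(u, v))


# Pairwise orthogonality: every pair has a zero sum of products.
orthogonal = all(dot(contrasts[a], contrasts[b]) == 0
                 for a, b in combinations(contrasts, 2))
print("pairwise orthogonal:", orthogonal)

results = {}
for name, w in contrasts.items():
    estimate = dot(w, means)
    scale = dot(w, w) / N_PER_GROUP          # (w_1^2+...+w_5^2)/10
    se = math.sqrt(MSE * scale)
    ss = estimate ** 2 / scale
    t = estimate / se
    f_scheffe = t ** 2 / DF_TRT              # same as SS/4/MSE
    results[name] = {"estimate": estimate, "se": se, "ss": ss,
                     "t": t, "f_scheffe": f_scheffe}
    print(f"{name:<20} est={estimate:6.3f} SE={se:.4f} SS={ss:6.2f} "
          f"t={t:5.2f} F(Scheffe)={f_scheffe:5.2f}")

# For 4 orthogonal contrasts the SS values add up to the diet SS.
ss_total = sum(r["ss"] for r in results.values())
print(f"sum of contrast SS = {ss_total:.2f}")
```

Because the means are rounded to three decimals, the recomputed values can differ from the table in the last digit.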