Brief solution file for GO Problem 8.7
--------------------------------------

Data: viability (percentage of seeds sprouting) of big sagebrush seeds in batches subjected to different storage times and relative humidity.

Notation:
  y_i   = viability (based on 300 seeds) for batch i, i=1,...,63, or
  y_ijk = viability (based on 300 seeds) for batch k stored i days at relative humidity level j,
          i=0,60,120,180,240,300,360; j=1,2,3 ~ humidity 0%,32%,45%; k=1,2,3.

The design is a complete two-factor factorial (7 storage times by 3 humidities) with 3 replications, so the obvious model is that of a 2-way ANOVA with interaction:
  y_i   = mu + alpha_days(i) + beta_humid(i) + (alpha beta)_days*humid(i) + eps_i, or
  y_ijk = mu + alpha_i + beta_j + (alpha beta)_ij + eps_ijk,
depending on the chosen notation, where the errors are assumed i.i.d. and to follow N(0,sigma^2).

MTB > WOpen "H:\VHM\VHM802\Data_csv\ch08pr7.csv";
SUBC>   FType;
SUBC>     CSV;
SUBC>   DecSep;
SUBC>     Period;
SUBC>   Field;
SUBC>     Comma;
SUBC>   TDelimiter;
SUBC>     DoubleQuote.
Retrieving worksheet from file: ‘H:\VHM\VHM802\Data_csv\ch08pr7.csv’
Worksheet was saved on 15/02/2011

MTB > GLM;
SUBC>   Response 'y';
SUBC>   Nodefault;
SUBC>   Categorical 'days' 'humid_num';
SUBC>   Terms days C5 days*C5;
SUBC>   Means days C5 days*C5;
SUBC>   TExpand;
SUBC>   TMethod;
SUBC>   TAnova;
SUBC>   TSummary;
SUBC>   TCoefficients;
SUBC>   TFactor;
SUBC>   TMeans;
SUBC>   TDiagnostics 0;
SUBC>   Rtype 2;
SUBC>   GFOURPACK.

General Linear Model: y versus days, humid_num

Method

Factor coding  (-1, 0, +1)

Factor Information

Factor     Type   Levels  Values
days       Fixed       7  0, 60, 120, 180, 240, 300, 360
humid_num  Fixed       3  0, 32, 45

Analysis of Variance

Source            DF   Seq SS  Contribution   Adj SS   Adj MS  F-Value  P-Value
  days             6   1788.7        10.15%   1788.7   298.12    61.25    0.000
  humid_num        2  11476.4        65.12%  11476.4  5738.22  1178.93    0.000
  days*humid_num  12   4154.2        23.57%   4154.2   346.18    71.12    0.000
Error             42    204.4         1.16%    204.4     4.87
Total             62  17623.8       100.00%

Model Summary

      S    R-sq  R-sq(adj)   PRESS  R-sq(pred)
2.20620  98.84%     98.29%  459.96      97.39%

Coefficients

Term               Coef  SE Coef        95% CI         T-Value  P-Value   VIF
Constant         71.521    0.278  ( 70.960,  72.082)   257.31    0.000
days
  0               9.813    0.681  (  8.439,  11.187)    14.41    0.000  1.71
  60              2.813    0.681  (  1.439,   4.187)     4.13    0.000  1.71
  120             1.168    0.681  ( -0.206,   2.542)     1.72    0.094  1.71
  180            -1.543    0.681  ( -2.917,  -0.169)    -2.27    0.029  1.71
  240             0.124    0.681  ( -1.250,   1.498)     0.18    0.857  1.71
  300            -3.521    0.681  ( -4.895,  -2.147)    -5.17    0.000  1.71
humid_num
  0               9.613    0.393  (  8.819,  10.406)    24.45    0.000  1.33
  32              9.475    0.393  (  8.681,  10.268)    24.10    0.000  1.33
days*humid_num
  0 0            -9.946    0.963  (-11.889,  -8.003)   -10.33    0.000  2.29
  0 32           -8.808    0.963  (-10.751,  -6.865)    -9.15    0.000  2.29
  60 0           -3.979    0.963  ( -5.923,  -2.036)    -4.13    0.000  2.29
  60 32          -3.808    0.963  ( -5.751,  -1.865)    -3.95    0.000  2.29
  120 0          -3.268    0.963  ( -5.211,  -1.325)    -3.39    0.002  2.29
  120 32         -0.163    0.963  ( -2.107,   1.780)    -0.17    0.866  2.29
  180 0          -0.624    0.963  ( -2.567,   1.319)    -0.65    0.521  2.29
  180 32         -0.452    0.963  ( -2.396,   1.491)    -0.47    0.641  2.29
  240 0          -0.290    0.963  ( -2.234,   1.653)    -0.30    0.764  2.29
  240 32          1.881    0.963  ( -0.062,   3.824)     1.95    0.057  2.29
  300 0           7.387    0.963  (  5.444,   9.330)     7.67    0.000  2.29
  300 32          2.492    0.963  (  0.549,   4.435)     2.59    0.013  2.29

Fits and Diagnostics for Unusual Observations

Obs      y    Fit  SE Fit      95% CI       Resid  Std Resid  Del Resid        HI  Cook’s D     DFITS
 49  52.90  57.03    1.27  (54.46, 59.60)   -4.13      -2.29      -2.42  0.333333      0.13  -1.71411  R

R  Large residual

Means

Term              Fitted Mean  SE Mean
days
  0                    81.333    0.735
  60                   74.333    0.735
  120                  72.689    0.735
  180                  69.978    0.735
  240                  71.644    0.735
  300                  68.000    0.735
  360                  62.667    0.735
humid_num
  0                    81.133    0.481
  32                   80.995    0.481
  45                   52.433    0.481
days*humid_num
  0 0                   81.00     1.27
  0 32                  82.00     1.27
  0 45                  81.00     1.27
  60 0                  79.97     1.27
  60 32                 80.00     1.27
  60 45                 63.03     1.27
  120 0                 79.03     1.27
  120 32                82.00     1.27
  120 45                57.03     1.27
  180 0                 78.97     1.27
  180 32                79.00     1.27
  180 45                51.97     1.27
  240 0                 80.97     1.27
  240 32                83.00     1.27
  240 45                50.97     1.27
  300 0                 85.00     1.27
  300 32                79.97     1.27
  300 45                39.03     1.27
  360 0                 83.00     1.27
  360 32                81.00     1.27
  360 45                24.00     1.27

Residual Plots for y

MTB > FacPlot 'y';
SUBC>   Factors days 'humid_num';
SUBC>   GInt;
SUBC>   Full.

Interaction Plot for y

Comments:
---------
The residual plots look very nice, and the normality test gives no evidence against a normal distribution. A Box-Cox analysis gives no evidence that a transformation is needed (not shown). The residual variation is small, and the model explains an impressive 98.84% of the variation. Consequently, both the main effects and the interaction are strongly significant.

The interaction plot shows a clear pattern with a simple interpretation: at humidities 0% and 32% storage length has little effect (possibly some significant differences between individual days, but no consistent trend), whereas at humidity 45% viability declines clearly and strongly with time. The trend does not look entirely linear, but the linear component most likely captures the majority of the variation.

The results of the analysis could be reported as the estimated means for the 21 treatments with their standard errors (given above).
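
As a check on the standard errors in the Means table: in a balanced design each fitted mean has standard error S/sqrt(n), where n is the number of observations averaged. The following Python sketch (not part of the Minitab session) recomputes them from the Model Summary value of S. The t quantile 2.018 for 42 degrees of freedom is an assumption taken from standard tables, and all variable names are illustrative.

```python
import math

S = 2.20620        # residual standard deviation from the Model Summary
N_REP = 3          # replicate batches per treatment
N_DAYS = 7         # number of storage times
N_HUMID = 3        # number of humidity levels
T_975_42 = 2.018   # assumed 97.5% t quantile with 42 df (standard tables)


def se_mean(n_obs):
    """Standard error of a mean of n_obs observations in a balanced design."""
    return S / math.sqrt(n_obs)


se_cell = se_mean(N_REP)             # one days*humidity treatment mean
se_days = se_mean(N_REP * N_HUMID)   # a storage-time mean, averaged over humidities
se_humid = se_mean(N_REP * N_DAYS)   # a humidity mean, averaged over storage times

# 95% confidence interval for one treatment mean (120 days, 45% humidity)
fit_120_45 = 57.03
half_width = T_975_42 * se_cell
ci_120_45 = (fit_120_45 - half_width, fit_120_45 + half_width)

print(f"SE treatment mean: {se_cell:.2f}")
print(f"SE days mean:      {se_days:.3f}")
print(f"SE humidity mean:  {se_humid:.3f}")
print(f"95% CI at 120 days, 45%: ({ci_120_45[0]:.2f}, {ci_120_45[1]:.2f})")
```

The three values should agree with the SE Mean column (1.27, 0.735 and 0.481), and the interval with the one listed for observation 49, which belongs to the treatment with fitted value 57.03.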

Different options for further analysis:
- explore the hypothesis that humidities 0% and 32% do not produce different effects, in order to simplify the interpretation of the interaction; use an F-test between the full and the reduced model,
- carry out pairwise comparisons within the days*humidity interaction, possibly limited to comparisons within days (3 per day) and within humidities (7*6/2=21 per humidity level); because of the large number of comparisons, one would have to accept either a quite large simultaneous type I error level or a substantial loss of power from correcting for multiple comparisons,
- design contrasts within the interaction to express hypotheses of interest; for example, a linear contrast in days contrasted between humidity groups 0% and 32% versus 45% is discussed on p.217 of the textbook (this is possible because the storage times are equidistant),
- replace the categorical modelling of days by polynomial modelling of sufficiently high order to avoid significant loss of fit.

We give a Minitab listing for modelling days by a third-order polynomial. To make the coefficients easier to assess, we rescale the days by dividing by 100. The lack-of-fit test for this model is F = [(284.8-204.43)/(51-42)]/4.87 = 1.834, corresponding to P=0.09 in F(9,42), so this seems an acceptable model reduction with no major loss of fit.

MTB > Name C6 'days100'
MTB > Let 'days100' = days/100
MTB > GLM;
SUBC>   Response 'y';
SUBC>   Nodefault;
SUBC>   Continuous 'days100';
SUBC>   Categorical 'humid_num';
SUBC>   Unstandardized;
SUBC>   Terms C5 days100 days100*C5 days100*days100 days100*days100*C5 &
CONT>     days100*days100*days100 days100*days100*days100*C5;
SUBC>   TExpand;
SUBC>   TMethod;
SUBC>   TAnova;
SUBC>   TSummary;
SUBC>   TCoefficients;
SUBC>   TFactor;
SUBC>   TMeans;
SUBC>   TDiagnostics 0;
SUBC>   Rtype 2;
SUBC>   GFOURPACK.

General Linear Model: y versus days100, humid_num

Factor Information

Factor     Type   Levels  Values
humid_num  Fixed       3  0, 32, 45

Analysis of Variance

Source                               DF   Seq SS  Contribution   Adj SS   Adj MS  F-Value  P-Value
  humid_num                           2  11476.4        65.12%    2.304    1.152     0.21    0.814
  days100                             1   1562.0         8.86%  333.746  333.746    59.76    0.000
  days100*humid_num                   2   3900.5        22.13%  344.429  172.214    30.84    0.000
  days100*days100                     1      5.4         0.03%  196.523  196.523    35.19    0.000
  days100*days100*humid_num           2     17.7         0.10%  166.917   83.458    14.94    0.000
  days100*days100*days100             1    191.2         1.08%  191.159  191.159    34.23    0.000
  days100*days100*days100*humid_num   2    185.8         1.05%  185.811   92.906    16.64    0.000
Error                                51    284.8         1.62%  284.823    5.585
  Lack-of-Fit                         9     80.4         0.46%   80.396    8.933     1.84    0.090
  Pure Error                         42    204.4         1.16%  204.427    4.867
Total                                62  17623.8       100.00%

Model Summary

      S    R-sq  R-sq(adj)    PRESS  R-sq(pred)
2.36321  98.38%     98.04%  422.846      97.60%

Coefficients

Term                                 Coef  SE Coef        95% CI        T-Value  P-Value      VIF
Constant                           81.293    0.759  ( 79.769,  82.817)   107.09    0.000
humid_num
  0                                  0.18     1.07  (  -1.98,    2.33)     0.16    0.870     8.67
  32                                 0.49     1.07  (  -1.67,    2.64)     0.46    0.650     8.67
days100                            -15.45     2.00  ( -19.46,  -11.43)    -7.73    0.000    64.85
days100*humid_num
  0                                  9.05     2.83  (   3.37,   14.72)     3.20    0.002   281.02
  32                                13.03     2.83  (   7.35,   18.70)     4.61    0.000   281.02
days100*days100                      8.07     1.36  (   5.34,   10.81)     5.93    0.000   422.50
days100*days100*humid_num
  0                                 -3.83     1.92  (  -7.69,    0.04)    -1.99    0.052  1173.61
  32                                -6.58     1.92  ( -10.44,   -2.71)    -3.42    0.001  1173.61
days100*days100*days100            -1.452    0.248  ( -1.950,  -0.954)    -5.85    0.000   182.35
days100*days100*days100*humid_num
  0                                 0.813    0.351  (  0.109,   1.518)     2.32    0.025   414.64
  32                                1.199    0.351  (  0.494,   1.903)     3.42    0.001   414.64

Fits and Diagnostics for Unusual Observations

Obs      y    Fit  SE Fit      95% CI       Resid  Std Resid  Del Resid        HI  Cook’s D      DFITS
 11  75.50  79.98    0.79  (78.40, 81.57)   -4.48      -2.01      -2.08  0.111111      0.04  -0.734304  R
 17  87.90  83.25    0.92  (81.40, 85.09)    4.65       2.14       2.22  0.150794      0.07   0.934307  R

R  Large residual

Residual Plots for y
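
The lack-of-fit test can be recomputed by hand from the two ANOVA tables: the error sum of squares of the cubic model (284.823 on 51 df) is compared with the pure error of the full factorial model (204.427 on 42 df). The Python sketch below is only an illustration of that arithmetic, not something Minitab runs. To stay within the standard library, it obtains the upper tail of the F distribution from the regularized incomplete beta function, integrated with Simpson's rule. The function names are hypothetical.

```python
from math import exp, lgamma


def regularized_incomplete_beta(x, a, b, n=2000):
    """I_x(a, b) by Simpson's rule; intended for a, b > 1 (smooth integrand)."""
    def g(t):
        return t ** (a - 1) * (1 - t) ** (b - 1)

    h = x / n  # n must be even for Simpson's rule
    total = g(0.0) + g(x)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(i * h)
    integral = total * h / 3
    log_beta = lgamma(a) + lgamma(b) - lgamma(a + b)
    return integral / exp(log_beta)


def f_upper_tail(f, df1, df2):
    """P(F > f) for F ~ F(df1, df2), via P(F > f) = I_x(df2/2, df1/2)."""
    x = df2 / (df2 + df1 * f)
    return regularized_incomplete_beta(x, df2 / 2, df1 / 2)


# Sums of squares and degrees of freedom copied from the ANOVA tables
ss_error_cubic, df_error_cubic = 284.823, 51   # reduced (cubic) model
ss_pure_error, df_pure_error = 204.427, 42     # full factorial model

df_lack = df_error_cubic - df_pure_error
ms_lack = (ss_error_cubic - ss_pure_error) / df_lack
ms_pure = ss_pure_error / df_pure_error

f_stat = ms_lack / ms_pure
p_value = f_upper_tail(f_stat, df_lack, df_pure_error)

print(f"Lack-of-fit F({df_lack},{df_pure_error}) = {f_stat:.3f}, P = {p_value:.3f}")
```

The result should reproduce the Lack-of-Fit row of the table (F = 1.84, P = 0.090). The same comparison of a full and a reduced model applies to the first option in the list of further analyses, merging humidities 0% and 32%.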