Brief solution file for GO Problem 8.7
--------------------------------------

Data: viability (percentage of seeds sprouting) of big sagebrush seeds in
batches subjected to different storage times and relative humidities.

Notation:
  y_i   = viability (based on 300 seeds) for batch i, i=1,...,63, or
  y_ijk = viability (based on 300 seeds) for batch k stored i days at
          relative humidity level j,
          i=0,60,120,180,240,300,360; j=1,2,3 ~ humidity 0%,32%,45%; k=1,2,3.

The design is a complete two-factor factorial with 3 replications, so the
obvious model is a 2-way ANOVA with interaction:
  y_i   = mu + alpha_days(i) + beta_humid(i) + (alpha beta)_days*humid(i) + eps_i, or
  y_ijk = mu + alpha_i + beta_j + (alpha beta)_ij + eps_ijk,
depending on the chosen notation, where the errors are assumed to be
i.i.d. N(0,sigma^2).

MTB > WOpen "h:\VHM\VHM802\Data_csv\ch08pr7.csv";
SUBC> FType;
SUBC> CSV;
SUBC> DecSep;
SUBC> Period;
SUBC> Field;
SUBC> Comma;
SUBC> TDelimiter;
SUBC> DoubleQuote.
Retrieving worksheet from file: 'h:\VHM\VHM802\Data_csv\ch08pr7.csv'
Worksheet was saved on 15/02/2011

MTB > Name c6 "SRES1" c7 "TRES1"
MTB > GLM 'y' = days| 'humid_num';
SUBC> Brief 2 ;
SUBC> Means days* 'humid_num';
SUBC> SResiduals 'SRES1';
SUBC> TResiduals 'TRES1';
SUBC> GFourpack;
SUBC> RType 2 .

General Linear Model: y versus days, humid_num

Factor     Type   Levels  Values
days       fixed       7  0, 60, 120, 180, 240, 300, 360
humid_num  fixed       3  0, 32, 45

Analysis of Variance for y, using Adjusted SS for Tests

Source          DF    Seq SS    Adj SS   Adj MS        F      P
days             6   1788.74   1788.74   298.12    61.25  0.000
humid_num        2  11476.44  11476.44  5738.22  1178.93  0.000
days*humid_num  12   4154.18   4154.18   346.18    71.12  0.000
Error           42    204.43    204.43     4.87
Total           62  17623.78

S = 2.20620   R-Sq = 98.84%   R-Sq(adj) = 98.29%

Unusual Observations for y

Obs        y      Fit  SE Fit  Residual  St Resid
 49  52.9000  57.0333  1.2737   -4.1333     -2.29 R

R denotes an observation with a large standardized residual.
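The entries of the ANOVA table above are linked by simple arithmetic (MS = SS/DF, F = MS/MSE, S = sqrt(MSE), R-Sq = 1 - SS_error/SS_total). The following Python sketch recomputes them from the printed sums of squares; it is a check on the listing, not part of the Minitab analysis. The variable names are illustrative only. Small discrepancies in the last digit are to be expected, because the printed sums of squares are rounded.

```python
# Degrees of freedom and sums of squares copied from the ANOVA table above.
anova = {
    "days": (6, 1788.74),
    "humid_num": (2, 11476.44),
    "days*humid_num": (12, 4154.18),
}
error_df, error_ss = 42, 204.43
total_df, total_ss = 62, 17623.78

mse = error_ss / error_df  # error mean square

f_stats = {}
for source, (df, ss) in anova.items():
    ms = ss / df               # mean square for this source
    f_stats[source] = ms / mse  # F statistic against the error mean square
    print(f"{source:15s} MS = {ms:8.2f}  F = {f_stats[source]:8.2f}")

s = mse ** 0.5                                    # residual standard deviation
r_sq = 100 * (1 - error_ss / total_ss)            # R-Sq in percent
r_sq_adj = 100 * (1 - mse / (total_ss / total_df))  # R-Sq(adj) in percent
se_cell_mean = s / 3 ** 0.5                       # each cell mean averages 3 batches

print(f"S = {s:.4f}  R-Sq = {r_sq:.2f}%  R-Sq(adj) = {r_sq_adj:.2f}%")
print(f"SE of a cell mean = {se_cell_mean:.3f}")
```

The last quantity assumes only that each of the 21 cells holds 3 replicates, as stated in the design description.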
Least Squares Means for y

days*humid_num   Mean  SE Mean
  0   0         81.00    1.274
  0  32         82.00    1.274
  0  45         81.00    1.274
 60   0         79.97    1.274
 60  32         80.00    1.274
 60  45         63.03    1.274
120   0         79.03    1.274
120  32         82.00    1.274
120  45         57.03    1.274
180   0         78.97    1.274
180  32         79.00    1.274
180  45         51.97    1.274
240   0         80.97    1.274
240  32         83.00    1.274
240  45         50.97    1.274
300   0         85.00    1.274
300  32         79.97    1.274
300  45         39.03    1.274
360   0         83.00    1.274
360  32         81.00    1.274
360  45         24.00    1.274

Residual Plots for y

MTB > GLM 'y' = days| 'humid_num';
SUBC> SMeans C4000;
SUBC> Brief 0;
SUBC> Interact 'days' 'humid_num'.
MTB > GFInt 'days' 'humid_num';
SUBC> Responses 'y';
SUBC> FMeans C4000;
SUBC> Full.

Interaction Plot for y

MTB > Erase C4000.
MTB > PPlot 'SRES1';
SUBC> Normal;
SUBC> Symbol;
SUBC> FitD;
SUBC> Grid 2;
SUBC> Grid 1;
SUBC> MGrid 1.

Probability Plot of SRES1

The P-value for the Anderson-Darling test of normality is 0.427

Comments:
---------
The residual plots look very nice, and the normality test gives no evidence
against a normal distribution. A Box-Cox analysis gives no evidence that a
transformation is needed (not shown). The residual variation is small and the
model explains an impressive 98.84% of the variation. Accordingly, both the
main effects and the interaction are strongly significant.

The interaction shows a clear pattern with a simple interpretation: at
humidities 0% and 32% storage length has little effect (possibly some
significant differences between individual days, but no consistent trend),
whereas at humidity 45% viability declines clearly and strongly with time.
The trend does not look entirely linear, but the linear component most likely
captures the majority of the variation.

The results of the analysis could be presented as the estimated means for the
21 treatments with their standard errors (given above).
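The interaction pattern described in the comments can be read directly off the least squares means. As a rough illustration only, the Python sketch below fits a straight line to the 7 cell means at each humidity level and reports the slope per 100 days together with the share of between-day variation in those means that the line accounts for. This is a descriptive summary of the means, not the formal contrast test; the names `cell_means` and `linear_trend` are made up for the example.

```python
# Cell means copied from the "Least Squares Means for y" table,
# keyed by humidity (%), ordered by storage time.
days = [0, 60, 120, 180, 240, 300, 360]
cell_means = {
    0:  [81.00, 79.97, 79.03, 78.97, 80.97, 85.00, 83.00],
    32: [82.00, 80.00, 82.00, 79.00, 83.00, 79.97, 81.00],
    45: [81.00, 63.03, 57.03, 51.97, 50.97, 39.03, 24.00],
}


def linear_trend(x, y):
    """Return (slope, R^2) of the simple least-squares line of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    syy = sum((yi - y_bar) ** 2 for yi in y)
    return sxy / sxx, sxy ** 2 / (sxx * syy)


trend = {}
for humidity, means in cell_means.items():
    slope, r2 = linear_trend(days, means)
    trend[humidity] = (100 * slope, r2)  # slope in percentage points per 100 days
    print(f"humidity {humidity:2d}%: range {min(means):.2f}-{max(means):.2f}, "
          f"slope {100 * slope:+.2f} per 100 days, linear share {r2:.2f}")
```

At 0% and 32% the means stay within a few percentage points of 80 and the fitted slopes are close to zero, while at 45% the slope is strongly negative and the straight line accounts for most, but not all, of the variation in the means.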
Different options for further analysis:
- explore the hypothesis that humidities 0% and 32% do not produce different
  effects, in order to simplify the interpretation of the interaction; use an
  F-test between the full and the reduced model,
- carry out pairwise comparisons within the days*humidity interaction,
  possibly limited to comparisons within days (3 per day) and within
  humidities (7*6/2=21 per humidity level); due to the large number of
  comparisons, one would have to accept either a quite large simultaneous
  type I error level or a substantial loss of power from correcting for
  multiple comparisons,
- design contrasts within the interaction to express hypotheses of interest;
  for example, a linear contrast in days compared between humidity groups
  0% and 32% versus 45% is discussed on p.217 of the textbook,
- replace the categorical modelling of days by polynomial modelling of
  sufficiently high order to avoid significant loss of fit.

We give a Minitab listing for modelling days by a third-order polynomial.
The lack-of-fit test for this model compares its error sum of squares
(284.8 on 51 DF) with that of the full ANOVA model (204.43 on 42 DF):
  F = [(284.8-204.43)/(51-42)]/4.87 = 1.834, giving P=0.09 in F(9,42),
so this seems an acceptable model reduction with no major loss of fit.

MTB > Name C8 'days1'
MTB > Let 'days1' = (days/100)**1
MTB > Name C9 'days2'
MTB > Let 'days2' = (days/100)**2
MTB > Name C10 'days3'
MTB > Let 'days3' = (days/100)**3
MTB > GLM 'y' = 'humid_num'|days1 'humid_num'|days2 'humid_num'|days3;
SUBC> Covariates 'days1' 'days2' 'days3';
SUBC> Brief 3 ;
SUBC> SResiduals 'SRES2';
SUBC> TResiduals 'TRES2'.
General Linear Model: y versus humid_num

Factor     Type   Levels  Values
humid_num  fixed       3  0, 32, 45

Analysis of Variance for y, using Adjusted SS for Tests

Source           DF   Seq SS  Adj SS  Adj MS      F      P
humid_num         2  11476.4     2.3     1.2   0.21  0.814
days1             1   1562.0   333.7   333.7  59.76  0.000
humid_num*days1   2   3900.5   344.4   172.2  30.84  0.000
days2             1      5.4   196.5   196.5  35.19  0.000
humid_num*days2   2     17.7   166.9    83.5  14.94  0.000
days3             1    191.2   191.2   191.2  34.23  0.000
humid_num*days3   2    185.8   185.8    92.9  16.64  0.000
Error            51    284.8   284.8     5.6
Total            62  17623.8

S = 2.36321   R-Sq = 98.38%   R-Sq(adj) = 98.04%

Term                 Coef  SE Coef       T      P
Constant          81.2931   0.7591  107.09  0.000
humid_num
  0                 0.176    1.074    0.16  0.870
  32                0.489    1.074    0.46  0.650
days1             -15.446    1.998   -7.73  0.000
days1*humid_num
  0                 9.046    2.826    3.20  0.002
  32               13.026    2.826    4.61  0.000
days2               8.074    1.361    5.93  0.000
days2*humid_num
  0                -3.827    1.925   -1.99  0.052
  32               -6.576    1.925   -3.42  0.001
days3             -1.4518   0.2481   -5.85  0.000
days3*humid_num
  0                0.8130   0.3509    2.32  0.025
  32               1.1988   0.3509    3.42  0.001

Unusual Observations for y

Obs        y      Fit  SE Fit  Residual  St Resid
 11  75.5000  79.9841  0.7877   -4.4841     -2.01 R
 17  87.9000  83.2468  0.9177    4.6532      2.14 R

R denotes an observation with a large standardized residual.
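Two aspects of the cubic model can be checked by hand, and the Python sketch below does both using only the standard library. First, it recomputes the lack-of-fit F statistic from the two error sums of squares and obtains its P-value by numerically integrating the F(9,42) density (Simpson's rule stands in for a statistical library here). Second, it rebuilds the fitted cubic for each humidity level from the coefficient table. The second part rests on an assumption that is not stated in the listing: that the factor effects are coded to sum to zero, so the unlisted coefficient for humidity 45% is minus the sum of those for 0% and 32%. All function and variable names are illustrative.

```python
import math


def f_pdf(x, d1, d2):
    """Density of the F(d1, d2) distribution; assumes d1 > 2, so the density is 0 at x = 0."""
    if x <= 0:
        return 0.0
    log_beta = math.lgamma(d1 / 2) + math.lgamma(d2 / 2) - math.lgamma((d1 + d2) / 2)
    log_f = (0.5 * (d1 * math.log(d1 * x) + d2 * math.log(d2)
                    - (d1 + d2) * math.log(d1 * x + d2))
             - math.log(x) - log_beta)
    return math.exp(log_f)


def f_upper_tail(f_obs, d1, d2, n=2000):
    """P(F > f_obs), via composite Simpson integration of the density on [0, f_obs]."""
    h = f_obs / n  # n must be even
    total = f_pdf(0.0, d1, d2) + f_pdf(f_obs, d1, d2)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f_pdf(i * h, d1, d2)
    return 1.0 - total * h / 3


# Lack-of-fit test: cubic model (SSE 284.8, 51 DF) against full ANOVA (SSE 204.43, 42 DF).
sse_cubic, df_cubic = 284.8, 51
sse_full, df_full = 204.43, 42
f_lof = ((sse_cubic - sse_full) / (df_cubic - df_full)) / (sse_full / df_full)
p_lof = f_upper_tail(f_lof, df_cubic - df_full, df_full)
print(f"lack of fit: F = {f_lof:.3f}, P = {p_lof:.3f}")

# Coefficients in the order (constant, days1, days2, days3), from the table above.
base = (81.2931, -15.446, 8.074, -1.4518)
dev = {
    0: (0.176, 9.046, -3.827, 0.8130),
    32: (0.489, 13.026, -6.576, 1.1988),
}
# ASSUMPTION: sum-to-zero coding, so the 45% deviations are minus the sum of the others.
dev[45] = tuple(-(a + b) for a, b in zip(dev[0], dev[32]))


def fitted(days, humidity):
    """Fitted viability from the cubic in days/100 for one humidity level."""
    x = days / 100
    return sum((b + d) * x ** p for p, (b, d) in enumerate(zip(base, dev[humidity])))


for humidity in (0, 32, 45):
    curve = ", ".join(f"{fitted(d, humidity):5.1f}" for d in range(0, 361, 60))
    print(f"humidity {humidity:2d}%: {curve}")
```

Under the sum-to-zero assumption, the fitted values at 180 and 300 days for humidity 0% agree with the "Fit" values printed for the two flagged observations, and the fitted curve at 45% stays close to the cell means of the full ANOVA model, which is consistent with that coding.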