Solution file for additional exercise 10.3
------------------------------------------

(see additional exercise 10.1 for discussion of model, design and notation)

MTB > WOpen "H:\VHM\VHM802\Data_csv\hs10_3.csv";
SUBC>   FType;
SUBC>   CSV;
SUBC>   DecSep;
SUBC>   Period;
SUBC>   Field;
SUBC>   Comma;
SUBC>   TDelimiter;
SUBC>   DoubleQuote.
Retrieving worksheet from file: ‘H:\VHM\VHM802\Data_csv\hs10_3.csv’
Worksheet was saved on 03/03/2011

MTB > GLM;
SUBC>   Response 'conc';
SUBC>   Nodefault;
SUBC>   Categorical 'lab' 'material';
SUBC>   Random lab;
SUBC>   Terms lab material lab*material;
SUBC>   Means material;
SUBC>   TExpand;
SUBC>   TMethod;
SUBC>   TAnova;
SUBC>   TSummary;
SUBC>   TCoefficients;
SUBC>   TEquation;
SUBC>   TFactor;
SUBC>   TEMS;
SUBC>   TVariance;
SUBC>   TMeans;
SUBC>   TDiagnostics 0;
SUBC>   Rtype 2;
SUBC>   GFOURPACK.

General Linear Model: conc versus lab, material

Method

Factor coding  (-1, 0, +1)

Factor Information

Factor    Type    Levels  Values
lab       Random      11  1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
material  Fixed        3  1, 2, 3

Analysis of Variance

Source        DF   Seq SS  Contribution   Adj SS   Adj MS  F-Value  P-Value
lab           10    2.961         0.39%    2.961    0.296     2.31    0.053
material       2  751.888        99.04%  751.888  375.944  2938.10    0.000
lab*material  20    2.559         0.34%    2.559    0.128     2.44    0.011
Error         33    1.730         0.23%    1.730    0.052
Total         65  759.138       100.00%

Model Summary

S         R-sq    R-sq(adj)  PRESS  R-sq(pred)
0.228963  99.77%  99.55%     6.92   99.09%

Coefficients

Term             Coef  SE Coef  95% CI               T-Value  P-Value  VIF
Constant      16.8061   0.0282  (16.7487, 16.8634)    596.31    0.000
lab
  1           -0.1061   0.0891  (-0.2874,  0.0753)     -1.19    0.243  *
  2            0.0439   0.0891  (-0.1374,  0.2253)      0.49    0.625  *
  3           -0.1894   0.0891  (-0.3707, -0.0081)     -2.13    0.041  *
  4            0.5273   0.0891  ( 0.3459,  0.7086)      5.92    0.000  *
  5           -0.0227   0.0891  (-0.2041,  0.1586)     -0.26    0.800  *
  6           -0.2227   0.0891  (-0.4041, -0.0414)     -2.50    0.018  *
  7            0.0273   0.0891  (-0.1541,  0.2086)      0.31    0.762  *
  8            0.2773   0.0891  ( 0.0959,  0.4586)      3.11    0.004  *
  9           -0.1394   0.0891  (-0.3207,  0.0419)     -1.56    0.127  *
  10          -0.0894   0.0891  (-0.2707,  0.0919)     -1.00    0.323  *
material
  1            4.3121   0.0399  ( 4.2310,  4.3932)    108.19    0.000  1.33
  2           -3.9288   0.0399  (-4.0099, -3.8477)    -98.57    0.000  1.33
lab*material
  1 1           0.288    0.126  ( 0.031,  0.544)        2.28    0.029  *
  1 2           0.029    0.126  (-0.228,  0.285)        0.23    0.821  *
  2 1           0.338    0.126  ( 0.081,  0.594)        2.68    0.011  *
  2 2           0.179    0.126  (-0.078,  0.435)        1.42    0.165  *
  3 1          -0.179    0.126  (-0.435,  0.078)       -1.42    0.165  *
  3 2           0.262    0.126  ( 0.006,  0.519)        2.08    0.045  *
  4 1           0.105    0.126  (-0.152,  0.361)        0.83    0.413  *
  4 2          -0.105    0.126  (-0.361,  0.152)       -0.83    0.413  *
  5 1          -0.145    0.126  (-0.402,  0.111)       -1.15    0.257  *
  5 2           0.095    0.126  (-0.161,  0.352)        0.76    0.454  *
  6 1          -0.245    0.126  (-0.502,  0.011)       -1.95    0.060  *
  6 2           0.095    0.126  (-0.161,  0.352)        0.76    0.454  *
  7 1          -0.095    0.126  (-0.352,  0.161)       -0.76    0.454  *
  7 2          -0.155    0.126  (-0.411,  0.102)       -1.23    0.229  *
  8 1           0.155    0.126  (-0.102,  0.411)        1.23    0.229  *
  8 2          -0.205    0.126  (-0.461,  0.052)       -1.62    0.114  *
  9 1          -0.129    0.126  (-0.385,  0.128)       -1.02    0.314  *
  9 2           0.012    0.126  (-0.244,  0.269)        0.10    0.924  *
  10 1          0.071    0.126  (-0.185,  0.328)        0.56    0.576  *
  10 2         -0.338    0.126  (-0.594, -0.081)       -2.68    0.011  *

Regression Equation

conc = 16.8061
       - 0.1061 lab_1 + 0.0439 lab_2 - 0.1894 lab_3 + 0.5273 lab_4
       - 0.0227 lab_5 - 0.2227 lab_6 + 0.0273 lab_7 + 0.2773 lab_8
       - 0.1394 lab_9 - 0.0894 lab_10 - 0.1061 lab_11
       + 4.3121 material_1 - 3.9288 material_2 - 0.3833 material_3
       + 0.288 lab*material_1 1 + 0.029 lab*material_1 2 - 0.317 lab*material_1 3
       + 0.338 lab*material_2 1 + 0.179 lab*material_2 2 - 0.517 lab*material_2 3
       - 0.179 lab*material_3 1 + 0.262 lab*material_3 2 - 0.083 lab*material_3 3
       + 0.105 lab*material_4 1 - 0.105 lab*material_4 2 + 0.000 lab*material_4 3
       - 0.145 lab*material_5 1 + 0.095 lab*material_5 2 + 0.050 lab*material_5 3
       - 0.245 lab*material_6 1 + 0.095 lab*material_6 2 + 0.150 lab*material_6 3
       - 0.095 lab*material_7 1 - 0.155 lab*material_7 2 + 0.250 lab*material_7 3
       + 0.155 lab*material_8 1 - 0.205 lab*material_8 2 + 0.050 lab*material_8 3
       - 0.129 lab*material_9 1 + 0.012 lab*material_9 2 + 0.117 lab*material_9 3
       + 0.071 lab*material_10 1 - 0.338 lab*material_10 2 + 0.267 lab*material_10 3
       - 0.162 lab*material_11 1 + 0.129 lab*material_11 2 + 0.033 lab*material_11 3

Equation treats random terms as though they are fixed.

Fits and Diagnostics for Unusual Observations

Obs    conc     Fit  SE Fit  95% CI             Resid  Std Resid  Del Resid   HI  Cook’s D     DFITS
  8  22.000  21.550   0.162  (21.221, 21.879)   0.450       2.78       3.13  0.5      0.23   3.12748  R
 19  21.100  21.550   0.162  (21.221, 21.879)  -0.450      -2.78      -3.13  0.5      0.23  -3.12748  R
 32  12.100  12.450   0.162  (12.121, 12.779)  -0.350      -2.16      -2.30  0.5      0.14  -2.29771  R
 43  12.800  12.450   0.162  (12.121, 12.779)   0.350       2.16       2.30  0.5      0.14   2.29771  R

R  Large residual

Expected Mean Squares, using Adjusted SS

   Source        Expected Mean Square for Each Term
1  lab           (4) + 2.0000 (3) + 6.0000 (1)
2  material      (4) + 2.0000 (3) + Q[2]
3  lab*material  (4) + 2.0000 (3)
4  Error         (4)

...

Means

Term      Fitted Mean
material
  1           21.1182
  2           12.8773
  3           16.4227

Variance Components, using Adjusted SS

Source         Variance  % of Total     StDev  % of Total
lab           0.0280227      23.71%  0.167400      48.69%
lab*material  0.0377652      31.95%  0.194333      56.52%
Error         0.0524242      44.35%  0.228963      66.59%
Total          0.118212              0.343820

Residual Plots for conc

MTB > FacPlot 'conc';
SUBC>   Factors lab material;
SUBC>   GInt;
SUBC>   Full.

Interaction Plot for conc

MTB > NormTest 'SRES'.

Probability Plot of SRES

The P-value for the Anderson-Darling test for normality is 0.038.

Comments and answers to questions:
----------------------------------

With two replications, the (error) residuals come in pairs (one positive,
one negative of equal size), and furthermore the fitted values are clearly
divided by the materials. Taking this into account, the (error) residual
plots look fine. The weakly significant test for normality may reflect that
the largest standardized residual (2.78) is mirrored by an equally large
negative residual (-2.78). There do not seem to be any obvious deviations
from the model. We give an additional residual analysis for the two other
random effects below.
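
The variance components in the listing above can be reproduced by hand from
the ANOVA table and the expected mean squares. The following is a minimal
Python sketch of that arithmetic, not Minitab's own implementation: it equates
each mean square to its expectation and solves for the components. The
variable names are chosen for illustration only, and the multiplier 2.83 is
simply the one used in the repeatability and reproducibility formulas in this
solution.

```python
import math

# Sums of squares and degrees of freedom copied from the ANOVA table.
ss = {"lab": 2.961, "lab*material": 2.559, "error": 1.730}
df = {"lab": 10, "lab*material": 20, "error": 33}
ms = {term: ss[term] / df[term] for term in ss}

n_rep = 2  # replicates per lab*material cell (the 2.0000 in the EMS table)
n_mat = 3  # materials, so n_rep * n_mat = 6 (the 6.0000 in the EMS table)

# Solve the expected mean square equations from the bottom up:
#   E[MS error]        = s2_error
#   E[MS lab*material] = s2_error + 2 * s2_lm
#   E[MS lab]          = s2_error + 2 * s2_lm + 6 * s2_lab
s2_error = ms["error"]
s2_lm = (ms["lab*material"] - ms["error"]) / n_rep
s2_lab = (ms["lab"] - ms["lab*material"]) / (n_rep * n_mat)
s2_total = s2_error + s2_lm + s2_lab

# Repeatability uses the error variance only; reproducibility uses the total.
K = 2.83  # multiplier used in this solution (numerically close to 2 * sqrt(2))
r = K * math.sqrt(s2_error)
R = K * math.sqrt(s2_total)

print(f"sigma^2 (error) = {s2_error:.4f}")
print(f"sigma^2 (L*M)   = {s2_lm:.4f}")
print(f"sigma^2 (Labs)  = {s2_lab:.4f}")
print(f"r = {r:.2f}, R = {R:.2f}")
```

Because the sums of squares are rounded to three decimals in the listing, the
results agree with Minitab's variance components only to about four decimals.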

The ANOVA shows a significant material*lab interaction, but as this is
modelled by a random effect, we are still allowed to look at the main
effects. The interaction plot in fact looks very regular (almost parallel
lines). There is, as expected, a huge difference between materials, and
there are also some differences between laboratories. Note that it is of no
interest here to eliminate non-significant random terms, because the random
effects parameters are of major interest.

The estimated variance components are shown in the Minitab listing:

  sigma^2 (error):   0.052
  sigma^2_AB (L*M):  0.038
  sigma^2_A (Labs):  0.028

We can use these values to compute the repeatability and reproducibility:

  r = 2.83 sqrt(0.052) = 0.65
  R = 2.83 sqrt(0.052+0.038+0.028) = 0.97

Note that the Minitab least squares means are correct, but the standard
errors are missing. In this case, we have no particular interest in
intervals for the materials anyway. The comparisons for materials in the
Comparison menu would be correct, if we wanted to use them.

Minitab commands for residual analysis for Lab*Mat and Lab:
-----------------------------------------------------------

MTB > Name c5 "ByVar1" c6 "ByVar2" c7 "Mean1"
MTB > Statistics 'conc';
SUBC>   By 'lab' 'material';
SUBC>   GValues 'ByVar1'-'ByVar2';
SUBC>   Mean 'Mean1'.
MTB > Name C8 "SRES_1".
MTB > GLM;
SUBC>   Response 'Mean1';
SUBC>   Nodefault;
SUBC>   Categorical 'ByVar1' 'ByVar2';
SUBC>   Terms ByVar1 ByVar2;
SUBC>   TExpand;
SUBC>   TMethod;
SUBC>   TAnova;
SUBC>   TSummary;
SUBC>   TCoefficients;
SUBC>   TEquation;
SUBC>   TFactor;
SUBC>   TDiagnostics 0;
SUBC>   Rtype 2;
SUBC>   GFOURPACK;
SUBC>   SResiduals 'SRES_1'.

General Linear Model: Mean1 versus ByVar1, ByVar2

Factor Information

Factor  Type   Levels  Values
ByVar1  Fixed      11  1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
ByVar2  Fixed       3  1, 2, 3

Analysis of Variance

Source  DF   Seq SS  Contribution   Adj SS   Adj MS  F-Value  P-Value
ByVar1  10    1.480         0.39%    1.480    0.148     2.31    0.053
ByVar2   2  375.944        99.27%  375.944  187.972  2938.10    0.000
Error   20    1.280         0.34%    1.280    0.064
Total   32  378.704       100.00%

S         R-sq    R-sq(adj)  PRESS    R-sq(pred)
0.252937  99.66%  99.46%     3.48356  99.08%

Coefficients

Term         Coef  SE Coef  95% CI               T-Value  P-Value   VIF
Constant  16.8061   0.0440  (16.7142, 16.8979)    381.69    0.000
ByVar1
  1        -0.106    0.139  (-0.397,  0.184)       -0.76    0.455  1.82
  2         0.044    0.139  (-0.247,  0.334)        0.32    0.756  1.82
  3        -0.189    0.139  (-0.480,  0.101)       -1.36    0.189  1.82
  4         0.527    0.139  ( 0.237,  0.818)        3.79    0.001  1.82
  5        -0.023    0.139  (-0.313,  0.268)       -0.16    0.872  1.82
  6        -0.223    0.139  (-0.513,  0.068)       -1.60    0.125  1.82
  7         0.027    0.139  (-0.263,  0.318)        0.20    0.847  1.82
  8         0.277    0.139  (-0.013,  0.568)        1.99    0.060  1.82
  9        -0.139    0.139  (-0.430,  0.151)       -1.00    0.329  1.82
  10       -0.089    0.139  (-0.380,  0.201)       -0.64    0.528  1.82
ByVar2
  1        4.3121   0.0623  ( 4.1822,  4.4420)     69.25    0.000  1.33
  2       -3.9288   0.0623  (-4.0587, -3.7989)    -63.09    0.000  1.33

Regression Equation

Mean1 = 16.8061
        - 0.106 ByVar1_1 + 0.044 ByVar1_2 - 0.189 ByVar1_3 + 0.527 ByVar1_4
        - 0.023 ByVar1_5 - 0.223 ByVar1_6 + 0.027 ByVar1_7 + 0.277 ByVar1_8
        - 0.139 ByVar1_9 - 0.089 ByVar1_10 - 0.106 ByVar1_11
        + 4.3121 ByVar2_1 - 3.9288 ByVar2_2 - 0.3833 ByVar2_3

Fits and Diagnostics for Unusual Observations

Obs   Mean1     Fit  SE Fit  95% CI             Resid  Std Resid  Del Resid        HI  Cook’s D     DFITS
  6  15.950  16.467   0.159  (16.136, 16.798)  -0.517      -2.62      -3.16  0.393939      0.34  -2.54614  R

R  Large residual

Residual Plots for Mean1

MTB > NormTest 'SRES_1'.

Probability Plot of SRES_1

The P-value for the Anderson-Darling test for normality is 0.544.

Comments:
---------

This analysis for the lab*mat means gives the same F-statistics as above.
The residuals in the analysis can be used to check the assumptions for the
Lab*Mat random effects. In this case the residual plots look similar to the
previous residual plots, and give no cause for concern.

MTB > Name c9 "ByVar3" c10 "Mean3"
MTB > Statistics 'conc';
SUBC>   By 'lab';
SUBC>   GValues 'ByVar3';
SUBC>   Mean 'Mean3'.
MTB > NormTest 'Mean3'.

Probability Plot of Mean3

The P-value for the Anderson-Darling test for normality is 0.032.

Comments:
---------

The laboratory means themselves should, if the model is correct, correspond
to a sample from a normal distribution as well. However, the normal
probability plot does not look good: there are two labs with values clearly
above the rest, and in particular lab no. 4 looks suspicious. Care should be
taken not to include outlying labs in the calculation of reproducibility,
because they will lead to overestimation of the (natural) variability
between laboratories. The conclusion here is not obvious, but certainly one
may express doubt about the procedures at lab no. 4.
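
As a rough numerical companion to the normal plot of the lab means, the
sketch below rebuilds the 11 lab means from the Coefficients table (grand
mean plus lab effect, with the effect for lab 11 taken from the regression
equation) and ranks them by a simple standardized score. This score is an
illustrative device assumed here, not the Anderson-Darling test used above
and not a formal outlier test. The variable names are hypothetical.

```python
import statistics

# Grand mean and lab effects copied from the first GLM listing.
grand_mean = 16.8061
lab_effect = {
    1: -0.1061, 2: 0.0439, 3: -0.1894, 4: 0.5273, 5: -0.0227, 6: -0.2227,
    7: 0.0273, 8: 0.2773, 9: -0.1394, 10: -0.0894, 11: -0.1061,
}

# Lab mean = grand mean + lab effect (balanced design, -1/0/+1 coding).
lab_mean = {lab: grand_mean + eff for lab, eff in lab_effect.items()}

values = list(lab_mean.values())
center = statistics.mean(values)
spread = statistics.stdev(values)  # sample standard deviation of the lab means

# Standardized distance of each lab mean from the average lab mean.
score = {lab: (m - center) / spread for lab, m in lab_mean.items()}

# List labs from highest to lowest mean; the labs at the top are the ones
# that stand out in the normal plot.
ranked = sorted(score, key=score.get, reverse=True)
for lab in ranked:
    print(f"lab {lab:2d}: mean = {lab_mean[lab]:.4f}, score = {score[lab]:+.2f}")

most_extreme = max(score, key=lambda lab: abs(score[lab]))
print("most extreme lab:", most_extreme)
```

With these inputs the two highest means belong to labs 4 and 8, which matches
the two labs that sit above the rest in the comments above. Whether lab 4
should actually be excluded remains a judgement call that the score alone
cannot settle.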