Solution file for Exercise 13.5 (GO)
------------------------------------

Data: measurements of responses of 16 disk drives produced with 4 different
substrates (A-D), on 4 different days, by 4 different machines and 4 different
operators.

Notation:
  y_i = response (in microvolts times 10^-2) for i'th disk drive, i=1,...,16,
or
  y_ijkl = response (in microvolts times 10^-2) for drive produced with
           substrate i, by operator j, on machine k, on day l,
           i=A,B,C,D; j=1,2,3,4; k=1,2,3,4; l=1,2,3,4.

The design is a 4x4 Graeco-Latin square because the symbols for both substrates
and days occur once in every row and column, and in addition every
(substrate, day) combination occurs exactly once. The machine, operator and day
may be considered as blocking factors, so the design allows us to account for
three blocking factors simultaneously. Note that the sequential and partial
(adjusted) sums of squares in the ANOVA table below are identical; this is a
result of the orthogonality of all factors (allowing them to be assessed
independently).

The statistical model is
  y_i = mu + alpha_substrate(i) + beta_operator(i) + gamma_machine(i)
        + delta_day(i) + eps_i,
or
  y_ijkl = mu + alpha_i + beta_j + gamma_k + delta_l + epsilon_ijkl,
depending on the chosen notation.

MTB > WOpen "H:\VHM\VHM802\Data_csv\ch13ex5.csv";
SUBC>   FType;
SUBC>     CSV;
SUBC>   DecSep;
SUBC>     Period;
SUBC>   Field;
SUBC>     Comma;
SUBC>   TDelimiter;
SUBC>     DoubleQuote.
Retrieving worksheet from file: ‘H:\VHM\VHM802\Data_csv\ch13ex5.csv’
Worksheet was saved on 04/02/2012

MTB > GLM;
SUBC>   Response 'y';
SUBC>   Nodefault;
SUBC>   Categorical 'operator' 'machine' 'day_txt' 'tx_txt';
SUBC>   Terms operator machine C6 C7;
SUBC>   Means operator machine C6 C7;
SUBC>   TExpand;
SUBC>   TMethod;
SUBC>   TAnova;
SUBC>   TSummary;
SUBC>   TCoefficients;
SUBC>   TEquation;
SUBC>   TFactor;
SUBC>   TMeans;
SUBC>   TDiagnostics 0;
SUBC>   Rtype 2;
SUBC>   GFOURPACK.
General Linear Model: y versus operator, machine, day_txt, tx_txt

Method

Factor coding  (-1, 0, +1)

Factor Information

Factor    Type   Levels  Values
operator  Fixed       4  1, 2, 3, 4
machine   Fixed       4  1, 2, 3, 4
day_txt   Fixed       4  alpha, beta, delta, gamma
tx_txt    Fixed       4  A, B, C, D

Analysis of Variance

Source    DF   Seq SS  Contribution   Adj SS  Adj MS  F-Value  P-Value
operator   3   14.000        11.48%   14.000   4.667     0.65    0.633
machine    3   21.500        17.62%   21.500   7.167     1.00    0.500
day_txt    3    3.500         2.87%    3.500   1.167     0.16    0.915
tx_txt     3   61.500        50.41%   61.500  20.500     2.86    0.206
Error      3   21.500        17.62%   21.500   7.167
Total     15  122.000       100.00%

Model Summary

      S    R-sq  R-sq(adj)    PRESS  R-sq(pred)
2.67706  82.38%     11.89%  611.556       0.00%

Coefficients

Term       Coef  SE Coef          95% CI  T-Value  P-Value   VIF
Constant  6.000    0.669  (3.870, 8.130)     8.97    0.003
operator
  1       -0.50     1.16   (-4.19, 3.19)    -0.43    0.695  1.50
  2        1.50     1.16   (-2.19, 5.19)     1.29    0.286  1.50
  3       -1.00     1.16   (-4.69, 2.69)    -0.86    0.452  1.50
machine
  1        1.25     1.16   (-2.44, 4.94)     1.08    0.360  1.50
  2       -1.50     1.16   (-5.19, 2.19)    -1.29    0.286  1.50
  3        1.00     1.16   (-2.69, 4.69)     0.86    0.452  1.50
day_txt
  alpha    0.00     1.16   (-3.69, 3.69)     0.00    1.000  1.50
  beta     0.25     1.16   (-3.44, 3.94)     0.22    0.843  1.50
  delta   -0.75     1.16   (-4.44, 2.94)    -0.65    0.564  1.50
tx_txt
  A       -0.25     1.16   (-3.94, 3.44)    -0.22    0.843  1.50
  B       -0.25     1.16   (-3.94, 3.44)    -0.22    0.843  1.50
  C        3.00     1.16   (-0.69, 6.69)     2.59    0.081  1.50

Regression Equation

y = 6.000 - 0.50 operator_1 + 1.50 operator_2 - 1.00 operator_3 - 0.00 operator_4
    + 1.25 machine_1 - 1.50 machine_2 + 1.00 machine_3 - 0.75 machine_4
    + 0.00 day_txt_alpha + 0.25 day_txt_beta - 0.75 day_txt_delta + 0.50 day_txt_gamma
    - 0.25 tx_txt_A - 0.25 tx_txt_B + 3.00 tx_txt_C - 2.50 tx_txt_D

Means

Term      Fitted Mean  SE Mean
operator
  1              5.50     1.34
  2              7.50     1.34
  3              5.00     1.34
  4              6.00     1.34
machine
  1              7.25     1.34
  2              4.50     1.34
  3              7.00     1.34
  4              5.25     1.34
day_txt
  alpha          6.00     1.34
  beta           6.25     1.34
  delta          5.25     1.34
  gamma          6.50     1.34
tx_txt
  A              5.75     1.34
  B              5.75     1.34
  C              9.00     1.34
  D              3.50     1.34

[Figure: Residual Plots for y]

Comments:
---------

The analysis shows no
significant effect of treatments, nor any significant effects of the blocking
factors. With only 3 degrees of freedom for error, the residuals are of no use
for model checking (only four different values are taken). Note that the power
of the F-tests is also rather low with so few degrees of freedom for error, so
we should perhaps not dismiss the treatment effects as entirely uninteresting.
The means show the biggest difference between C and D, the two types of glass
substrates. It is not clear whether that would be an expected or unexpected
result.

One question is whether to refit the model without the non-significant factors
(excluding the treatment, of course, which is the factor of primary interest).
Usually this is not worthwhile because the partial and sequential sums of
squares coincide (so the sums of squares for the retained factors would be
unchanged). However, with only 3 degrees of freedom for error, one potential
and important advantage could be to increase the degrees of freedom for error
(pooling). Because the non-significant effects here are all quite small,
pooling will also decrease the estimated error variance, which in turn gives
larger F-statistics for the remaining factors. We explore the effects of
pooling all blocking effects into error, thereby effectively ignoring the
statistical design completely (and reducing the analysis to a one-way ANOVA).

MTB > OneWay;
SUBC>   Response 'y';
SUBC>   Categorical 'tx_txt';
SUBC>   IType 0;
SUBC>   Fisher 5;
SUBC>   TGrouping;
SUBC>   TMTest;
SUBC>   GIntPlot;
SUBC>   GFourpack;
SUBC>   TExpand;
SUBC>   TMethod;
SUBC>   TFactor;
SUBC>   TANOVA;
SUBC>   TSummary;
SUBC>   TMeans;
SUBC>   Nodefault.

One-way ANOVA: y versus tx_txt

Method

Null hypothesis         All means are equal
Alternative hypothesis  At least one mean is different
Significance level      α = 0.05

Equal variances were assumed for the analysis.
Factor Information

Factor  Levels  Values
tx_txt       4  A, B, C, D

Analysis of Variance

Source  DF  Seq SS  Contribution  Adj SS  Adj MS  F-Value  P-Value
tx_txt   3   61.50        50.41%   61.50  20.500     4.07    0.033
Error   12   60.50        49.59%   60.50   5.042
Total   15  122.00       100.00%

Model Summary

      S    R-sq  R-sq(adj)    PRESS  R-sq(pred)
2.24537  50.41%     38.01%  107.556      11.84%

Means

tx_txt  N   Mean  StDev           95% CI
A       4   5.75   2.22  ( 3.30,   8.20)
B       4   5.75   3.30  ( 3.30,   8.20)
C       4  9.000  1.633  (6.554, 11.446)
D       4  3.500  1.291  (1.054,  5.946)

Pooled StDev = 2.24537

Fisher Pairwise Comparisons

Grouping Information Using the Fisher LSD Method and 95% Confidence

tx_txt  N   Mean  Grouping
C       4  9.000  A
B       4   5.75  A  B
A       4   5.75  A  B
D       4  3.500     B

Means that do not share a letter are significantly different.

Fisher Individual Tests for Differences of Means

Difference  Difference       SE of                            Adjusted
of Levels     of Means  Difference           95% CI  T-Value   P-Value
B - A             0.00        1.59   (-3.46,  3.46)     0.00     1.000
C - A             3.25        1.59   (-0.21,  6.71)     2.05     0.063
D - A            -2.25        1.59   (-5.71,  1.21)    -1.42     0.182
C - B             3.25        1.59   (-0.21,  6.71)     2.05     0.063
D - B            -2.25        1.59   (-5.71,  1.21)    -1.42     0.182
D - C            -5.50        1.59   (-8.96, -2.04)    -3.46     0.005

Simultaneous confidence level = 81.57%

[Figure: Interval Plot of y vs tx_txt]

[Figure: Residual Plots for y]

Comments:
---------

The change in results is remarkable, with the treatment factor now becoming
significant. The only significant pairwise comparison between treatments is
between C and D (the two glass substrates), and this comparison would also be
significant after Bonferroni adjustment (i.e., multiplying all P-values by 6).
The residual plots look good.

It is essentially impossible to determine from the data whether the analysis
without the blocking factors is valid. The best conclusion is perhaps that the
study shows an indication of treatment differences and suggests further
investigation of treatment effects, preferably in an experiment with greater
power than the 4x4 Graeco-Latin square design.
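
The pooling arithmetic and the Bonferroni adjustment discussed in the comments
can be checked by hand from the numbers already reported in the two ANOVA
tables. The following is a minimal Python sketch of that check; it is not part
of the Minitab session, the variable and function names are illustrative
assumptions, and it works only from the tabulated sums of squares and P-values
(the raw data are not used).

```python
# Sums of squares and degrees of freedom copied from the Graeco-Latin square
# ANOVA table (GLM output). Names below are illustrative, not from Minitab.
glm_ss = {"operator": 14.0, "machine": 21.5, "day_txt": 3.5,
          "tx_txt": 61.5, "Error": 21.5}
glm_df = {"operator": 3, "machine": 3, "day_txt": 3, "tx_txt": 3, "Error": 3}


def f_ratio(ss_effect, df_effect, ss_error, df_error):
    """Return (error mean square, F statistic) for a single effect."""
    ms_error = ss_error / df_error
    return ms_error, (ss_effect / df_effect) / ms_error


# Full model: treatment tested against the 3-df residual error.
ms_full, f_full = f_ratio(glm_ss["tx_txt"], glm_df["tx_txt"],
                          glm_ss["Error"], glm_df["Error"])

# Pooled model: the three blocking factors are folded into the error term,
# which is what the one-way ANOVA does implicitly.
pooled_terms = ["operator", "machine", "day_txt", "Error"]
ss_pooled = sum(glm_ss[t] for t in pooled_terms)
df_pooled = sum(glm_df[t] for t in pooled_terms)
ms_pooled, f_pooled = f_ratio(glm_ss["tx_txt"], glm_df["tx_txt"],
                              ss_pooled, df_pooled)

# Bonferroni adjustment: multiply each unadjusted Fisher P-value by the number
# of pairwise comparisons (6), capping the result at 1.
fisher_p = {"B - A": 1.000, "C - A": 0.063, "D - A": 0.182,
            "C - B": 0.063, "D - B": 0.182, "D - C": 0.005}
bonferroni_p = {pair: min(1.0, p * len(fisher_p))
                for pair, p in fisher_p.items()}

print(f"Full model:   error MS = {ms_full:.3f} on {glm_df['Error']} df, "
      f"F = {f_full:.2f}")
print(f"Pooled model: error SS = {ss_pooled:.2f}, "
      f"error MS = {ms_pooled:.3f} on {df_pooled} df, F = {f_pooled:.2f}")
for pair, p in bonferroni_p.items():
    print(f"{pair}: Bonferroni-adjusted P = {p:.3f}")
```

The pooled error sum of squares (60.5 on 12 degrees of freedom) and the
resulting F-ratio of about 4.07 agree with the one-way ANOVA table, and the
adjusted P-value for D - C is 6 x 0.005 = 0.03, which remains below 0.05.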