Solution file for GO Problems 3.2, 4.2 and 5.1
----------------------------------------------

Data: measurements of longevity (in days) of male fruit flies subjected to
5 different reproduction conditions. A total of 125 male fruit flies were
randomly distributed into 5 groups that differed in their exposure to females.
Group 1 had no exposure to females; Groups 2 and 4 were exposed daily to 1 and
8 pregnant (therefore unreceptive) females, respectively; Groups 3 and 5 were
exposed daily to 1 and 8 virgin (therefore receptive) females, respectively.

The data constitute 5 independent samples with a continuous outcome (longevity
was recorded in whole days, but this discretisation should not be serious
because the data span a wide range of days), and the model immediately
suggested is a one-way ANOVA. The experiment constitutes a completely
randomized design with 5 groups.

Problem 3.2:
------------

We compute per-group descriptive summaries and run the one-way ANOVA,
including checks of the assumptions of normality and equal standard
deviations in the groups.

MTB > WOpen "H:\VHM\VHM802\Data_csv\ch03pr2.csv";
SUBC>   FType;
SUBC>     CSV;
SUBC>   DecSep;
SUBC>     Period;
SUBC>   Field;
SUBC>     Comma;
SUBC>   TDelimiter;
SUBC>     DoubleQuote.
Retrieving worksheet from file: ‘H:\VHM\VHM802\Data_csv\ch03pr2.csv’
Worksheet was saved on 14/02/2011

MTB > GSummary 'longev';
SUBC>   By 'compan'.

Results for compan = 1 pregnant
Summary Report for longev (compan = 1 pregnant)

Results for compan = 1 virgin
Summary Report for longev (compan = 1 virgin)

Results for compan = 8 pregnant
Summary Report for longev (compan = 8 pregnant)

Results for compan = 8 virgin
Summary Report for longev (compan = 8 virgin)

Results for compan = none
Summary Report for longev (compan = none)

MTB > OneWay;
SUBC>   Response 'longev';
SUBC>   Categorical 'compan';
SUBC>   IType 0;
SUBC>   GMCI;
SUBC>   GIntPlot;
SUBC>   GFourpack;
SUBC>   TMethod;
SUBC>   TFactor;
SUBC>   TANOVA;
SUBC>   TSummary;
SUBC>   TMeans;
SUBC>   Nodefault.

One-way ANOVA: longev versus compan

Method

Null hypothesis         All means are equal
Alternative hypothesis  At least one mean is different
Significance level      α = 0.05

Equal variances were assumed for the analysis.

Factor Information

Factor  Levels  Values
compan       5  1 pregnant, 1 virgin, 8 pregnant, 8 virgin, none

Analysis of Variance

Source   DF  Adj SS  Adj MS  F-Value  P-Value
compan    4   11939  2984.8    13.61    0.000
Error   120   26314   219.3
Total   124   38253

Model Summary

      S    R-sq  R-sq(adj)  R-sq(pred)
14.8081  31.21%     28.92%      25.36%

Means

compan       N   Mean  StDev      95% CI
1 pregnant  25  63.56  16.45  (57.70, 69.42)
1 virgin    25  64.80  15.65  (58.94, 70.66)
8 pregnant  25  56.76  14.93  (50.90, 62.62)
8 virgin    25  38.72  12.10  (32.86, 44.58)
none        25  63.36  14.54  (57.50, 69.22)

Pooled StDev = 14.8081

Interval Plot of longev vs compan

Residual Plots for longev

Comments:
---------

We first note that no problems with the model assumptions could be found.
The within-group distributions look fairly normal (and all normality tests
are non-significant). The residual plots look very nice for a dataset of this
size, and the standard deviations are quite close.

The ANOVA table shows a strongly significant difference between groups,
despite a fairly low R^2 value. The estimated means and the graphical
representation of confidence intervals suggest that group 5 (8 virgins)
differs significantly from all other groups, which appear close to one
another. The non-overlapping confidence intervals with group 5 show that
t-tests unadjusted for multiple comparisons would all be significant when
comparing group 5 to the other groups. The almost totally overlapping
intervals of groups 1-3 (where each estimate lies inside the other groups'
intervals) show that there is no significant difference between these groups.

Problem 4.2:
------------

A set of orthogonal contrasts can be set up in many ways; the following
seemed the most natural to me based on the description of the groups.
For comparison, the strongest (simple) contrast in the data, between group 5
and the others, is also included although it is not part of the orthogonal
set.

Contrast     Interpretation         Coefficients
--------------------------------------------------
company      contact to females      4  -1  -1  -1  -1
receptive    pregnant vs virgins     0   1  -1   1  -1
# pregnant   1 vs 8 pregnant         0   1   0  -1   0
# virgins    1 vs 8 virgins          0   0   1   0  -1
group 5      group 5 vs others      -1  -1  -1  -1   4

The first four contrasts are pairwise orthogonal, because for any pair of
them the sum of products of coefficients is zero. For example, for contrasts
1 and 2:
   4*0 + (-1)*1 + (-1)*(-1) + (-1)*1 + (-1)*(-1) = 0
or for contrasts 2 and 3:
   0*0 + 1*1 + (-1)*0 + 1*(-1) + (-1)*0 = 0.

Contrast     Estimate    SE        SS   SS(%)      t    P(t)  F(Schef)  P(Schef)
-------------------------------------------------------------------------------
company          29.6  13.2    1095.2     9.2   2.23   0.027     1.249     0.294
receptive        16.8  5.92    1764.0    14.8   2.84   0.005     2.011     0.097
# pregnant        6.8  4.19     578.0     4.8   1.62   0.107     0.659     0.621
# virgins        26.1  4.19    8502.1    71.2   6.23   0.000     9.693     0.000
group 5         -93.6  13.2   10951.2    91.7  -7.07   0.000     12.49     0.000
-------------------------------------------------------------------------------
formulae: SS = (estimate^2)/[(w_1^2+...+w_5^2)/25]  (or t^2*MSE)
          SS (%) = 100 * SS / SSTrT  (SSTrT=11939.28)
          t = Est/SE = sqrt(SS/MSE)  (MSE=219.28)
          P(t) ~ t(120)
          F(Scheffe) = SS/4/MSE  (or t^2/4)
          P(Scheffe) ~ F(4,120)

If the contrasts were pre-planned, the P-values from the t-test could be used
without adjustment (unless there was concern about carrying out 4 tests).
This would make 3 out of the first 4 contrasts significant. However, it is
clear from the results that these contrasts do not represent the real effect
well: the SS of the four orthogonal contrasts add up to the treatment SS, and
71% of it falls on the single contrast that involves group 5. The last
contrast, developed by inspecting the data (means), reflects the pattern in
the data and is strongly significant even with the Scheffe test. This is
probably the best contrast representation of the difference between groups.

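The entries of the contrast table can be recomputed from the group means
using the formulae listed under it. The following is only a sketch in Python
(not part of the Minitab session); it assumes the group order 1..5 = none,
1 pregnant, 1 virgin, 8 pregnant, 8 virgin, takes the means from the Means
table of Problem 3.2 and MSE=219.28 from the formulae, and all variable and
function names are made up for illustration. P-values are left out because
they need t and F distribution functions that the standard library lacks.

```python
import math
from itertools import combinations

# Group means in group order 1..5 (assumed order: none, 1 pregnant,
# 1 virgin, 8 pregnant, 8 virgin), from the Means table of Problem 3.2.
means = [63.36, 63.56, 64.80, 56.76, 38.72]
n_per_group = 25       # flies per group
mse = 219.28           # error mean square on 120 df
num_groups = 5

contrasts = {
    "company":    [4, -1, -1, -1, -1],
    "receptive":  [0, 1, -1, 1, -1],
    "# pregnant": [0, 1, 0, -1, 0],
    "# virgins":  [0, 0, 1, 0, -1],
    "group 5":    [-1, -1, -1, -1, 4],
}
orthogonal_set = ["company", "receptive", "# pregnant", "# virgins"]


def dot(u, v):
    """Sum of products of two coefficient vectors."""
    return sum(a * b for a, b in zip(u, v))


def contrast_stats(w):
    """Return (estimate, SE, SS, t, Scheffe F) for contrast weights w."""
    estimate = dot(w, means)
    scale = dot(w, w) / n_per_group          # (w_1^2+...+w_5^2)/25
    se = math.sqrt(mse * scale)
    ss = estimate ** 2 / scale
    t = estimate / se
    f_scheffe = ss / (num_groups - 1) / mse  # equals t^2/4
    return estimate, se, ss, t, f_scheffe


results = {name: contrast_stats(w) for name, w in contrasts.items()}

# Pairwise orthogonality of the first four contrasts.
all_orthogonal = all(
    dot(contrasts[a], contrasts[b]) == 0
    for a, b in combinations(orthogonal_set, 2)
)

# For an orthogonal set of 4 contrasts the SS add up to the treatment SS.
ss_total = sum(results[name][2] for name in orthogonal_set)

for name, (est, se, ss, t, f) in results.items():
    print(f"{name:<11} est={est:7.2f} se={se:6.2f} ss={ss:9.1f} "
          f"t={t:6.2f} F(Schef)={f:6.3f}")
print("orthogonal:", all_orthogonal, " sum of SS:", round(ss_total, 2))
```

The sum of SS over the orthogonal set should agree with SSTrT up to rounding
of the means, which is a convenient check that the coefficients were entered
correctly.
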
Problem 5.1:
------------

We rerun the model as a General Linear Model to get easy access to
Bonferroni corrected multiple comparisons.

MTB > GLM;
SUBC>   Response 'longev';
SUBC>   Nodefault;
SUBC>   Categorical 'compan';
SUBC>   Terms compan;
SUBC>   TMethod;
SUBC>   TAnova;
SUBC>   TSummary;
SUBC>   TCoefficients;
SUBC>   TEquation;
SUBC>   TFactor;
SUBC>   TDiagnostics 0.

General Linear Model: longev versus compan

...

Analysis of Variance

Source   DF  Adj SS  Adj MS  F-Value  P-Value
compan    4   11939  2984.8    13.61    0.000
Error   120   26314   219.3
Total   124   38253

...

MTB > Compare 'longev';
SUBC>   Pairwise compan;
SUBC>   Bonferroni;
SUBC>   GIntPlot;
SUBC>   NoDefault;
SUBC>   TGrouping;
SUBC>   TMTest.

Comparisons for longev

Bonferroni Pairwise Comparisons: Response = longev, Term = compan

Grouping Information Using the Bonferroni Method and 95% Confidence

compan       N   Mean  Grouping
1 virgin    25  64.80  A
1 pregnant  25  63.56  A
none        25  63.36  A
8 pregnant  25  56.76  A
8 virgin    25  38.72    B

Means that do not share a letter are significantly different.

Bonferroni Simultaneous Tests for Differences of Means

Difference of            Difference       SE of      Simultaneous              Adjusted
compan Levels              of Means  Difference         95% CI        T-Value   P-Value
1 virgin - 1 pregnant          1.24        4.19  (-10.74,  13.22)        0.30     1.000
8 pregnant - 1 pregnant       -6.80        4.19  (-18.78,   5.18)       -1.62     1.000
8 virgin - 1 pregnant        -24.84        4.19  (-36.82, -12.86)       -5.93     0.000
none - 1 pregnant             -0.20        4.19  (-12.18,  11.78)       -0.05     1.000
8 pregnant - 1 virgin         -8.04        4.19  (-20.02,   3.94)       -1.92     0.573
8 virgin - 1 virgin          -26.08        4.19  (-38.06, -14.10)       -6.23     0.000
none - 1 virgin               -1.44        4.19  (-13.42,  10.54)       -0.34     1.000
8 virgin - 8 pregnant        -18.04        4.19  (-30.02,  -6.06)       -4.31     0.000
none - 8 pregnant              6.60        4.19  ( -5.38,  18.58)        1.58     1.000
none - 8 virgin               24.64        4.19  ( 12.66,  36.62)        5.88     0.000

Individual confidence level = 99.50%

Bonferroni Simultaneous 95% CIs

Comments for Bonferroni method:
-------------------------------

The results from the multiple comparisons are very clear: group 5 differs
significantly from all other groups, whereas the differences among the other
four groups are nowhere near significant. Thus the letter coding would be:
   5b 4a 1a 2a 3a

Holm method:
------------

We start by rearranging the list above in increasing order of P-values (or
equivalently from more to less extreme t-values):

        Difference       SE of            Adjusted
group     of Means  Difference  T-Value    P-Value
5 - 3       -26.08       4.188   -6.227     0.0000
5 - 2       -24.84       4.188   -5.931     0.0000
5 - 1       -24.64       4.188   -5.883     0.0000
5 - 4       -18.04       4.188   -4.307     0.0003
4 - 3        -8.04       4.188   -1.920     0.5728
4 - 2        -6.80       4.188   -1.624     1.0000
4 - 1        -6.60       4.188   -1.576     1.0000
3 - 1         1.44       4.188    0.344     1.0000
3 - 2         1.24       4.188    0.296     1.0000
2 - 1         0.20       4.188    0.048     1.0000

With a total of 10 comparisons, the Bonferroni-adjusted P-values above have
all been multiplied by 10 (and capped at 1). The Holm method will retrieve
the original P-values and then multiply those by 10,9,8,...,1 from top to
bottom row.
The first four rows will still have significant P-values with the Holm adjustment because the P-values will be smaller (except for the first row where the P-value is the same) than those listed above, and they were already significant with the Bonferroni method. For row five we get the Holm adjusted P-value as (0.5728/10)*6=0.34. This means that the comparison in row 5 is non-significant and hence all subsequent rows are non-significant as well by the Holm method. The significant comparisons are therefore the same as for the Bonferroni method.
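
The Holm calculation can be sketched in Python as follows. This is an
illustration only, not output from Minitab: the t-values are copied from the
ordered table above, the unadjusted P-values are recomputed from t(120) by a
simple numerical integration of the t density (an assumption made to avoid
external libraries), and all function and variable names are made up here.

```python
import math


def t_two_sided_p(t, df, steps=4000):
    """Two-sided P-value for a t statistic via Simpson integration.

    Integrates the t density from 0 to |t| and returns 1 - 2*area.
    `steps` must be even.
    """
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))

    def pdf(x):
        return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

    upper = abs(t)
    if upper == 0.0:
        return 1.0
    h = upper / steps
    total = pdf(0.0) + pdf(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    central = total * h / 3          # P(0 < T < |t|)
    return max(0.0, 1.0 - 2.0 * central)


def bonferroni_adjust(pvals):
    """Multiply every P-value by the number of tests, capped at 1."""
    m = len(pvals)
    return [min(1.0, m * p) for p in pvals]


def holm_adjust(pvals):
    """Holm step-down: multiply sorted P-values by m, m-1, ..., 1.

    A running maximum keeps the adjusted values non-decreasing, so once a
    comparison is non-significant all later ones are as well.
    """
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        running_max = max(running_max, min(1.0, (m - rank) * pvals[i]))
        adjusted[i] = running_max
    return adjusted


# Comparisons and t-values in the order of the Holm table above.
labels = ["5-3", "5-2", "5-1", "5-4", "4-3", "4-2", "4-1", "3-1", "3-2", "2-1"]
t_values = [-6.227, -5.931, -5.883, -4.307, -1.920,
            -1.624, -1.576, 0.344, 0.296, 0.048]
df_error = 120

raw = [t_two_sided_p(t, df_error) for t in t_values]
bonf = bonferroni_adjust(raw)
holm = holm_adjust(raw)

for lab, p, pb, ph in zip(labels, raw, bonf, holm):
    print(f"{lab}  raw={p:.5f}  Bonferroni={pb:.4f}  Holm={ph:.4f}")

significant = [lab for lab, ph in zip(labels, holm) if ph < 0.05]
print("Holm-significant comparisons:", significant)
```

For row 5 this should give a Holm-adjusted value close to the 0.34 derived
above, and the list of significant comparisons should contain only the four
comparisons involving group 5.
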