Supplementary Exercise 2.59 of IPS7e
------------------------------------

Data on reaction time (with computer mouse) and distance on computer screen for 40 trials carried out by one subject, 20 with each hand. The reaction time is the response variable, whereas the distances between successive points on the screen are controlled in the trial (determined by the software). The hand used is an explanatory variable.

(a) Minitab commands for requested plot:

MTB > WOpen "H:\VHM\VHM801\Datasets\Minitab\Chapter 2\ex02_059.mtw".
Retrieving worksheet from file: 'H:\VHM\VHM801\Datasets\Minitab\Chapter 2\ex02_059.mtw'
Worksheet was saved on 07/11/2014
MTB > Plot 'time'*'dist';
SUBC>   Symbol 'hand'.

Scatterplot of time vs dist

(b) Interpretation of plot:

- The right-hand points lie below the left-hand points. This means that the right-hand times are shorter, so the subject is right-handed.
- The left-hand points show a wide scatter with perhaps a slight tendency towards larger reaction times with larger distances.
- The right-hand points are squeezed together at the bottom of the plot, so it is difficult to see any patterns; however, there does not seem to be any clear dependence of reaction times on distance.

(c) Minitab command for plot with two regressions (within same worksheet):

MTB > Plot 'time'*'dist';
SUBC>   Symbol 'hand';
SUBC>   Regress 'hand'.

Scatterplot of time vs dist

It is seen that the slopes are quite different, steeper for the left hand. The equations of the two regression lines can be read off the plot by moving the pointer to each of the lines. Realistically it is more practical to do separate analyses for the left-hand and right-hand data, and for this we will split the Minitab worksheet by the variable hand.

MTB > Split;
SUBC>   NoMatrices;
SUBC>   NoConstants;
SUBC>   By 'hand'.

Results for hand = left
Results for hand = right

Results for: ex02_059.mtw(hand = right)

MTB > Fitline 'time' 'dist';
SUBC>   GFourpack;
SUBC>   RType 1;
SUBC>   Confidence 95.0.
Regression Analysis: time versus dist

The regression equation is
time = 99.36 + 0.02831 dist

S = 8.06850   R-Sq = 9.3%   R-Sq(adj) = 4.2%

Analysis of Variance

Source      DF       SS       MS     F      P
Regression   1   119.94  119.939  1.84  0.191
Error       18  1171.81   65.101
Total       19  1291.75

Fitted Line: time versus dist

Results for: ex02_059.mtw(hand = left)

MTB > Fitline 'time' 'dist';
SUBC>   GFourpack;
SUBC>   RType 1;
SUBC>   Confidence 95.0.

Regression Analysis: time versus dist

The regression equation is
time = 171.5 + 0.2619 dist

S = 71.1125   R-Sq = 10.1%   R-Sq(adj) = 5.1%

Analysis of Variance

Source      DF      SS       MS     F      P
Regression   1   10266  10266.1  2.03  0.171
Error       18   91026   5057.0
Total       19  101292

Fitted Line: time versus dist

Comments:
---------

The statistical models (for left-hand and right-hand data) are linear regression models:

   time_i = beta0 + beta1 * dist_i + eps_i

where the errors (eps_i) are i.i.d. from N(0,sigma), and the model parameters (beta0, beta1, sigma) are different for left and right hands.

The fitted line plots for the two subsets look quite similar when each is viewed on its own scale: the points are scattered widely around a line with a slightly increasing slope. Neither of the separate regressions corresponds to a significant association between distance and time (P=0.17 for the left hand and P=0.19 for the right hand). The following table summarizes the results:

hand    intercept   slope    stand.dev.   R^2
left      171.5     0.262      71.1       10.1%
right      99.4     0.0283      8.07       9.3%

The estimated line for the left hand is steeper and the points have a much larger spread about the line, both by a factor of about 9 relative to the right hand. The intercept, which corresponds to the reaction time for a point that requires no movement of the mouse, is about 1.7 times as large for the left-hand line as for the right-hand line.
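
As a check on the summary table, S, R-Sq and F can be recomputed from the sums of squares in the two ANOVA tables, and the left/right ratios can be worked out from the fitted coefficients. The Python sketch below is not part of the Minitab session; the names anova, fits and summarize are made up for illustration, and all numbers are copied from the output above.

```python
import math

# Sums of squares and error degrees of freedom copied from the Minitab ANOVA tables.
anova = {
    "right": {"ss_reg": 119.94, "ss_err": 1171.81, "df_err": 18},
    "left": {"ss_reg": 10266.0, "ss_err": 91026.0, "df_err": 18},
}

# Intercepts and slopes copied from the two fitted regression equations.
fits = {
    "right": {"intercept": 99.36, "slope": 0.02831},
    "left": {"intercept": 171.5, "slope": 0.2619},
}


def summarize(ss_reg, ss_err, df_err):
    """Recompute R-Sq, S and F for a simple linear regression (1 regression df)."""
    ss_total = ss_reg + ss_err
    mse = ss_err / df_err              # mean square error
    return {
        "r_sq": ss_reg / ss_total,     # proportion of variance explained
        "s": math.sqrt(mse),           # standard deviation about the line
        "f": ss_reg / mse,             # F statistic for the test of no association
    }


summary = {hand: summarize(**row) for hand, row in anova.items()}

# Left-hand values relative to right-hand values.
slope_ratio = fits["left"]["slope"] / fits["right"]["slope"]
spread_ratio = summary["left"]["s"] / summary["right"]["s"]
intercept_ratio = fits["left"]["intercept"] / fits["right"]["intercept"]

for hand, row in summary.items():
    print(f"{hand:5s}  R-Sq = {100 * row['r_sq']:.1f}%  S = {row['s']:.2f}  F = {row['f']:.2f}")
print(f"slope ratio = {slope_ratio:.2f}, spread ratio = {spread_ratio:.2f}, "
      f"intercept ratio = {intercept_ratio:.2f}")
```

The recomputed values agree with the Minitab output up to the rounding of the printed sums of squares.
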

The two regressions have similar strength of association when assessed by the P-value for the test for no association (this comparison makes sense because the two subsets are of the same size) and also by the R^2, the proportion of variance explained (the numerical measure requested). To describe the R^2 as a measure of the "success of the regression" is in my view poor terminology. Another potential measure of the predictive ability of the regression is the spread about the regression line, but this measure is difficult to compare between the two datasets because the left-hand values by themselves are far more variable.

(d) The 4-in-1 panels of residual plots produced in (c) include, in the lower right corner, a plot of the residuals against observation order. No obvious increasing or decreasing trends are seen in these plots, for either the left or the right hand. We therefore conclude that none of the suggested systematic trends are visible in the data.
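
A footnote to the comparison of P-values and R^2 in (c): in a simple linear regression with n observations, F = (n - 2) * R^2 / (1 - R^2), so for a fixed n the F statistic (and hence the P-value) is determined by R^2 alone. This is why the two measures rank the two equally sized subsets in the same way. The Python sketch below illustrates the identity with the numbers from the ANOVA tables; the function name f_from_r_sq is made up for illustration and is not part of the Minitab analysis.

```python
def f_from_r_sq(r_sq, n):
    """F statistic of a simple linear regression, computed from R^2 and sample size n."""
    return (n - 2) * r_sq / (1.0 - r_sq)


n = 20  # trials per hand

# R^2 = SS(Regression) / SS(Total), using the values printed in the ANOVA tables.
r_sq_right = 119.94 / 1291.75
r_sq_left = 10266 / 101292

f_right = f_from_r_sq(r_sq_right, n)
f_left = f_from_r_sq(r_sq_left, n)

print(f"right: R^2 = {r_sq_right:.3f}, F = {f_right:.2f}")
print(f"left:  R^2 = {r_sq_left:.3f}, F = {f_left:.2f}")
```

With unequal subset sizes the factor (n - 2) would differ between the subsets, and the P-values would no longer be comparable in this way.
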