Supplementary Exercises 7.91 and 7.93 of IPS7e
----------------------------------------------

Data: Voice onset time (VOT) for 6-year-old children and adults when
pronouncing the word "bees". The data include 10 children and 20 adults.

Model: the 2 samples are independent and each a simple random sample 
(i.i.d. sample) from a distribution with unknown mean and standard
deviation (mu1 and sigma1 for the children, mu2 and sigma2 for the adults).

Estimates:
                 children    adults
n                     10         20
sample mean        -3.67     -23.17
sample s           33.89      50.74
standard error     10.72      11.35 , computed as s/sqrt(n)
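
As a quick check of the last row of the table, the standard errors can
be recomputed from n and s. This is only a sketch in Python (the
exercise itself uses Minitab); the dictionary and variable names are
chosen here for illustration.

```python
import math

# Summary statistics from the table above: group -> (n, sample mean, sample s).
groups = {
    "children": (10, -3.67, 33.89),
    "adults": (20, -23.17, 50.74),
}

# Standard error of each sample mean: s / sqrt(n).
se = {name: s / math.sqrt(n) for name, (n, mean, s) in groups.items()}

for name, value in se.items():
    print(f"{name:8s} SE = {value:.2f}")
```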


7.91:
-----
(a)
The SEs for children and adults are given above. All calculations in
this exercise, including the standard error for the difference between
the VOT means for children and adults, are done without assuming equal
variances; see 7.93 for a discussion of that assumption.

SE(mean diff.) = sqrt(s1^2/n1 + s2^2/n2) = sqrt(33.89^2/10 + 50.74^2/20)
               = sqrt(243.58) = 15.61
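
The same arithmetic can be sketched in Python, using the unrounded
values of s and n from the table of estimates. The variable names are
illustrative only.

```python
import math

n1, s1 = 10, 33.89   # children
n2, s2 = 20, 50.74   # adults

# Unpooled (unequal-variance) SE of the difference between two sample means.
var_sum = s1**2 / n1 + s2**2 / n2
se_diff = math.sqrt(var_sum)

print(f"s1^2/n1 + s2^2/n2 = {var_sum:.2f}")
print(f"SE(mean diff.)    = {se_diff:.2f}")
```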

(b)
The null hypothesis is H0: mu1=mu2. Nothing in the description of the
data suggests a one-sided alternative, so we use the two-sided
Ha: mu1<>mu2. The test statistic is computed using Minitab.

MTB > TwoT 10 -3.67 33.89 20 -23.17 50.74;
SUBC>   Confidence 95.0;
SUBC>   Test 0.0;
SUBC>   Alternative 0.
Two-Sample T-Test and CI 
                            SE
Sample   N   Mean  StDev  Mean
1       10   -3.7   33.9    11
2       20  -23.2   50.7    11

Difference = mu (1) - mu (2)
Estimate for difference:  19.5
95% CI for difference:  (-12.6, 51.6)
T-Test of difference = 0 (vs not =): T-Value = 1.25  P-Value = 0.223  DF = 25

Comments:
---------
The test statistic t = 1.25 with 25 df gives a two-sided P-value of
0.223, which is far from significant at the 5% level. The data give no
evidence that the VOT means for children and adults differ.
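
For readers without Minitab, the T-Value, DF and P-Value can be
approximated from the summary statistics alone. The sketch below uses
only the Python standard library. The helper functions t_pdf and
two_sided_p are written here for illustration (they are not part of any
package), and the P-value is obtained by numerical integration of the
t density. Truncating the approximate df to an integer is an assumption
made to match DF = 25 in the listing.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def two_sided_p(t, df, steps=2000):
    """Two-sided P-value: 1 - 2 * integral of the density from 0 to |t|.

    The integral is approximated with Simpson's rule (steps must be even).
    """
    b = abs(t)
    h = b / steps
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 1 - 2 * (total * h / 3)

n1, m1, s1 = 10, -3.67, 33.89    # children
n2, m2, s2 = 20, -23.17, 50.74   # adults

v1, v2 = s1**2 / n1, s2**2 / n2
t_stat = (m1 - m2) / math.sqrt(v1 + v2)

# Approximate (Satterthwaite) df for the unpooled procedure,
# truncated to an integer (assumption, see the text above).
df_exact = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
df = math.floor(df_exact)

p_value = two_sided_p(t_stat, df)
print(f"T = {t_stat:.2f}, df = {df} (untruncated {df_exact:.2f}), P = {p_value:.3f}")
```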

(c)
The Minitab listing also includes a 95% confidence interval for the mean
difference:   mu1-mu2: (-12.6,51.6)
As stated in the question, the non-significant P-value from (b) already
tells us that the CI contains 0. A 95% CI excludes 0 exactly when the
two-sided test of H0: mu1=mu2 is significant at the 5% level, and here
the P-value (0.223) is above 0.05.
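
The link between the interval and the test can be illustrated with a
short Python sketch. The critical value t* = 2.0595 (upper 2.5% point
of the t distribution with 25 df) is taken from a t table and is an
assumption of this sketch, not a value printed in the Minitab listing.

```python
import math

n1, m1, s1 = 10, -3.67, 33.89    # children
n2, m2, s2 = 20, -23.17, 50.74   # adults

diff = m1 - m2
se_diff = math.sqrt(s1**2 / n1 + s2**2 / n2)

# Assumption: tabulated upper 2.5% critical value of t with 25 df.
t_star = 2.0595

lower = diff - t_star * se_diff
upper = diff + t_star * se_diff

# The interval contains 0 exactly when |t| is below the critical value,
# i.e. when the two-sided test is non-significant at the 5% level.
contains_zero = lower < 0 < upper
non_significant = abs(diff / se_diff) < t_star

print(f"95% CI: ({lower:.1f}, {upper:.1f})")
print(f"contains 0: {contains_zero}, |t| < t*: {non_significant}")
```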


7.93:
-----
Because the pooled variance t-test (based on assuming equal variances)
is not part of the VHM 801 course syllabus, we confine ourselves to
showing the Minitab listing and give interpretations.

MTB > TwoT 10 -3.67 33.89 20 -23.17 50.74;
SUBC>   Confidence 95.0;
SUBC>   Test 0.0;
SUBC>   Alternative 0;
SUBC>   Pooled.
Two-Sample T-Test and CI 
                            SE
Sample   N   Mean  StDev  Mean
1       10   -3.7   33.9    11
2       20  -23.2   50.7    11

Difference = mu (1) - mu (2)
Estimate for difference:  19.5
95% CI for difference:  (-17.0, 56.0)
T-Test of difference = 0 (vs not =): T-Value = 1.09  P-Value = 0.283  DF = 28
Both use Pooled StDev = 46.0020

Comments:
---------
The listing gives the pooled standard deviation as s=46.002. It lies
between the two sample standard deviations and is closest to the
standard deviation of the larger sample, because the larger sample has
the higher weight. From this value we can (re)compute the SE for the
mean difference:

SE(mean diff.) = s*sqrt(1/n1 + 1/n2) = 46.002*sqrt(1/10 + 1/20) = 17.82
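
The pooled standard deviation itself is not derived in the listing. A
minimal Python sketch, assuming the usual pooled-variance formula
(sample variances weighted by n-1), reproduces both the pooled s and
the SE; the variable names are illustrative.

```python
import math

n1, s1 = 10, 33.89   # children
n2, s2 = 20, 50.74   # adults

# Pooled variance: average of the two sample variances, weighted by n - 1.
sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)
sp = math.sqrt(sp2)

# SE of the mean difference under the equal-variance assumption.
se_pooled = sp * math.sqrt(1 / n1 + 1 / n2)

print(f"pooled s       = {sp:.3f}")
print(f"SE(mean diff.) = {se_pooled:.2f}")
```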

This value is larger than the 15.61 we computed in 7.91. The reason is
that the larger sample standard deviation came from the larger sample.
Without pooling, that standard deviation is divided by the larger n and
so has less impact on the SE (the effect would be the opposite if the
larger s came from the smaller sample).

Because of the larger SE, the test statistic drops to 1.09 (P = 0.283),
but this does not change the conclusion: the difference is still
non-significant. The pooled df is 28 (=10+20-2) and thus a bit larger
than the approximate df (25) used in Exercise 7.91, but the biggest
difference between the two procedures is the estimated standard error.
Because the two sample standard deviations differ considerably (33.89
versus 50.74), the equal-variance assumption is questionable here, and
the unpooled SE seems the safer estimate.
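
To see the two procedures side by side, the following Python sketch
recomputes the SE and the test statistic for both from the summary
statistics. It is only an illustration; the names are chosen here and
the pooled df is simply n1+n2-2.

```python
import math

n1, m1, s1 = 10, -3.67, 33.89    # children
n2, m2, s2 = 20, -23.17, 50.74   # adults
diff = m1 - m2

# Unpooled procedure (7.91): each variance is divided by its own n.
se_unpooled = math.sqrt(s1**2 / n1 + s2**2 / n2)
t_unpooled = diff / se_unpooled

# Pooled procedure (7.93): one common variance estimate for both groups.
df_pooled = n1 + n2 - 2
sp = math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df_pooled)
se_pooled = sp * math.sqrt(1 / n1 + 1 / n2)
t_pooled = diff / se_pooled

print(f"unpooled: SE = {se_unpooled:.2f}, t = {t_unpooled:.2f}")
print(f"pooled:   SE = {se_pooled:.2f}, t = {t_pooled:.2f}, df = {df_pooled}")
```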