Supplementary Exercises 8.84 and 8.85 of IPS7e
----------------------------------------------

n=100 employees were asked whether work stress has a negative impact on personal life.
X=number of those employees answering yes; we observe X=68.
Assume X follows B(100,p) (binomial setting).

We first compute the estimates:
  sample proportion: p_hat = X/n = 0.68, 
  standard error of p_hat: sqrt(p_hat*(1-p_hat)/n) = 0.04665.
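
The two estimates can be reproduced with a short Python sketch. Python is not
used in the original solution; the variable names below are illustrative only.

```python
import math

n = 100   # number of employees surveyed
x = 68    # number of employees answering yes

p_hat = x / n                                # sample proportion
se_hat = math.sqrt(p_hat * (1 - p_hat) / n)  # standard error of p_hat

print(f"p_hat = {p_hat:.2f}, SE(p_hat) = {se_hat:.5f}")
```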

8.84:
-----
The condition for use of the classical (normal distribution approximation) method
for the confidence interval is clearly satisfied because the data contain
more than 15 positives (68) and more than 15 negatives (32). For illustration,
the other methods for computing a confidence interval are included as well,
although in this situation the classical interval is acceptable.

    classical approximate 95% CI for p: p_hat +- 1.96*0.04665 
                                        0.680 +- 0.0914 = (0.589,0.771)
    plus four approximate method:
      p_tilde = (X+2)/(n+4) = 70/104 = 0.673, 
      SE(p_tilde) = sqrt(p_tilde*(1-p_tilde)/(n+4)) = 0.045998
      95% CI for p: p_tilde +- 1.96*0.045998 = 0.673 +- 0.0902 = (0.583,0.763)

    "exact" 95% CI for p: (0.579,0.770), from Minitab

The CIs are similar but not exactly the same. The plus four CI is not
symmetric around the estimate p_hat (it is symmetric around p_tilde). The
"exact" interval is wider than the other two, by approximately 0.01,
reflecting that it is conservative.
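
For readers without Minitab, the three intervals can be approximated with the
Python sketch below (standard library only). Python is not part of the
original solution, and the function names wald_ci, binom_cdf, bisect_root and
clopper_pearson are illustrative. The "exact" interval is assumed here to be
the Clopper-Pearson interval, found by bisection on the binomial tail
probabilities.

```python
import math
from statistics import NormalDist

def wald_ci(x, n, level=0.95):
    """Normal-approximation CI: estimate +- z* times its standard error."""
    z_star = NormalDist().inv_cdf(1 - (1 - level) / 2)
    p = x / n
    margin = z_star * math.sqrt(p * (1 - p) / n)
    return p - margin, p + margin

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ B(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def bisect_root(f, lo, hi, tol=1e-10):
    """Root of f on [lo, hi], assuming f(lo) and f(hi) differ in sign."""
    f_lo_positive = f(lo) > 0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if (f(mid) > 0) == f_lo_positive:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def clopper_pearson(x, n, level=0.95):
    """Clopper-Pearson interval (assumed to be what Minitab calls exact)."""
    half_alpha = (1 - level) / 2
    # Lower limit solves P(X >= x | p) = alpha/2.
    lower = 0.0 if x == 0 else bisect_root(
        lambda p: (1 - binom_cdf(x - 1, n, p)) - half_alpha, 0.0, 1.0)
    # Upper limit solves P(X <= x | p) = alpha/2.
    upper = 1.0 if x == n else bisect_root(
        lambda p: binom_cdf(x, n, p) - half_alpha, 0.0, 1.0)
    return lower, upper

classical = wald_ci(68, 100)          # classical interval
plus_four = wald_ci(68 + 2, 100 + 4)  # plus four: add 2 successes, 2 failures
exact = clopper_pearson(68, 100)

for name, (lo, hi) in [("classical", classical), ("plus four", plus_four),
                       ("exact", exact)]:
    print(f"{name:10s} ({lo:.3f}, {hi:.3f})  width {hi - lo:.4f}")
```

The sketch uses the unrounded normal quantile (about 1.959964) instead of
1.96, so the last digits can differ slightly from the hand calculation.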

Minitab commands for the confidence intervals:

---

 POne 100 68;
  Confidence 95.0;
  Alternative 0;
  UseZ.

Sample   X    N  Sample p         95% CI
1       68  100  0.680000  (0.588572, 0.771428)

Using the normal approximation.


 POne 104 70;
  Confidence 95.0;
  Alternative 0;
  UseZ.

Sample   X    N  Sample p         95% CI
1       70  104  0.673077  (0.582923, 0.763231)

Using the normal approximation.


 POne 100 68;
  Confidence 95.0;
  Alternative 0.

Sample   X    N  Sample p         95% CI
1       68  100  0.680000  (0.579233, 0.769780)

---


8.85:
-----
The national survey had 75% of respondents answering yes. Because that
survey was large, it may be acceptable to assume there is no error
associated with that estimate, and hence treat it as a fixed value. This
means we will be testing the hypotheses
  H0: p=0.75 and Ha: p<>0.75
based on the single sample. If data from the large survey were
available, it would have been appropriate to use methods for two
independent samples (proportions).

The conditions for use of the classical z-test are met here because 
  100*0.75=75 >= 10, and 100*(1-0.75)=25 >= 10.

The calculation goes as follows:
  z = (0.68-0.75) / sqrt(0.75*0.25/100) = -1.62
  P = 2*P(Z>|z|) = 0.106 (using the unrounded z = -1.617).
We therefore conclude that there is not sufficient evidence to reject
the null hypothesis, and the proportion of stressed employees at the 
restaurant could very well match the nationwide level.
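
The z-test calculation above can be checked with the following Python sketch
(standard library only). Python is not part of the original solution, and the
variable names are illustrative.

```python
import math
from statistics import NormalDist

n, x, p0 = 100, 68, 0.75   # sample size, observed count, hypothesized proportion

# Large-sample condition: expected counts under H0 are both at least 10.
assert n * p0 >= 10 and n * (1 - p0) >= 10

p_hat = x / n
se0 = math.sqrt(p0 * (1 - p0) / n)   # standard error computed under H0
z = (p_hat - p0) / se0

# Two-sided P-value: twice the upper tail area beyond |z|.
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

print(f"z = {z:.2f}, two-sided P = {p_value:.3f}")
```

Note that the standard error uses the hypothesized value 0.75, not p_hat, which
is why it differs from the standard error used for the confidence intervals.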

For illustration purposes, we also compute P-values for an exact test
based on the binomial distribution. This method is generally preferable
and truly exact, but unless the tested proportion equals 0.5 there is no
uniform rule for how to compute two-sided P-values.

The simplest rule is to double the one-sided P-value:
  P = 2*P(X<=68) = 2*0.0693 = 0.139.
Minitab computes the 2-sided P-value by adding probabilities for
outcomes at least as far from the expected count under H0 (75) as the
observed count, on both sides:
  P = P(X<=68) + P(X>=82) = 0.0693+0.0630 = 0.132.
Stata and R compute the 2-sided P-value by adding the probabilities of all
outcomes that are no more likely than the observed count, on both sides:
  P = P(X<=68) + P(X>=83) = 0.0693+0.0376 = 0.107.
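
The three rules can be compared side by side with the Python sketch below
(standard library only). It implements the rules as they are described in this
text; whether each package uses exactly this computation internally is an
assumption. The names pmf, p_doubled, p_distance and p_likelihood are
illustrative.

```python
import math

n, x, p0 = 100, 68, 0.75   # sample size, observed count, hypothesized proportion

def pmf(k):
    """P(X = k) for X ~ B(n, p0)."""
    return math.comb(n, k) * p0**k * (1 - p0)**(n - k)

lower_tail = sum(pmf(k) for k in range(x + 1))   # P(X <= 68)

# Rule 1: double the one-sided P-value (capped at 1).
p_doubled = min(1.0, 2 * lower_tail)

# Rule 2 (attributed to Minitab in the text): outcomes at least as far
# from the expected count n*p0 = 75 as the observed count.
distance = abs(x - n * p0)
p_distance = sum(pmf(k) for k in range(n + 1) if abs(k - n * p0) >= distance)

# Rule 3 (attributed to Stata and R in the text): outcomes no more probable
# than the observed count; the small factor guards against rounding ties.
cutoff = pmf(x) * (1 + 1e-7)
p_likelihood = sum(pmf(k) for k in range(n + 1) if pmf(k) <= cutoff)

print(f"doubled:    {p_doubled:.3f}")
print(f"distance:   {p_distance:.3f}")
print(f"likelihood: {p_likelihood:.3f}")
```

Rules 2 and 3 differ here only in whether the outcome X=82 is included: it is
as far from 75 as X=68, but slightly more probable than X=68 under H0.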

Among the exact P-values, the one from the method used in Stata and R
(which is considered more accurate) agrees well with the z-test (0.107
versus 0.106), as expected with the fairly large sample size.

Minitab commands for the tests:

---

 POne 100 68;
  Test .75;
  Confidence 95.0;
  Alternative 0;
  UseZ.

Test of p = 0.75 vs p not = 0.75

Sample   X    N  Sample p         95% CI         Z-Value  P-Value
1       68  100  0.680000  (0.588572, 0.771428)    -1.62    0.106

Using the normal approximation.


 POne 100 68;
  Test .75;
  Confidence 95.0;
  Alternative 0.

Test of p = 0.75 vs p not = 0.75
                                                   Exact
Sample   X    N  Sample p         95% CI         P-Value
1       68  100  0.680000  (0.579233, 0.769780)    0.132

---