Extra Exercise 21
-----------------
(continuation of Supplementary Exercises 10.33 and 10.7)

Consider the CRP and retinol data from Supplementary Exercise 10.33. In
that exercise we used a linear regression to predict retinol from CRP.
We noted, however, that both variables are responses, so it is also
meaningful to compute a correlation coefficient. The descriptive
analysis of the two variables made it clear that assuming normal
distributions is quite unreasonable. Therefore the nonparametric
Spearman rank correlation is suggested.

Minitab commands and output for Spearman's rank correlation, both by 
using the Stat-Basic-Correlation menu and by manually computing the ranks:

MTB > WOpen "R:\Chapter 10\ex10_033.mtw".
Retrieving worksheet from file: ‘R:\Chapter 10\ex10_033.mtw’
Worksheet was saved on 07/11/2014

MTB > Correlation  'retinol' 'crp'.
Correlation: retinol, crp 

Pearson correlation of retinol and crp = -0.327
P-Value = 0.039

MTB > Correlation  'retinol' 'crp';
SUBC>   Spearman.
Spearman Rho: retinol, crp 

Spearman rho for retinol and crp = -0.348
P-Value = 0.028

MTB > name c3 'rankret'
MTB > Rank 'retinol' 'rankret'.
MTB > name c4 'rankcrp'
MTB > Rank 'crp' 'rankcrp'.
MTB > Correlation 'rankret' 'rankcrp'.
Correlations: rankret, rankcrp 

Pearson correlation of rankret and rankcrp = -0.348
P-Value = 0.028

Comments:
---------
The Spearman rank correlation is -0.348. With n>30 its significance can
be assessed by the usual t-test, for which Minitab gives P=0.028. The
Minitab menu applies this t-test regardless of n, even though it is a
poor approximation for small n. By contrast, the Pearson correlation is
-0.327 with P=0.039.

The two estimates are quite similar, despite the concerns about
violations of the model assumptions noted in Supplementary
Exercise 10.33.
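
The manual route in the session above (rank each column, then take the
Pearson correlation of the ranks) can be sketched in Python as follows.
This is a minimal illustration, not Minitab's implementation: the
function names are hypothetical, tied values are assumed to receive the
average of their ranks, and the data are made-up values rather than the
retinol/CRP worksheet.

```python
from math import sqrt


def average_ranks(values):
    """Rank values 1..n; tied values get the mean of their positions (assumed convention)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Ordinary product-moment correlation."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def spearman(x, y):
    """Spearman correlation computed as the Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Made-up illustrative values, NOT the exercise data.
x = [1.2, 0.8, 2.5, 0.8, 3.1, 1.9]
y = [40.0, 3.0, 1.5, 6.0, 0.9, 2.2]
print(average_ranks(x))
print(spearman(x, y))
```

Because only the ranks enter the calculation, any strictly increasing
transformation of either variable leaves the result unchanged.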

------

Consider now the golf data from Supplementary Exercises 2.2 and 10.7. 

Minitab commands and output:

MTB > WOpen "R:\Chapter 2\ex02_002.mtw".
Retrieving worksheet from file: ‘R:\Chapter 2\ex02_002.mtw’
Worksheet was saved on 15/11/2014

MTB > Correlation  'round1' 'round2'.
Correlation: round1, round2 

Pearson correlation of round1 and round2 = 0.687
P-Value = 0.014

MTB > Correlation  'round1' 'round2';
SUBC>   Spearman.
Spearman Rho: round1, round2 

Spearman rho for round1 and round2 = 0.669
P-Value = 0.017

MTB > name c4 'rank1'
MTB > Rank 'round1' 'rank1'.
MTB > name c5 'rank2'
MTB > Rank 'round2' 'rank2'.
MTB > Correlation  'rank1' 'rank2';
SUBC>   NoPValues.
Correlations: rank1, rank2 

Pearson correlation of rank1 and rank2 = 0.669

Comments:
---------
The Pearson and Spearman correlations are quite similar. With only n=12
observations, the P-value for testing rho=0 based on the Spearman
correlation should be taken from the table of critical values mentioned
in the text (and *not* from the Minitab P-value). The observed r=0.67
for n=12 corresponds to a P-value between 0.02 and 0.05, somewhat higher
than the P=0.014 for the Pearson correlation. This test is less powerful
than the t-test for the Pearson correlation, but it also makes fewer
assumptions.
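
For comparison, the t approximation behind the Minitab P-values can be
reproduced by hand. The sketch below assumes the standard statistic
t = r*sqrt((n-2)/(1-r^2)) on n-2 degrees of freedom; the helper name is
hypothetical, and the correlations and n=12 are taken from the output
above.

```python
from math import sqrt


def t_statistic(r, n):
    """t statistic for testing rho=0 from a correlation r based on n pairs."""
    return r * sqrt((n - 2) / (1 - r * r))


n = 12  # number of players in the golf data
t_pearson = t_statistic(0.687, n)   # Pearson correlation from the output
t_spearman = t_statistic(0.669, n)  # Spearman correlation from the output

# Two-sided 5% critical value of the t distribution with 10 df.
T_CRIT_10DF = 2.228

print(round(t_pearson, 2), round(t_spearman, 2))  # roughly 2.99 and 2.85
print(t_pearson > T_CRIT_10DF, t_spearman > T_CRIT_10DF)
```

Both statistics exceed the critical value, in line with the Minitab
P-values of 0.014 and 0.017. For the Spearman correlation with n this
small, the table remains the recommended reference.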

We explore the impact of the two extreme observations (Players 7 and 8)
by giving the Pearson and Spearman correlations for different subsets of
the data:

dataset                 Pearson corr   Spearman corr
-----------------------------------------------------
full                       0.687          0.669
without player 8           0.842          0.715
without player 7           0.550          0.606
without players 7&8        0.661          0.620

Across the four datasets the Spearman correlation varies less (0.606 to
0.715) than the Pearson correlation (0.550 to 0.842), so it is indeed
less sensitive to the omission of extreme observations.
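
One simple way to put a number on this sensitivity is the spread
(largest minus smallest value) of each coefficient over the four
datasets. The short sketch below only re-uses the correlations from the
table; the spread is an illustrative summary chosen here, not a measure
taken from the text.

```python
# (Pearson, Spearman) correlations copied from the table above.
table = {
    "full": (0.687, 0.669),
    "without player 8": (0.842, 0.715),
    "without player 7": (0.550, 0.606),
    "without players 7&8": (0.661, 0.620),
}

pearson_values = [p for p, _ in table.values()]
spearman_values = [s for _, s in table.values()]


def spread(values):
    """Largest minus smallest value, rounded to three decimals."""
    return round(max(values) - min(values), 3)


pearson_spread = spread(pearson_values)
spearman_spread = spread(spearman_values)
print(pearson_spread, spearman_spread)  # 0.292 0.109
```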