Extra exercise 19
-----------------

1. The study is observational. Some samples are from a controlled group
of animals; other samples are collected from free-ranging animals over
a 5-year period with little information available about the animals,
and therefore about the population they might represent. No matching
of the two groups is described (e.g. one could imagine a matching on
sampling times). It seems unlikely that any blinding of the analyses
was undertaken. 

2. Analysis for Figure 1.
The design is two independent samples. It appears that n=20 in both the
free-ranging and the captive group. The outcome is the concentration of
corticoids in fecal material. The figure gives means and standard
errors. Apparently both the mean and standard error (and hence also the
standard deviation, as n=20 in both groups) are larger in the captive
group. The paper reports a Wilcoxon signed rank test, which corresponds
to a matched pairs design. This is probably a misunderstanding of the
nomenclature, because it seems unlikely that a matched pairs analysis
was carried out for two independent samples. The correct name would then
be the Wilcoxon rank sum (or Mann-Whitney) two-sample test. Another point
that can be questioned is the rationale for choosing a nonparametric
method: concentrations often show well-behaved distributions (possibly
after a transformation such as the logarithm). Finally, the results are
presented in terms of means, although the nonparametric analysis
compares the locations of the two distributions (typically summarized by
medians) rather than their means.
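
As a rough illustration of the two points above, the following Python
sketch (standard library only) recovers a standard deviation from a
standard error via SD = SE * sqrt(n) and computes the two-sample Wilcoxon
(Mann-Whitney) statistic with a normal approximation. The function names
and the data are made up for illustration and are not taken from the
paper; the approximation ignores tie and continuity corrections, so it is
only indicative for small samples.

```python
import math
from statistics import NormalDist


def sd_from_se(se, n):
    """Standard deviation implied by a standard error of the mean."""
    return se * math.sqrt(n)


def midranks(values):
    """Ranks 1..N of the values; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def rank_sum_test(x, y):
    """Two independent samples: Mann-Whitney U for x, z-score, two-sided p.

    Uses the normal approximation without tie or continuity correction.
    """
    n1, n2 = len(x), len(y)
    ranks = midranks(list(x) + list(y))
    w = sum(ranks[:n1])               # rank sum of the first sample
    u = w - n1 * (n1 + 1) / 2         # Mann-Whitney U for the first sample
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean_u) / sd_u
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return u, z, p


# Made-up concentrations for illustration only (not data from the paper).
free_ranging = [31.0, 42.5, 28.1, 39.9, 35.2]
captive = [48.3, 61.0, 44.7, 70.2, 52.6]
u, z, p = rank_sum_test(free_ranging, captive)
print(f"U = {u:.1f}, z = {z:.2f}, two-sided p = {p:.3f}")
```

Note that the two samples enter as separate lists of possibly different
lengths; a signed rank test would instead require the observations to
come in pairs.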

3. Analysis behind Figure 2.
The design is three independent samples (free-ranging animals of unknown
sex, and captive animals of both sexes). The statistical analysis is
again described as Wilcoxon's signed rank test, and the same criticism
as above applies. In addition, the analysis seems to have consisted of
pairwise two-sample comparisons (e.g. free-ranging animals versus captive
males, and free-ranging versus captive females). This approach to data
with multiple groups is wrong, because several unadjusted pairwise tests
inflate the overall risk of a false positive. One should first test the
overall hypothesis of no difference between the groups and, only if that
test is significant, continue with pairwise comparisons between all or
selected pairs of groups (as described in, e.g., Exercise 15.47 of
IPS7e).
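
The two-step procedure can be sketched in Python (standard library only)
as follows. This is one possible version under stated assumptions, not
the analysis of the paper: the overall test is taken to be the
Kruskal-Wallis test, the pairwise tests are two-sample Wilcoxon
(Mann-Whitney) tests with a normal approximation, and the Bonferroni
adjustment of the pairwise p-values is an added choice. Tie corrections
are ignored, the function names are hypothetical, and the p-value
shortcut is valid only for exactly three groups (chi-square with 2
degrees of freedom).

```python
import math
from itertools import combinations
from statistics import NormalDist


def midranks(values):
    """Ranks 1..N of the values; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def kruskal_wallis_h(samples):
    """Kruskal-Wallis H statistic (no tie correction)."""
    pooled = [v for s in samples for v in s]
    n = len(pooled)
    ranks = midranks(pooled)
    total, start = 0.0, 0
    for s in samples:
        rank_sum = sum(ranks[start:start + len(s)])
        total += rank_sum ** 2 / len(s)
        start += len(s)
    return 12 / (n * (n + 1)) * total - 3 * (n + 1)


def rank_sum_p(x, y):
    """Two-sided p-value of the rank sum test (normal approximation)."""
    n1, n2 = len(x), len(y)
    ranks = midranks(list(x) + list(y))
    u = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    z = (u - n1 * n2 / 2) / math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return 2 * (1 - NormalDist().cdf(abs(z)))


def overall_then_pairwise(groups, alpha=0.05):
    """Overall test first; pairwise tests only if it is significant.

    groups: dict mapping a group name to a list of observations
    (exactly three groups, see the assumptions in the text).
    """
    names = list(groups)
    if len(names) != 3:
        raise ValueError("this sketch handles exactly three groups")
    h = kruskal_wallis_h([groups[name] for name in names])
    p_overall = math.exp(-h / 2)  # chi-square upper tail for 2 df
    result = {"H": h, "p_overall": p_overall, "pairwise": {}}
    if p_overall < alpha:
        pairs = list(combinations(names, 2))
        for a, b in pairs:
            p = rank_sum_p(groups[a], groups[b])
            # Bonferroni adjustment (an assumption of this sketch).
            result["pairwise"][(a, b)] = min(1.0, p * len(pairs))
    return result


# Made-up concentrations for illustration only (not data from the paper).
example = {
    "free-ranging": [31.0, 42.5, 28.1, 39.9, 35.2],
    "captive males": [48.3, 61.0, 44.7, 70.2, 52.6],
    "captive females": [55.1, 66.4, 49.8, 73.0, 58.9],
}
print(overall_then_pairwise(example))
```

The pairwise results stay empty whenever the overall test is not
significant, which mirrors the order of the two steps described above.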