Extra exercise 19
-----------------

1. The study is observational. Some samples are from a controlled group of animals; other samples were collected from free-ranging animals over a 5-year period, with little information available about the animals and therefore about the population they might represent. No matching of the two groups is described (e.g. one could imagine a matching on sampling times). It seems unlikely that any blinding of the analyses was undertaken.

2. Analysis for Figure 1. The design is two independent samples. It appears that n=20 in both the free-ranging and the captive group. The outcome is the concentration of corticoids in fecal material. The figure gives means and standard errors. Apparently both the mean and the standard error (and hence also the standard deviation, as n=20 in both groups) are larger in the captive group. The paper reports a Wilcoxon signed rank test, which corresponds to a matched pairs design. This is probably a misunderstanding of the nomenclature, because it seems unlikely that a matched pairs analysis was carried out for two independent samples. The correct name would then be a Wilcoxon (or Mann-Whitney) two-sample test. Another point that can be questioned is the rationale for choosing a nonparametric method. Concentrations often show well-behaved distributions (possibly after transformation). The results are also presented in terms of means, although the nonparametric analysis focuses on the medians.

3. Analysis behind Figure 2. The design is three independent samples (free-ranging animals of unknown sex, and captive animals of both sexes). The statistical analysis is again described as Wilcoxon's signed rank test, and the same criticism as above applies. In addition, it seems that the analysis consisted of pairwise two-sample analyses (e.g. free-ranging animals versus captive males, and free-ranging animals versus captive females). This approach to data with multiple groups is wrong, because running several pairwise tests without an overall test inflates the chance of a false positive. One should first assess the overall hypothesis of no difference between all groups and then, if the overall test is significant, continue with pairwise comparisons between all or selected pairs of groups (as described in, e.g., Exercise 15.47 of IPS7e).
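
The overall-then-pairwise procedure recommended for Figure 2 could be sketched as follows. This is a minimal illustration, not the analysis from the paper or from IPS7e: it assumes the Kruskal-Wallis test as the overall test and a Bonferroni adjustment for the pairwise Wilcoxon (Mann-Whitney) two-sample tests, neither of which is prescribed above. The function names and the data values are hypothetical and made up for the example. The p-values use large-sample approximations (chi-square with 2 degrees of freedom, and the normal distribution), which are rough for samples this small.

```python
import math
from itertools import combinations
from statistics import NormalDist


def midranks(values):
    """Return 1-based ranks of `values`, averaging ranks over ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def tie_term(values):
    """Sum of t^3 - t over sets of tied values (0 when there are no ties)."""
    counts = {}
    for v in values:
        counts[v] = counts.get(v, 0) + 1
    return sum(t**3 - t for t in counts.values())


def kruskal_wallis_3(a, b, c):
    """Kruskal-Wallis H and approximate p-value for exactly three groups."""
    groups = [a, b, c]
    pooled = [v for g in groups for v in g]
    n = len(pooled)
    r = midranks(pooled)
    total, start = 0.0, 0
    for g in groups:
        rank_sum = sum(r[start:start + len(g)])
        total += rank_sum**2 / len(g)
        start += len(g)
    h = 12 / (n * (n + 1)) * total - 3 * (n + 1)
    h /= 1 - tie_term(pooled) / (n**3 - n)  # correction for ties
    # Upper tail of the chi-square distribution with 2 degrees of freedom.
    return h, math.exp(-h / 2)


def mann_whitney(x, y):
    """Two-sided Wilcoxon two-sample (Mann-Whitney) test, normal approximation."""
    n1, n2 = len(x), len(y)
    n = n1 + n2
    pooled = list(x) + list(y)
    r = midranks(pooled)
    u = sum(r[:n1]) - n1 * (n1 + 1) / 2
    var = n1 * n2 / 12 * ((n + 1) - tie_term(pooled) / (n * (n - 1)))
    z = (u - n1 * n2 / 2) / math.sqrt(var)
    return u, 2 * (1 - NormalDist().cdf(abs(z)))


def overall_then_pairwise(groups, alpha=0.05):
    """Run the overall test; do pairwise tests only if it is significant."""
    names = list(groups)
    h, p_overall = kruskal_wallis_3(*(groups[name] for name in names))
    pairwise = {}
    if p_overall < alpha:
        pairs = list(combinations(names, 2))
        for first, second in pairs:
            _, p = mann_whitney(groups[first], groups[second])
            pairwise[(first, second)] = min(1.0, p * len(pairs))  # Bonferroni
    return h, p_overall, pairwise


# Made-up concentrations, for illustration only (not data from the paper).
demo = {
    "free-ranging": [21, 25, 28, 30, 33, 36],
    "captive males": [45, 52, 58, 61, 66, 70],
    "captive females": [40, 48, 55, 63, 68, 74],
}
h, p_overall, pairwise = overall_then_pairwise(demo)
print(f"Kruskal-Wallis H = {h:.2f}, p = {p_overall:.4f}")
for pair, p_adj in pairwise.items():
    print(pair, f"Bonferroni-adjusted p = {p_adj:.4f}")
```

Note that `mann_whitney` treats the two samples as independent, which is the design described for both figures. A signed rank test would instead require each observation in one group to be paired with a specific observation in the other.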