Supplementary Exercise 9.62 of IPS7e
------------------------------------

Sketch of solution only (most detail for cats).

A case-control design. Cases are cats brought to an animal shelter and
controls are cats still in a private home. Cases and controls are selected
randomly from the respective populations, and information is obtained about
the source of the cat. For controls this must mean the source of the cat
when it arrived in the private home (private/breeder, pet store, other).

The only response variable is source of cat. The classification into cases
and controls is an explanatory variable because it is fixed in advance of
the study (it has been decided to sample a certain number of cats from the
two populations).

Model: 2 independent samples from multinomial distributions. We compare the
distribution of sources within the random samples among cases and controls,
respectively.

Note that according to IPS terminology, the classification of explanatory
and response variables would have been reversed because a relationship
source -> present location (shelter/private) is anticipated. In my view (and
not mine alone) this is misleading and confusing, and it may easily lead one
to the wrong model for the data. There is no question that considering the
data as 3 independent samples corresponding to the sources (columns) is
wrong and meaningless!

MTB > WOpen "H:\VHM\VHM801\Datasets\Minitab\Chapter 9\ex09_062.mtw".
Retrieving worksheet from file: 'H:\VHM\VHM801\Datasets\Minitab\Chapter 9\ex09_062.mtw'
Worksheet was saved on 24/10/2014

MTB > XTabs 'cc' 'source' 'animal';
SUBC>   Layout 1 1;
SUBC>   Frequencies 'count';
SUBC>   Counts;
SUBC>   RowPercents;
SUBC>   ChiSquare;
SUBC>   Expected;
SUBC>   DMissing 'cc' 'source' 'animal'.
Tabulated Statistics: cc, source, animal

Using frequencies in count

Results for animal = cat

Rows: cc   Columns: source

               other  petstore      priv       All

cases             76        16       124       216
               35.19      7.41     57.41    100.00
               91.03     13.05    111.92

controls         203        24       219       446
               45.52      5.38     49.10    100.00
              187.97     26.95    231.08

All              279        40       343       662
               42.15      6.04     51.81    100.00

Cell Contents:      Count
                    % of Row
                    Expected count

Pearson Chi-Square = 6.611, DF = 2, P-Value = 0.037
(Likelihood Ratio Chi-Square = 6.663, DF = 2, P-Value = 0.036)

Comments and answers to questions:
----------------------------------

a) Estimation: the sample row proportions (proportions within each row) are
   given in the table above. We must use the row proportions because only
   the column variable is a response variable. The largest difference
   between the two samples is for the proportion of cats from other sources
   (35% among cases and 46% among controls), but the proportion of cats from
   private sources also differs considerably (57% among cases and 49% among
   controls).

   Hypothesis: H0: no difference in the source proportions for the cases and
   controls. Ha: some difference...

   Test: chi-square statistic X^2 = 6.61, df = (2-1)*(3-1) = 2,
   P-value = 0.037.

   Conclusion: we have some (weak) evidence to reject H0 - there is some
   difference in the proportions, the most notable ones described above. In
   words, fewer cats relinquished to animal shelters are from other sources,
   and more are from private sources, than in the control population. This
   implies that a cat being from a private source is a risk factor for
   relinquishment, and being from other sources is a protective factor. It
   seems difficult to make sense of this finding; you may look up the paper
   to see a discussion of it.
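The test for cats can be checked by hand outside Minitab. The following is a minimal Python sketch, not part of the original solution: it computes the expected counts (row total * column total / grand total) and the Pearson X^2 from the cat counts in the table above. The function name chi_square_homogeneity is hypothetical. The P-value line relies on the fact that the upper tail of a chi-square distribution with exactly 2 degrees of freedom is exp(-x/2), so it applies only to 2 x 3 tables such as this one.

```python
import math


def chi_square_homogeneity(table):
    """Pearson chi-square statistic for a table given as a list of rows.

    Returns (x2, df, expected), where expected has the same shape as table.
    (Hypothetical helper written for this illustration.)
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # Expected count under H0: row total * column total / grand total
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    x2 = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return x2, df, expected


# Cat counts from the Minitab table; columns: other, petstore, priv
cats = [
    [76, 16, 124],    # cases
    [203, 24, 219],   # controls
]

x2, df, expected = chi_square_homogeneity(cats)

# Closed-form upper tail, valid only because df == 2 here
assert df == 2
p_value = math.exp(-x2 / 2)

print(f"X^2 = {x2:.3f}, df = {df}, P-value = {p_value:.3f}")
for label, row in zip(["cases", "controls"], expected):
    print(label, [round(e, 2) for e in row])
```

Up to rounding, this should agree with the Pearson chi-square, degrees of freedom, P-value and expected counts in the Minitab listing.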
b) Minitab listing for dogs:

Results for animal = dog

Rows: cc   Columns: source

               other  petstore      priv       All

cases             90         7       188       285
               31.58      2.46     65.96    100.00
               65.27     21.10    198.63

controls         142        68       518       728
               19.51      9.34     71.15    100.00
              166.73     53.90    507.37

All              232        75       706      1013
               22.90      7.40     69.69    100.00

Cell Contents:      Count
                    % of Row
                    Expected count

Pearson Chi-Square = 26.939, DF = 2, P-Value = 0.000
(Likelihood Ratio Chi-Square = 29.190, DF = 2, P-Value = 0.000)

Comments:
---------

The X^2 test is clearly significant. The distribution of sources is not the
same for dogs in shelters as in the control population. Dogs in shelters are
more likely to come from other sources (including stray dogs, born in
shelter, and born in home). It seems natural that such dogs could be more
difficult to deal with or be less wanted by their owners than dogs from
private or pet store sources.
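To see which cells drive the significant result for dogs, one can split X^2 into its per-cell contributions (observed - expected)^2 / expected. The sketch below is an illustration only, not part of the original solution; it uses the dog counts from the listing above, and the variable names are hypothetical.

```python
# Dog counts from the Minitab table; columns: other, petstore, priv
sources = ["other", "petstore", "priv"]
dogs = {
    "cases": [90, 7, 188],
    "controls": [142, 68, 518],
}

row_totals = {group: sum(counts) for group, counts in dogs.items()}
col_totals = [sum(col) for col in zip(*dogs.values())]
n = sum(row_totals.values())

# Contribution of each cell to the Pearson statistic
contributions = {}
for group, counts in dogs.items():
    for source, observed, col_total in zip(sources, counts, col_totals):
        expected = row_totals[group] * col_total / n
        contributions[(group, source)] = (observed - expected) ** 2 / expected

total_x2 = sum(contributions.values())

# List cells from largest to smallest contribution
for (group, source), value in sorted(
    contributions.items(), key=lambda item: item[1], reverse=True
):
    print(f"{group:9s} {source:9s} {value:7.3f}")
print(f"Total X^2 = {total_x2:.3f}")
```

The total should match the Pearson chi-square in the listing up to rounding. On these counts the two largest contributions both come from the cases row, for the other and petstore columns, which is in line with the row percentages (31.58% vs 19.51% for other, 2.46% vs 9.34% for petstore).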