Assignment III for Biostats Course VHM 801 at AVC - Fall semester 2018
The assignment is worth 10% of the final course mark.
Please be aware that by handing in the home assignment you implicitly acknowledge
having read and accepted the instructions for home assignments as described
on the VHM 801 homepage.
Twin pair studies are frequently used to explore genetic hypotheses and effects.
For this assignment, the question of interest is whether tobacco smoking habits have a genetic component. For this purpose,
information was retrieved from a database on twins, and a number of twin pairs were classified as having
the same or different smoking habits. Here, smoking
habits include both whether each individual was a smoker or not, and in the former case also the type
of smoking (e.g., cigarettes, cigars or pipe). Additionally, the twin pairs were classified genetically
as dizygotic or monozygotic, and among the latter also according to whether the twins were separated at birth
or brought up together.
Smoking habits by type of twin pair (counts of twin pairs):

| Smoking habits | dizygotic | monozygotic, separated | monozygotic, joint |
|---|---|---|---|
| same | 9 | 23 | 21 |
| different | 9 | 4 | 5 |
The counts in the table are available as a data set in Minitab format
and as a comma-separated file, for import into Stata and other statistical software.
(Hint: The data format is suited for the first question below, but for subsequent questions
you may need to modify the data format or enter the data anew.)
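As an illustration of the hint, the counts could be re-entered by hand and rearranged for the later questions. The sketch below uses Python, which is an assumption here (the course files target Minitab and Stata); the variable names and the `collapse` helper are hypothetical and not part of the course material.

```python
# Counts of twin pairs, copied from the table above.
# Keys are (smoking habits, type of twin pair); all names are illustrative.
counts = {
    ("same", "dizygotic"): 9,
    ("same", "monozygotic, separated"): 23,
    ("same", "monozygotic, joint"): 21,
    ("different", "dizygotic"): 9,
    ("different", "monozygotic, separated"): 4,
    ("different", "monozygotic, joint"): 5,
}


def collapse(counts, mapping):
    """Sum cell counts after relabelling twin types via `mapping`.

    Twin types that are missing from `mapping` are dropped.
    """
    out = {}
    for (habit, twin_type), n in counts.items():
        if twin_type in mapping:
            key = (habit, mapping[twin_type])
            out[key] = out.get(key, 0) + n
    return out


# 2x2 layout: dizygotic versus all monozygotic pairs combined.
by_zygosity = collapse(counts, {
    "dizygotic": "dizygotic",
    "monozygotic, separated": "monozygotic",
    "monozygotic, joint": "monozygotic",
})

# 2x2 layout: monozygotic pairs only, separated versus joint upbringing.
monozygotic_only = collapse(counts, {
    "monozygotic, separated": "separated",
    "monozygotic, joint": "joint",
})

# Long format: one record per twin pair, as some procedures expect.
pairs = [
    {"habits": habit, "twin_type": twin_type}
    for (habit, twin_type), n in counts.items()
    for _ in range(n)
]
```

Keeping the original cell counts in one place and deriving the other layouts from them avoids transcription errors when the data are entered anew.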
The home assignment has four questions, (a)-(d), all of which should be answered. In general,
the assumptions of every statistical procedure used should be stated (formally or informally)
and checked (where possible), and every statistical analysis should
be summarized in a conclusion.
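As one example of stating and checking an assumption: a Pearson chi-square test of independence relies on a large-sample approximation, which is commonly checked through the expected cell counts. The Python sketch below computes those expected counts for the table above. The choice of this particular test, the variable names, and the rule-of-thumb threshold of 5 are assumptions made for illustration, not requirements of the assignment.

```python
# Observed counts of twin pairs.
# Rows: same, different smoking habits.
# Columns: dizygotic, monozygotic separated, monozygotic joint.
observed = [
    [9, 23, 21],
    [9, 4, 5],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
total = sum(row_totals)

# Expected count under independence: row total * column total / grand total.
expected = [[r * c / total for c in col_totals] for r in row_totals]

# Pearson chi-square statistic and its degrees of freedom.
chi2 = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)
df = (len(observed) - 1) * (len(observed[0]) - 1)

# Informal check: flag cells with an expected count below 5
# (a common rule of thumb, assumed here for illustration).
small_cells = [
    (i, j, round(e, 2))
    for i, exp_row in enumerate(expected)
    for j, e in enumerate(exp_row)
    if e < 5
]

print("expected counts:", [[round(e, 2) for e in row] for row in expected])
print("chi-square statistic:", round(chi2, 3), "on", df, "degrees of freedom")
print("cells with expected count < 5 (row, column, value):", small_cells)
```

If some expected counts turn out to be small, an exact test may be considered as an alternative, and the choice should be stated in the answer.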
- (a) As a first analysis, use these data to investigate whether any association (or dependence)
seems to exist between the likeness of smoking habits within a twin pair and the
type of twin pair, as classified above. Carry out a statistical test and draw conclusions. (Note: As we will further
discuss the findings in the context of genetics in the following questions, you may limit your
present conclusion to a statement in statistical terms.)
- (b) Regardless of the results of your first analysis, we will proceed by focusing on whether
smoking habits are equally likely to be the same in dizygotic and monozygotic twin pairs (combining
all monozygotic pairs, irrespective of whether the twins were separated or not). Estimate for both dizygotic
and monozygotic twin pairs the probability of the twins having the same smoking habits, and supplement
these estimates with suitable 95% confidence intervals. Do your results indicate that smoking habits
are more alike in twin pairs with a stronger genetic similarity? Use a statistical test to
provide a measure of the evidence offered by the data towards this claim. Draw (first) conclusions
about any genetic impact on smoking habits.
- (c) In continuation of the previous question, it is of interest to further explore whether the likeness
of smoking habits could be linked with the environment in which children were brought up. Use the data for
(only) monozygotic twin pairs to estimate a (statistical) parameter that could describe
a potential environmental effect, with an associated 95%
confidence interval, and use a statistical test to assess whether the data
show evidence of this parameter being relevant (i.e., non-zero).
Draw conclusions about any environmental impact on smoking habits.
- (d) Finally, discuss how the results from these data could be used to
form an argument against the "hypothesis" that smoking causes detrimental health effects (in particular
regarding lung cancer), based on observational studies that have shown associations
between smoking and such detrimental health effects. (Hint: It
may be helpful to explore this in terms of the diagrams/schematics
introduced in Session 2 to describe confounding.)
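For the probability estimates with 95% confidence intervals requested for dizygotic and combined monozygotic twin pairs, one possible choice is the Wilson score interval for a binomial proportion. The Python sketch below implements it with the standard library only. The choice of the Wilson interval, the function name `wilson_interval`, and the hard-coded normal quantile are assumptions for illustration; other intervals (for example exact ones) may be equally suitable.

```python
import math


def wilson_interval(successes, n, z=1.959964):
    """Wilson score interval for a binomial proportion.

    z is the standard normal quantile; 1.959964 gives a 95% interval.
    """
    p_hat = successes / n
    denom = 1 + z**2 / n
    centre = (p_hat + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return centre - half_width, centre + half_width


# (pairs with the same smoking habits, total pairs), taken from the table.
groups = {
    "dizygotic": (9, 9 + 9),
    "monozygotic (combined)": (23 + 21, 23 + 21 + 4 + 5),
}

for name, (same, n) in groups.items():
    low, high = wilson_interval(same, n)
    print(f"{name}: estimate {same / n:.3f}, 95% CI ({low:.3f}, {high:.3f})")
```

The same function could be reused for the separated and jointly raised monozygotic pairs by adding the corresponding counts to `groups`.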
Henrik Stryhn
(hstryhn@upei.ca) 2018-11-01