Assignment III for Biostats Course VHM 801 at AVC - Fall semester 2016
The assignment is worth 10% of the final course mark. Please be aware that by handing
in the home assignment you implicitly acknowledge having read and accepted
the instructions for home assignments described
on the VHM 801 homepage.
The assignment is a continuation of the first home
assignment on a study of women's back pain during pregnancy.
You may want to revisit the first home assignment in preparation
for this assignment, and you will use the previously described and supplied dataset.
For your work here, you should generally pay attention to all issues and problems with the
dataset identified in the first home assignment (such as those described in the solution
posted on the VHM 801 homepage). Because the issues around
generalizability of the results to a suitable population cannot be
resolved analytically, we will for the purpose of this assignment
assume the existence of a suitable population of women (pregnancies) for which the sample is
representative.
A full mark requires satisfactory answers to the three questions below.
- The focus of the study is on the back pain severity scores.
Give a statistical model for these pain scores that does not differentiate
between the characteristics contained in the additional variables
recorded; that is, a model for the pain score in the absence of any
additional information about the woman and pregnancy. Estimate the
model's parameters, with corresponding 95% confidence intervals.
(Hint: You may, but need not, do this in several
steps that each focus on a single response category.)
- Carry out a statistical analysis for the association between
the back pain severity score and a categorical variable of your choice. Your
chosen categorical variable should involve either a plausible biological association
with back pain (for which an argument should be given) or
an association for which "interesting" findings are obtained. You may
generate a categorical variable from a continuous variable or modify the
categories of an existing categorical variable, as long as such data
modifications are explained and justified. Irrespective of your choice of categorical variable,
state your statistical model and hypotheses carefully, explain your choice
of statistical procedure, draw conclusions from your analysis and
interpret your results.
- Carry out a statistical analysis for the association between
the back pain severity score and a quantitative and continuous response variable of your choice. Your
chosen continuous variable should involve a plausible biological association
(for which an argument should be given). If you categorized a continuous
variable for the second question, you should choose another continuous variable for
this question. Irrespective of your choice of continuous
variable, state your statistical model and hypotheses carefully, explain your choice
of statistical procedure, draw conclusions from your analysis and
interpret your results.
(Hint: As the pain scores are categorical, it is not
straightforward to model these as the outcome variable. The following
approach is valid, especially for exploratory purposes: use the categorical outcome
variable to define groups for which the continuous variable can be
compared. (As a general example, to assess an association between "age"
and a dichotomous outcome taking the values "yes" and "no", compare the
age distribution among subjects with outcome "yes" and the age distribution
among subjects with outcome "no". A difference in the age distribution between the two groups
will then reflect an association between "age" and the dichotomous outcome.)
For the four categories of pain scores, you may
either use methods to compare multiple samples (covered in Session
10 of VHM 801) or create a dichotomous outcome. In the latter case, the data
modification should again be explained and justified.)
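As an illustration of the hint in the first question (working in steps that each focus on a single response category), the following minimal Python sketch estimates one category proportion at a time and attaches a 95% confidence interval to each. The counts and category labels are made up for illustration and are not taken from the study dataset. The Wilson score interval is only one possible choice of interval; the assignment does not prescribe a particular method, and the function name `wilson_interval` is hypothetical.

```python
import math


def wilson_interval(count, n, z=1.959964):
    """Wilson score interval for a single proportion count / n.

    z = 1.959964 is the standard normal quantile for a 95% interval.
    """
    p = count / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half, centre + half


# Hypothetical counts for four pain-score categories (NOT the study data).
counts = {"score 0": 20, "score 1": 60, "score 2": 70, "score 3": 30}
total = sum(counts.values())

# One step per response category: "this category" versus "all others".
for label, count in counts.items():
    low, high = wilson_interval(count, total)
    print(f"{label}: p = {count / total:.3f}, 95% CI ({low:.3f}, {high:.3f})")
```

Each interval here is computed separately for its own category, so the 95% level applies to each interval individually and not to all four jointly.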
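The approach in the hint to the third question (using the pain score to define groups and comparing a continuous variable across them) can be sketched as follows. The sketch uses the Kruskal-Wallis rank test as one example of a method for comparing multiple samples; whether this particular test is among those covered in Session 10 is an assumption. The ages are invented for illustration and are not the study data, and the function name `kruskal_wallis_h` is hypothetical.

```python
from itertools import chain


def kruskal_wallis_h(groups):
    """Tie-corrected Kruskal-Wallis H statistic for a list of samples."""
    pooled = sorted(chain.from_iterable(groups))
    n = len(pooled)
    # Assign mid-ranks: tied values share the average of the ranks they occupy.
    rank_of = {}
    tie_term = 0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j] == pooled[i]:
            j += 1
        t = j - i                              # number of tied observations
        rank_of[pooled[i]] = (i + 1 + j) / 2   # average of ranks i+1, ..., j
        tie_term += t**3 - t
        i = j
    correction = 1 - tie_term / (n**3 - n)
    if correction == 0:
        raise ValueError("all observations are identical; H is undefined")
    h = 12 / (n * (n + 1)) * sum(
        sum(rank_of[x] for x in g) ** 2 / len(g) for g in groups
    ) - 3 * (n + 1)
    return h / correction


# Hypothetical ages (years) grouped by pain score category (NOT the study data).
ages_by_score = [
    [22, 25, 27, 30, 24],   # score 0
    [26, 28, 31, 29, 33],   # score 1
    [30, 32, 35, 28, 34],   # score 2
    [33, 36, 31, 38, 35],   # score 3
]

h = kruskal_wallis_h(ages_by_score)
# With 4 groups, H is compared to a chi-square distribution with 3 degrees
# of freedom; its 95% quantile is approximately 7.815.
print(f"H = {h:.3f}; exceeds 5% critical value 7.815: {h > 7.815}")
```

The chi-square reference distribution is a large-sample approximation, so with groups as small as in this toy example the comparison to the critical value is only rough. For the dichotomous alternative mentioned in the hint, the same function can be called with two groups, in which case H is referred to a chi-square distribution with 1 degree of freedom.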
Henrik Stryhn
(hstryhn@upei.ca) 2016-11-02