* do-file for Additional exercise 10.8, VHM 802
version 18 /* works also with versions 14-17 */
set more off
set scheme stcolor_alt
cd "r:\"

import delimited hs10_8.csv, clear
encode drug, gen(Drug)
encode hist0, gen(Hist0)
egen drughist0 = group(drug hist0)

* profile plots
sort dog time /* sorting required for overlay command */
overlay conc time, by(dog) c(l)
xtline conc, i(dog) t(time) overlay
* plots can be restricted to groups, e.g.
xtline conc if Drug==1 & Hist0==1, i(dog) t(time) overlay

* anova-based analysis (hierarchical model) for raw data
anova conc Drug##Hist0 / dog|Drug#Hist0 time##Drug##Hist0
* residuals analysis
predict pred, xb
predict stdres, rstandard
scatter stdres pred
qnorm stdres
swilk stdres

* residuals look terrible, try ln transformation
generate lnconc = ln(conc)
anova lnconc Drug##Hist0 / dog|Drug#Hist0 time##Drug##Hist0
* residuals analysis
predict pred2, xb
predict stdres2, rstandard
scatter stdres2 pred2
qnorm stdres2
swilk stdres2
* residuals look better, stick to ln transformed data
* as an addition, try box-cox transformation on fixed effects model
xi: boxcox conc i.dog i.time*i.drughist0
drop _I*
* points to lambda close to zero

* separate analyses for each time point
bysort time: anova lnconc Drug##Hist0
* alternative form with combined Drug & Hist0 factor
bysort time: oneway lnconc drughist0, tabulate

* response feature: range
preserve /* run next 4 lines together with this one */
collapse (min) minconc=lnconc (max) maxconc=lnconc (mean) Drug Hist0 drughist0, by(dog)
generate range = maxconc-minconc
anova range Drug##Hist0
oneway range drughist0, tabulate
restore

* response feature: slope 
* for demonstration purposes because the slope does not seem of interest here
preserve /* run next 12 lines together with this one */
anova lnconc dog dog#c.time
regress
sort drug hist0 
matrix define temp = e(b)'
matrix list temp
* from list of values in matrix, you need to determine the relevant ones
matrix define slope = temp[17..32,1]
* need to restrict dataset to one value per dog
keep if time==0
svmat slope
anova slope Drug##Hist0
oneway slope drughist0, tabulate
restore

* hierarchical model revisited for model checking
* slightly inaccurate due to the missing value
preserve /* run next 7 lines together with this one */
collapse (mean) lnconc, by(Drug Hist0 dog)
anova lnconc Drug##Hist0
predict pred, xb
predict stdres, rstandard
scatter stdres pred
qnorm stdres
swilk stdres

* likelihood-based analysis using mixed
mixed lnconc Drug##Hist0##time || dog:, reml
predict pred3, fitted
predict stdres3, rstandard
scatter stdres3 pred3
qnorm stdres3
swilk stdres3
predict pred_ref, reffects
bysort dog: gen within_n=_n
qnorm pred_ref if within_n==1
swilk pred_ref if within_n==1

* repeated measures ANOVA (only correct for balanced data)
anova lnconc drughist0 / dog|drughist0 time##drughist0, repeated(time)
anova lnconc Drug##Hist0 / dog|Drug#Hist0 time##Drug##Hist0, repeated (time)

* repeated measures correlation structures, using likelihood-based analysis
* first, equal time distances
generate eqtime=time-(time>1)-(time>3)
* ar(1)
mixed lnconc Drug##Hist0##time || dog:, reml res(ar 1, t(eqtime)) nocons
estat wcor
* random effects and ar(1)
mixed lnconc Drug##Hist0##time || dog:, reml res(ar 1, t(eqtime))
estat wcor
* Toeplitz
mixed lnconc Drug##Hist0##time || dog:, reml res(toeplitz, t(eqtime)) nocons
estat wcor
* next, using observed unequal time distances
* ar(1)
mixed lnconc Drug##Hist0##time || dog:, reml res(ar 1, t(time)) nocons
estat wcor
* random effects and ar(1)
mixed lnconc Drug##Hist0##time || dog:, reml res(ar 1, t(time))
estat wcor
* Toeplitz
mixed lnconc Drug##Hist0##time || dog:, reml res(toeplitz, t(time)) nocons
estat wcor
* unstructured (does not depend on time points)
mixed lnconc Drug##Hist0##time || dog:, reml res(uns, t(time)) nocons
estat wcor

