*************************************************************
*Practical 1 Exercise: 
*Discrete-Time Models of the Time to a Single Event
*************************************************************

set more off

use ncds, clear

***Data preparation: the person-period file

*Calculate duration: number of years from age 16 to age1st, counting age 16 as t=1

gen dur=age1st-16+1

*Generate dur records for each individual

expand dur
sort id

*Create t (year index within individual) and binary response y for each year
*y=1 for the last record of uncensored cases and y=0 otherwise

by id: gen t=_n
gen age=t+15
gen y=0
replace y=1 if (age==age1st & event==1)
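
*Optional check (a sketch, not part of the original exercise): each individual
*should contribute at most one event. The tabulation uses one row per person
*(t==1); we would expect nevent=1 when event==1 and nevent=0 when event==0,
*assuming age1st is an integer age of at least 16.
*nevent is a helper variable introduced here for illustration only.

bysort id (t): egen nevent=total(y)
tab nevent event if t==1
drop nevent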

*Create time-varying covariate fulltime (1 up to and including age ageleft, 0 afterwards)

gen fulltime=1
replace fulltime=0 if age>ageleft

*Create region dummies

tab region, gen(reg)

*Keep observations on males only

keep if female==0 

***Fitting a quadratic in t (equivalent to a quadratic in age, since age=t+15), with region dummies as covariates

gen tsq=t*t
logit y t tsq fulltime reg2 reg3 reg4
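
*Optional (a sketch, not part of the original exercise): redisplay the fitted
*model with odds ratios instead of log-odds coefficients. Replaying does not
*refit the model, so the tests and predictions that follow are unaffected.

logit, or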

*Test for differences by region

test reg2 reg3 reg4 

*Calculate predicted probability of an event (discrete-time hazard)

predict phaz, pr

*Plot hazard by region (for men not in full-time education)

sort t
scatter phaz t if region==1 & fulltime==0, legend(label(1 "Scot & N")) || ///
  scatter phaz t if region==2 & fulltime==0, legend(label(2 "Wales & Mids")) || ///
  scatter phaz t if region==3 & fulltime==0, legend(label(3 "S & E")) || ///
  scatter phaz t if region==4 & fulltime==0, legend(label(4 "SE inc London")) 
  

***Allowing for non-proportional differences by region

gen t_reg2=t*reg2
gen tsq_reg2=tsq*reg2
gen t_reg3=t*reg3
gen tsq_reg3=tsq*reg3
gen t_reg4=t*reg4
gen tsq_reg4=tsq*reg4
logit y t tsq fulltime reg2 reg3 reg4 ///
t_reg2 tsq_reg2 t_reg3 tsq_reg3 t_reg4 tsq_reg4 

*Calculate predicted probability of an event (discrete-time hazard)

predict hazint, pr

*Plot hazard by region (for men not in full-time education)

sort t
scatter hazint t if region==1 & fulltime==0, legend(label(1 "Scot & N")) || ///
  scatter hazint t if region==2 & fulltime==0, legend(label(2 "Wales & Mids")) || ///
  scatter hazint t if region==3 & fulltime==0, legend(label(3 "S & E")) || ///
  scatter hazint t if region==4 & fulltime==0, legend(label(4 "SE inc London"))

*Test for non-proportionality

test t_reg2 tsq_reg2 t_reg3 tsq_reg3 t_reg4 tsq_reg4
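
*Alternative check (a sketch, not part of the original exercise): a
*likelihood-ratio test of the same six interaction terms, which can be compared
*with the Wald test above. Both models are refitted quietly so this block
*stands alone; mprop and mnonprop are arbitrary names chosen here.

quietly logit y t tsq fulltime reg2 reg3 reg4
estimates store mprop
quietly logit y t tsq fulltime reg2 reg3 reg4 ///
t_reg2 tsq_reg2 t_reg3 tsq_reg3 t_reg4 tsq_reg4
estimates store mnonprop
lrtest mprop mnonprop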
