 
 
 
 
 
 
 
  
Nelder and Wedderburn (1972) [282]
Original GLM paper. For model selection, proposes comparing the deviance
$L^{2}$ to its degrees of freedom. For well-fitting models, $L^{2}-df\approx 0$.
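The deviance-versus-degrees-of-freedom comparison can be sketched for the simplest case, an intercept-only Poisson model. This is only an illustration under stated assumptions: the counts `y` are made up, the helper `poisson_deviance` is a hypothetical name, and the rule $L^{2}\approx df$ is a rough guide rather than a formal test.

```python
import math

def poisson_deviance(y, mu):
    """Poisson deviance 2 * sum(y*log(y/mu) - (y - mu)), taking 0*log(0) = 0."""
    total = 0.0
    for yi, mi in zip(y, mu):
        log_term = yi * math.log(yi / mi) if yi > 0 else 0.0
        total += log_term - (yi - mi)
    return 2.0 * total

# Hypothetical counts, invented for this illustration.
y = [3, 5, 4, 6, 2, 5, 4, 3, 7, 4]

# Intercept-only model: the fitted mean is the sample mean for every case.
mu_hat = sum(y) / len(y)
deviance = poisson_deviance(y, [mu_hat] * len(y))

# Residual degrees of freedom: n observations minus one fitted parameter.
df = len(y) - 1

# A gap far above zero would suggest lack of fit under this rule of thumb.
gap = deviance - df
print(f"L2 = {deviance:.3f}, df = {df}, L2 - df = {gap:.3f}")
```

With a model that has covariates, the same comparison applies with `df` reduced by the number of fitted coefficients.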
Goutis and Robert (1998) [168]
Choice between nested models where the larger model $M_{2}$ is
regarded as adequate. Computes the posterior distribution of the
Kullback-Leibler distance between $p(y|\theta; M_{1})$ and
$p(y|\theta; M_{2})$; the simpler model is accepted if (e.g.) the
posterior mean is small enough. Fairly straightforward for GLMs,
otherwise use MCMC.
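A toy sketch of the idea, not the authors' procedure: take $M_{2}$ to be $y_i \sim N(\mu, 1)$ with a flat prior on $\mu$, and $M_{1}$ to be the restriction $\mu = 0$. The Kullback-Leibler distance between the two is then $\mu^{2}/2$, and its posterior distribution follows from posterior draws of $\mu$. The data, the cutoff `threshold`, and all function names are assumptions made for illustration.

```python
import random
import statistics

random.seed(0)

# Hypothetical data, assumed to come from N(mu, 1) with known unit variance.
y = [random.gauss(0.2, 1.0) for _ in range(50)]
n = len(y)
ybar = statistics.fmean(y)

def kl_to_null(mu):
    """KL distance from N(mu, 1) to N(0, 1), which equals mu^2 / 2."""
    return 0.5 * mu * mu

# Under M2 with a flat prior, the posterior of mu is N(ybar, 1/n).
post_sd = (1.0 / n) ** 0.5
mu_draws = [random.gauss(ybar, post_sd) for _ in range(20000)]

# Push each posterior draw through the KL distance.
kl_draws = [kl_to_null(m) for m in mu_draws]
post_mean_kl = statistics.fmean(kl_draws)

# Closed form for this toy case: E[mu^2] / 2 = (ybar^2 + 1/n) / 2.
analytic = 0.5 * (ybar ** 2 + 1.0 / n)

# Arbitrary illustrative cutoff; the choice of "small enough" is left open.
threshold = 0.05
accept_simpler = post_mean_kl < threshold
print(post_mean_kl, analytic, accept_simpler)
```

Here the Monte Carlo average can be checked against the closed form; in models without such a closed form, the posterior draws would come from MCMC instead.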