Model selection methods

Raftery (1986) [304] Comment on Grusky and Hauser (1984) [170]. Properties of the LR test when $n$ is large. BIC proposed instead; it favours the quasi-symmetry model for the data of [170].
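
The contrast between the LR test and BIC in large samples can be sketched numerically. The snippet below is a minimal illustration, not taken from [304]: it assumes the form of BIC commonly used for log-linear models compared against the saturated model, BIC = $G^2 - df \log n$, where negative values favour the model. The function names and the example numbers are hypothetical and are not the data of [170].

```python
import math


def bic_vs_saturated(g2, df, n):
    """BIC of a model relative to the saturated model: G^2 - df * log(n).

    Negative values favour the model over the saturated model.
    """
    return g2 - df * math.log(n)


def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability, using the closed form valid for even df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form needs a positive, even df")
    half = x / 2.0
    term = 1.0   # (x/2)^i / i! for i = 0
    total = 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


# Hypothetical illustration: a deviance of 150 on 16 df with n = 100000.
g2, df, n = 150.0, 16, 100_000
p_value = chi2_sf_even_df(g2, df)   # LR test against the saturated model
bic = bic_vs_saturated(g2, df, n)   # BIC against the saturated model

print("LR test rejects the model at 5%:", p_value < 0.05)
print("BIC prefers the model to the saturated model:", bic < 0)
```

With these made-up numbers the LR test rejects the model while BIC prefers it, which is the kind of disagreement the large-$n$ discussion is about.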

Hout and Raftery (1988) [192] (A CASMIN conference paper). BIC to address the large-$N$ problem. Example (from [Hauser (1984)]): $N = 14\,258$ French and English men, class of origin, destination, country; homogeneous quasi-symmetry model selected. Other model selection problems discussed: sparse tables and zero counts; nonnested models (e.g. continuous scales vs. nominal levels) and combining them; mathematical vs. verbal formulation of models and hypotheses.

Davis (1990) [105] Survey of sample sizes in leading sociological journals. Role of $N$ in significance testing. Discussion of (and formulas for) $CN$, the sample size for which the observed effect would be exactly significant at a given level.
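
One simple way to compute a quantity of this kind is sketched below. It rests on an assumption that is not taken from [105]: that the test statistic is a chi-square-type statistic which, for a fixed effect size, grows in proportion to the sample size. The function name `critical_n` and the example numbers are hypothetical; the formulas in [105] may differ.

```python
def critical_n(statistic, n, critical_value):
    """Sample size at which the observed effect would be exactly significant.

    Assumes the statistic scales linearly with n for a fixed effect size,
    so statistic / n is treated as a constant per-observation effect.
    """
    if statistic <= 0 or n <= 0 or critical_value <= 0:
        raise ValueError("all arguments must be positive")
    return n * critical_value / statistic


# Hypothetical illustration: a chi-square of 38.41 on 1 df observed with n = 1000.
# 3.841 is the 5% critical value of the chi-square distribution with 1 df.
cn = critical_n(statistic=38.41, n=1000, critical_value=3.841)
print(round(cn))
```

Under this assumption a small value of $CN$ relative to $N$ signals an effect that is significant largely because the sample is large.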

Raftery (1995) [307] Bayesian model selection for a sociological audience. Problems with standard hypothesis tests: $p$-values with large $n$ and in multiple comparisons; selection from many (possibly nonnested) competing models; model uncertainty. Bayesian approach to these; derivation of BIC, examples of specific types of models, choice of `$n$', interpretation and relation to $p$-values. Data of Grusky and Hauser (1984) [170] used as an example of model selection in very large data sets. Bayesian model averaging. Discussion in [156] and [182], rejoinder in [308].
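
The link between BIC, Bayes factors and model averaging can be illustrated with a short sketch. It assumes the standard large-sample approximations: a BIC difference approximates twice the log Bayes factor, and, with equal prior model probabilities, posterior model probabilities are roughly proportional to $\exp(-\mathrm{BIC}_k/2)$. Lower BIC is taken to be better. The function names and BIC values below are hypothetical and serve only as an illustration.

```python
import math


def two_log_bayes_factor(bic_a, bic_b):
    """Approximate 2 log(Bayes factor) for model B against model A: BIC_A - BIC_B."""
    return bic_a - bic_b


def bic_model_weights(bics):
    """Approximate posterior model probabilities from BIC values.

    Assumes equal prior probabilities for all models. The smallest BIC is
    subtracted before exponentiating to avoid numerical underflow.
    """
    best = min(bics)
    rel = [math.exp(-0.5 * (b - best)) for b in bics]
    total = sum(rel)
    return [r / total for r in rel]


# Hypothetical BIC values for three competing models.
bics = [-34.2, -30.0, 5.0]
weights = bic_model_weights(bics)
for b, w in zip(bics, weights):
    print(f"BIC = {b:7.1f}  approximate posterior probability = {w:.3f}")
```

In model averaging, weights of this kind are used to combine estimates across models instead of conditioning on a single selected model.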

Gelman and Rubin (1995) [156] Discussion of Raftery (1995) [307]. Mostly critical: argue that BIC attempts to provide a rationale for selecting a model which does not fit. In his rejoinder, Raftery [308] comments that this is based on the use of LR tests to decide which model `fits' and on a misunderstanding of the prior implied by BIC. G & R argue for the distinction between statistical and practical significance, i.e. whether a model is acceptable depends also on the purposes for which it is to be used. In general, G & R de-emphasise model selection, preferring complex models (embedding the main choices in a flexible class of models), collection of further data (to make selection easier) and, sometimes, model averaging.

Hauser (1995) [182] Discussion of Raftery (1995) [307]. Very positive. Discussion of large-$n$ model selection problems in Grusky and Hauser (1984) [170] and the BIC resolution of them in Raftery (1986) [304]. Further examples of the use of BIC in sociological modelling. Recommendations for universal acceptance of BIC.

Weakliem (1998) [] Criticism of BIC for a sociological audience. Main points: (i) BIC implies (best approximates) a Bayes factor (BF) with a certain prior, which may or may not be sensible; (ii) the `sample size' $n$ in BIC is not well defined: it should be the amount of information in the sample, but this is not easy to define. Many other comments and suggested modifications to BIC.

Jouni Kuha 2003-07-16