Abstract
When a linear model is chosen by searching for the best subset among a set of candidate predictors, a fixed penalty such as that imposed by the Akaike information criterion may penalize model complexity inadequately, leading to biased model selection. We study resampling-based information criteria that aim to overcome this problem through improved estimation of the effective model dimension. The first proposed approach builds upon previous work on bootstrap-based model selection. We then propose a novel approach based on cross-validation. Simulations and analyses of a functional neuroimaging data set illustrate the strong performance of our resampling-based methods, which are implemented in a new R package.
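The idea of a resampling-based correction can be illustrated with a minimal sketch. The code below is not the authors' method or their R package; it is a hypothetical Python illustration of the general covariance-penalty idea behind such criteria: after best-subset selection, the effective model dimension can be estimated by a parametric bootstrap as df = Σᵢ cov(ŷᵢ, yᵢ)/σ², redoing the selection step on every bootstrap response. All function names, the choice of bootstrap centre, and the simulation settings are assumptions for illustration.

```python
import itertools
import numpy as np

def best_subset_fitted(X, y):
    """Fitted values from the predictor subset minimizing an AIC-style
    criterion (exhaustive search; intercept always included)."""
    n, p = X.shape
    best_crit, best_fit = np.inf, None
    for k in range(p + 1):
        for cols in itertools.combinations(range(p), k):
            Xs = np.column_stack([np.ones(n)] + [X[:, c] for c in cols])
            beta, *_ = np.linalg.lstsq(Xs, y, rcond=None)
            fit = Xs @ beta
            rss = float(np.sum((y - fit) ** 2))
            crit = n * np.log(rss / n) + 2 * (k + 1)  # nominal dimension k+1
            if crit < best_crit:
                best_crit, best_fit = crit, fit
    return best_fit

def bootstrap_effective_df(X, y, sigma, B=200, seed=0):
    """Parametric-bootstrap covariance estimate of effective dimension,
    with the best-subset search repeated on each bootstrap response."""
    rng = np.random.default_rng(seed)
    n = len(y)
    mu0 = best_subset_fitted(X, y)  # bootstrap centre (an assumed choice)
    Ystar = mu0 + sigma * rng.standard_normal((B, n))
    fits = np.array([best_subset_fitted(X, ys) for ys in Ystar])
    # per-observation covariance of fitted value and response across replicates
    cov = ((fits - fits.mean(0)) * (Ystar - Ystar.mean(0))).sum(0) / (B - 1)
    return float(cov.sum() / sigma**2)

# Toy simulation: true model is intercept-only, so any excess of the
# estimated df over the selected model's nominal size reflects the
# optimism of the selection step that a fixed AIC penalty ignores.
rng = np.random.default_rng(1)
n, p, sigma = 30, 4, 1.0
X = rng.standard_normal((n, p))
y = 1.0 + sigma * rng.standard_normal(n)
df_hat = bootstrap_effective_df(X, y, sigma, B=200, seed=2)
```

A resampling-based criterion then substitutes an estimate like `df_hat` for the nominal parameter count in the complexity penalty.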
| Original language | English |
| --- | --- |
| Pages (from-to) | 1161-1186 |
| Number of pages | 26 |
| Journal | Annals of the Institute of Statistical Mathematics |
| Volume | 64 |
| Issue number | 6 |
| DOIs | |
| State | Published - Dec 2012 |
| Externally published | Yes |
Bibliographical note
Funding Information: The first author's research is supported in part by National Science Foundation grant DMS-0907017. The authors thank Mike Milham, Eva Petkova, Thad Tarpey, Lee Dicker, and Tao Zhang for illuminating discussions; Zarrar Shehzad for assistance with the functional connectivity data; and the Associate Editor and referee, whose incisive comments led to major improvements in the paper.
Keywords
- Adaptive model selection
- Covariance inflation criterion
- Cross-validation
- Extended information criterion
- Functional connectivity
- Overoptimism
ASJC Scopus subject areas
- Statistics and Probability