Abstract
Regression of a scalar response on signal predictors, such as near-infrared (NIR) spectra of chemical samples, presents a major challenge when, as is typically the case, the dimension of the signals far exceeds their number. Most solutions to this problem reduce the dimension of the predictors either by regressing on components [e.g., principal component regression (PCR) and partial least squares (PLS)] or by smoothing methods, which restrict the coefficient function to the span of a spline basis. This article introduces functional versions of PCR and PLS, which combine both of the foregoing dimension-reduction approaches. Two versions of functional PCR are developed, both using B-splines and roughness penalties. The regularized-components version applies such a penalty to the construction of the principal components (i.e., it uses functional principal components), whereas the regularized-regression version incorporates a penalty in the regression. For the latter form of functional PCR, the penalty parameter may be selected by generalized cross-validation, restricted maximum likelihood (REML), or a minimum mean integrated squared error criterion. Proceeding similarly, we develop two versions of functional PLS. Asymptotic convergence properties of regularized-regression functional PCR are demonstrated. A simulation study and split-sample validation with several NIR spectroscopy data sets indicate that functional PCR and functional PLS, especially the regularized-regression versions with REML, offer advantages over existing methods in terms of both estimation of the coefficient function and prediction of future observations.
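To make the regularized-regression idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes signals sampled on a common equally spaced grid, and it replaces the B-spline basis described in the abstract with a plain second-difference penalty on the discretized coefficient function. The function name `penalized_pcr` and the parameters `n_components` and `lam` are hypothetical, and `lam` is treated as given rather than selected by generalized cross-validation or REML.

```python
import numpy as np

def penalized_pcr(X, y, n_components, lam):
    """Illustrative roughness-penalized PCR for discretized signals.

    X   : (n, p) array, one signal per row (assumed equally spaced grid)
    y   : (n,) array of scalar responses
    lam : nonnegative roughness penalty weight (assumed given)
    """
    x_mean = X.mean(axis=0)
    y_mean = y.mean()
    Xc = X - x_mean

    # Leading right singular vectors give the principal component directions.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    V = Vt[:n_components].T          # (p, q)
    Z = Xc @ V                       # component scores, (n, q)

    # Second-difference operator: a discrete stand-in for the second
    # derivative used in a roughness penalty (simplifying assumption).
    D = np.diff(np.eye(X.shape[1]), n=2, axis=0)   # (p - 2, p)
    DV = D @ V

    # Penalized least squares over coefficient functions beta = V @ gamma:
    # minimize ||y - Z gamma||^2 + lam * ||D V gamma||^2.
    gamma = np.linalg.solve(Z.T @ Z + lam * (DV.T @ DV), Z.T @ (y - y_mean))
    beta = V @ gamma
    intercept = y_mean - x_mean @ beta
    return intercept, beta
```

With `lam = 0` this sketch reduces to ordinary PCR. Increasing `lam` trades fidelity to the data for a smoother estimated coefficient function, which is the role the roughness penalty plays in the regularized-regression version described above.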
Original language | English |
---|---|
Pages (from-to) | 984-996 |
Number of pages | 13 |
Journal | Journal of the American Statistical Association |
Volume | 102 |
Issue number | 479 |
State | Published - Sep 2007 |
Externally published | Yes |
Bibliographical note
Funding Information: Philip T. Reiss is Assistant Professor, New York University Child Study Center, New York, NY 10016. R. Todd Ogden is Associate Professor, Department of Biostatistics, Columbia University, New York, NY 10032. The authors thank Chung Chang, Martin Lindquist, Marianthi Markatou, Ian McKeague, Martina Pavlicova, Eva Petkova, Ioana Schiopu-Kratina, and Hongtu Zhu for informative discussions; Hervé Cardot, Sijmen de Jong, Paul Eilers, and Brian Marx for graciously answering questions about their related work; Phil Hopke for making the spectroscopy data publicly available; and the joint editors, associate editor, and referees, whose comments significantly improved the article. The first author gratefully acknowledges support from the National Institute of Mental Health through grant 1 F31 MH73379-01A1.
Keywords
- B-splines
- Functional linear model
- Linear mixed model
- Multivariate calibration
- SIMPLS
- Signal regression
ASJC Scopus subject areas
- Statistics and Probability
- Statistics, Probability and Uncertainty