Please use this identifier to cite or link to this item:
http://hdl.handle.net/2451/31317
|
| Title: | Efficiency and Consistency for Regularization Parameter Selection in
Penalized Regression: Asymptotics and Finite-Sample Corrections |
| Authors: | Flynn, Cheryl J.; Hurvich, Clifford M.; Simonoff, Jeffrey S. |
| Keywords: | Akaike information criterion; Bayesian information criterion; Least
absolute shrinkage and selection operator; Model selection/Variable
selection; Penalized regression; Smoothly clipped absolute deviation |
| Issue Date: | 17-Nov-2011 |
| Publisher: | Stern School of Business, New York University |
| Series/Report no.: | SOR-2011-02 |
| Abstract: | This paper studies the asymptotic and finite-sample performance of
penalized regression methods when different selectors of the
regularization parameter are used under the assumption that the true
model is, or is not, included among the candidate models. In the latter
setting, we relax assumptions in the existing theory to show that
several classical information criteria are asymptotically efficient
selectors of the regularization parameter. In both settings, we assess
the finite-sample performance of these as well as other common selectors
and demonstrate that their performance can suffer due to sensitivity to
the number of variables that are included in the full model. As
alternatives, we propose two corrected information criteria which are
shown to outperform the existing procedures while still maintaining the
desired asymptotic properties. In the non-true model world, we relax
the assumption made in the literature that the true error variance is
known or that a consistent estimator is available to prove that Akaike's
information criterion (AIC), Cp, and generalized cross-validation (GCV)
themselves are asymptotically efficient selectors of the regularization
parameter, and we study their performance in finite samples. In classical
regression, AIC tends to select overly complex models when the dimension
of the maximum candidate model is large relative to the sample size.
Simulation studies suggest that AIC suffers from the same shortcomings
when used in penalized regression. We therefore propose the use of the
classical AICc as an alternative. In the true model world, a similar
investigation into the finite-sample properties of BIC reveals analogous
overfitting tendencies and leads us to further propose the use of a
corrected BIC (BICc). In their respective settings (whether the true
model is, or is not, among the candidate models), BICc and AICc have the
desired asymptotic properties and we use simulations to assess their
performance, as well as that of other selectors, in finite samples for
penalized regressions fit using the smoothly clipped absolute deviation
(SCAD) and least absolute shrinkage and selection operator (Lasso)
penalty functions. We find that AICc and 10-fold cross-validation
outperform the other selectors in terms of squared error loss, and BICc
avoids the tendency of BIC to select overly complex models when the
dimension of the maximum candidate model is large relative to the sample size. |
| URI: | http://hdl.handle.net/2451/31317 |
| Appears in Collections: | IOMS: Statistics Working Papers
|
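To illustrate the abstract's point that AIC can favor overly complex fits when the largest candidate model is big relative to the sample size, here is a minimal Python sketch comparing AIC with the classical AICc along a regularization path. It is not taken from the paper. It assumes the classical Hurvich-Tsai form of AICc and assumes that the degrees of freedom of each fit are approximated by its number of nonzero coefficients; the paper's exact definitions for penalized regression may differ. The function names and the path values are hypothetical and chosen only for illustration.

```python
import math


def aic(rss, n, df):
    """Classical Gaussian AIC (constants kept so it is comparable to aicc)."""
    return n * math.log(rss / n) + n + 2 * (df + 1)


def aicc(rss, n, df):
    """Classical corrected AIC; the penalty grows sharply as df approaches n."""
    if n - df - 2 <= 0:
        return math.inf  # correction undefined, so treat the fit as inadmissible
    return n * math.log(rss / n) + n * (n + df) / (n - df - 2)


def select(candidates, n, criterion):
    """Return the penalty value whose fit minimizes the given criterion."""
    best = min(candidates, key=lambda c: criterion(c[1], n, c[2]))
    return best[0]


# Made-up path for illustration only:
# (penalty value, residual sum of squares, number of nonzero coefficients)
path = [(1.0, 40.0, 1), (0.5, 26.0, 3), (0.1, 22.0, 8), (0.01, 4.0, 15)]
n = 20  # sample size, small relative to the 15 variables in the largest fit

print("AIC selects penalty: ", select(path, n, aic))
print("AICc selects penalty:", select(path, n, aicc))
```

On this made-up path, AIC chooses the smallest penalty (the 15-variable fit), while AICc chooses the 3-variable fit, because the two criteria differ by 2(df + 1)(df + 2)/(n - df - 2), a term that becomes large as df nears n.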
All items in Faculty Digital Archive are protected by copyright, with all rights reserved.
|