Title: Efficiency and Consistency for Regularization Parameter Selection in Penalized Regression: Asymptotics and Finite-Sample Corrections

Authors: Flynn, Cheryl J.
Hurvich, Clifford M.
Simonoff, Jeffrey S.
Keywords: Akaike information criterion; Bayesian information criterion; Least absolute shrinkage and selection operator; Model selection/variable selection; Penalized regression; Smoothly clipped absolute deviation.
Issue Date: 17-Nov-2011
Publisher: Stern School of Business, New York University
Series/Report no.: SOR-2011-02
Abstract: This paper studies the asymptotic and finite-sample performance of penalized regression methods when different selectors of the regularization parameter are used, under the assumption that the true model either is or is not included among the candidate models. In the latter setting, we relax assumptions in the existing theory to show that several classical information criteria are asymptotically efficient selectors of the regularization parameter. In both settings, we assess the finite-sample performance of these and other common selectors and demonstrate that their performance can suffer due to sensitivity to the number of variables included in the full model. As alternatives, we propose two corrected information criteria that are shown to outperform the existing procedures while maintaining the desired asymptotic properties. In the non-true-model world, we relax the assumption made in the literature that the true error variance is known or that a consistent estimator is available, proving that Akaike's information criterion (AIC), Cp, and generalized cross-validation (GCV) are themselves asymptotically efficient selectors of the regularization parameter, and we study their performance in finite samples. In classical regression, AIC tends to select overly complex models when the dimension of the maximum candidate model is large relative to the sample size. Simulation studies suggest that AIC suffers from the same shortcoming when used in penalized regression, so we propose the classical corrected AIC (AICc) as an alternative. In the true-model world, a similar investigation into the finite-sample properties of the Bayesian information criterion (BIC) reveals analogous overfitting tendencies and leads us to propose a corrected BIC (BICc). In their respective settings (whether the true model is, or is not, among the candidate models), BICc and AICc have the desired asymptotic properties, and we use simulations to assess their performance, as well as that of other selectors, in finite samples for penalized regressions fit using the smoothly clipped absolute deviation (SCAD) and least absolute shrinkage and selection operator (Lasso) penalty functions. We find that AICc and 10-fold cross-validation outperform the other selectors in terms of squared error loss, and that BICc avoids the tendency of BIC to select overly complex models when the dimension of the maximum candidate model is large relative to the sample size.
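To make the selection procedure concrete, the following is a minimal sketch (not the authors' code) of choosing the Lasso regularization parameter by minimizing the classical AICc of Hurvich and Tsai, AICc = n log(RSS/n) + n(n + k)/(n - k - 2), where the degrees of freedom k is estimated by the number of nonzero coefficients, a standard choice for the Lasso; the paper's own degrees-of-freedom definitions for penalized fits may differ. The function name aicc_select_lasso and the use of scikit-learn's lasso_path are illustrative assumptions.

    import numpy as np
    from sklearn.linear_model import lasso_path

    def aicc_select_lasso(X, y, n_alphas=100):
        """Pick the Lasso penalty minimizing classical AICc (illustrative sketch)."""
        n = len(y)
        # lasso_path fits no intercept, so center the data first.
        Xc = X - X.mean(axis=0)
        yc = y - y.mean()
        alphas, coefs, _ = lasso_path(Xc, yc, n_alphas=n_alphas)
        best_alpha, best_aicc = None, np.inf
        for j, alpha in enumerate(alphas):
            beta = coefs[:, j]
            rss = np.sum((yc - Xc @ beta) ** 2)
            k = np.count_nonzero(beta)      # df estimate: nonzero coefficients
            if n - k - 2 <= 0 or rss <= 0:  # correction term undefined; skip
                continue
            aicc = n * np.log(rss / n) + n * (n + k) / (n - k - 2)
            if aicc < best_aicc:
                best_alpha, best_aicc = alpha, aicc
        return best_alpha

Swapping in BIC or 10-fold cross-validation changes only the scoring line; the BICc correction proposed in the paper modifies the BIC penalty analogously (see the paper for its exact form).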
URI: http://hdl.handle.net/2451/31317
Appears in Collections: IOMS: Statistics Working Papers

Files in This Item:
File                     Size       Format
RegParSel_Nov2011.pdf    1.37 MB    Adobe PDF
suppmaterial.pdf         243.34 kB  Adobe PDF

