Full metadata record
DC Field | Value | Language
dc.contributor.author | Ding, Yufeng | -
dc.contributor.author | Simonoff, Jeffrey S. | -
dc.date.accessioned | 2008-05-25T13:58:55Z | -
dc.date.available | 2008-05-25T13:58:55Z | -
dc.date.issued | 2006-12-03 | -
dc.identifier.uri | http://hdl.handle.net/2451/26305 | -
dc.description.abstract | There are many different missing data methods used by classification tree algorithms, but few studies have been done comparing their appropriateness and performance. This paper provides both analytic and Monte Carlo evidence regarding the effectiveness of six popular missing data methods for classification trees. We show that in the context of classification trees, the relationship between the missingness and the dependent variable, rather than the standard missingness classification approach of Little and Rubin (2002) (missing completely at random (MCAR), missing at random (MAR) and not missing at random (NMAR)), is the most helpful criterion to distinguish different missing data methods. We make recommendations as to the best method to use in various situations. The paper concludes with discussion of a real data set related to predicting bankruptcy of a firm. | en
dc.language | English | EN
dc.language.iso | en_US | en
dc.publisher | Stern School of Business, New York University | en
dc.relation.ispartofseries | SOR-2006-3 | en
dc.subject | C4.5 | en
dc.subject | CART | en
dc.subject | Classification tree | en
dc.subject | Separate class | en
dc.title | An Investigation of Missing Data Methods for Classification Trees | en
dc.type | Working Paper | en
dc.description.series | Statistics Working Papers Series | EN
Appears in Collections: IOMS: Statistics Working Papers

Files in This Item:
File | Description | Size | Format
06-03.pdf | | 371.85 kB | Adobe PDF


Items in FDA are protected by copyright, with all rights reserved, unless otherwise indicated.