Please use this identifier to cite or link to this item:
http://hdl.handle.net/2451/27770
|
| Title: | Tree Induction vs. Logistic Regression: A Learning-Curve Analysis |
| Authors: | Perlich, Claudia; Provost, Foster; Simonoff, Jeffrey |
| Keywords: | decision trees; learning curves; logistic regression; ROC analysis; tree induction |
| Issue Date: | 1-Jun-2003 |
| Publisher: | Journal of Machine Learning Research |
| Citation: | 4 (2003) pp. 211-255 |
| Series/Report no.: | CeDER-PP-2003-05 |
| Abstract: | Tree induction and logistic regression are two standard, off-the-shelf
methods for building models for classification. We present a large-scale
experimental comparison of logistic regression and tree induction,
assessing classification accuracy and the quality of rankings based on
class-membership probabilities. We use a learning-curve analysis to
examine the relationship of these measures to the size of the training
set. The results of the study show several things. (1) Contrary to some
prior observations, logistic regression does not generally outperform
tree induction. (2) More specifically, and not surprisingly, logistic
regression is better for smaller training sets and tree induction for
larger data sets. Importantly, this often holds for training sets drawn
from the same domain (that is, the learning curves cross), so
conclusions about induction-algorithm superiority on a given domain must
be based on an analysis of the learning curves. (3) Contrary to
conventional wisdom, tree induction is effective at producing
probability-based rankings, although apparently comparatively less so
for a given training-set size than at making classifications. Finally,
(4) the domains on which tree induction and logistic regression are
ultimately preferable can be characterized surprisingly well by a simple
measure of the separability of signal from noise. |
| URI: | http://hdl.handle.net/2451/27770 |
| Appears in Collections: | CeDER Published Papers |