Please use this identifier to cite or link to this item:
http://hdl.handle.net/2451/27808
| Title: | Pointwise ROC Confidence Bounds: An Empirical Evaluation |
| Authors: | Macskassy, Sofus; Provost, Foster; Rosset, Saharon |
| Issue Date: | 2005 |
| Citation: | Proceedings of the ICML 2005 workshop on ROC |
| Series/Report no.: | CeDER-PP-2005-06 |
| Abstract: | This paper is about constructing and evaluating pointwise confidence
bounds on an ROC curve. We describe four confidence-bound methods, two
from the medical field and two used previously in machine learning
research. We evaluate whether the bounds indeed contain the relevant
operating point on the “true” ROC curve with a confidence of
1− . We then evaluate pointwise confidence bounds on the region
where the future performance of a model is expected to lie. For
evaluation we use a synthetic world representing “binormal”
distributions: the classification scores for positive and negative
instances are drawn from (separate) normal distributions. For the
“true-curve” bounds, all methods are sensitive to how well
the distributions are separated, which corresponds directly to the area
under the ROC curve. One method produces bounds that are universally too
loose, another universally too tight, and the remaining two are close to
the desired containment although containment breaks down at the extremes
of the ROC curve. As would be expected, all methods fail when used to
contain “future” ROC curves. Widening the bounds to account
for the increased uncertainty yields identical qualitative results to
the “true-curve” evaluation. We conclude by recommending a
simple, very efficient method (vertical averaging) for large sample
sizes and a more computationally expensive method (kernel estimation)
for small sample sizes. |
| URI: | http://hdl.handle.net/2451/27808 |
| Appears in Collections: | CeDER Published Papers |
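
The "binormal" synthetic world described in the abstract can be illustrated with a minimal sketch. This is not the authors' code and does not implement any of the four confidence-bound methods; it only shows how separate normal score distributions relate to the area under the ROC curve. The function names, parameter values, and sample sizes below are assumptions chosen for illustration.

```python
import math
import random


def binormal_auc(mu_neg, sigma_neg, mu_pos, sigma_pos):
    """Closed-form AUC when negative scores are N(mu_neg, sigma_neg^2)
    and positive scores are N(mu_pos, sigma_pos^2), drawn independently."""
    d = (mu_pos - mu_neg) / math.sqrt(sigma_neg ** 2 + sigma_pos ** 2)
    # Standard normal CDF evaluated at d, written with the error function.
    return 0.5 * (1.0 + math.erf(d / math.sqrt(2.0)))


def empirical_auc(neg_scores, pos_scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count half."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


def sample_scores(rng, mu, sigma, n):
    """Draw n classification scores from a normal distribution."""
    return [rng.gauss(mu, sigma) for _ in range(n)]


# Hypothetical settings: unit-variance classes whose means are one unit apart.
rng = random.Random(0)
neg = sample_scores(rng, 0.0, 1.0, 500)
pos = sample_scores(rng, 1.0, 1.0, 500)

print("closed-form AUC:", binormal_auc(0.0, 1.0, 1.0, 1.0))
print("empirical AUC:  ", empirical_auc(neg, pos))
```

Moving the two means further apart raises the closed-form AUC toward 1, which is the sense in which the separation of the distributions corresponds directly to the area under the ROC curve.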