
Machine Learning from Imbalanced Data Sets 101

Authors: Provost, Foster
Issue Date: 17-Nov-2008
Series/Report no.: CeDER-PP-2000-02
Abstract: For research to progress most effectively, we first should establish common ground regarding just what is the problem that imbalanced data sets present to machine learning systems. Why and when should imbalanced data sets be problematic? When is the problem simply an artifact of easily rectified design choices? I will try to pick the low-hanging fruit and share them with the rest of the workshop participants. Specifically, I would like to discuss what the problem is not. I hope this will lead to a profitable discussion of what the problem indeed is, and how it might be addressed most effectively.
Description: Invited paper for the AAAI'2000 Workshop on Imbalanced Data Sets.
Appears in Collections: CeDER Published Papers

Files in This Item:
File: CPP-02-00.pdf
Size: 18.64 kB
Format: Adobe PDF

Items in FDA are protected by copyright, with all rights reserved, unless otherwise indicated.