RE-EM Trees: A New Data Mining Approach for Longitudinal Data

Authors: Sela, Rebecca J.
Simonoff, Jeffrey S.
Issue Date: 10-Jun-2009
Series/Report no.: SOR-2009-03
Abstract: Longitudinal data refer to the situation where repeated observations are available for each sampled individual. Methodologies that take this structure into account allow for systematic differences between individuals that are not related to covariates. A standard methodology in the statistics literature for this type of data is the random effects model, where these differences between individuals are represented by so-called “effects” that are estimated from the data. This paper presents a methodology that combines the flexibility of tree-based estimation methods with the structure of random effects models for longitudinal data. We apply the resulting estimation method, called the RE-EM tree, to pricing in online transactions, showing that the RE-EM tree is less sensitive to parametric assumptions and provides improved predictive power compared to linear models with random effects and regression trees without random effects. We also perform extensive simulation experiments to show that the estimator improves predictive performance relative to regression trees without random effects and is comparable or superior to using linear models with random effects in more general situations.
Appears in Collections: IOMS: Statistics Working Papers

Files in This Item:
File: Trees_with_Random_Effects 3-28-11.pdf
Size: 337.25 kB
Format: Adobe PDF