Please use this identifier to cite or link to this item:
http://hdl.handle.net/2451/14118
|
| Title: | ACORA: Distribution-Based Aggregation for Relational Learning from Identifier Attributes |
| Authors: | Perlich, Claudia; Provost, Foster |
| Issue Date: | Feb-2005 |
| Publisher: | Stern School of Business, New York University |
| Series/Report no.: | CeDER-04-04 |
| Abstract: | Feature construction through aggregation plays an essential role in
modeling relational domains with one-to-many relationships between
tables. One-to-many relationships lead to bags (multisets) of related
entities, from which predictive information must be captured. This paper
focuses on aggregation from categorical attributes that can take many
values (e.g., object identifiers). We present a novel aggregation method
as part of a relational learning system, ACORA, that combines the use of
vector distance and meta-data about the class-conditional distributions
of attribute values. We provide a theoretical foundation for this
approach by deriving a "relational fixed-effect" model within a
Bayesian framework, and discuss the implications of identifier
aggregation on the expressive power of the induced model. One advantage
of using identifier attributes is the circumvention of limitations
caused either by missing/unobserved object properties or by independence
assumptions. Finally, we show empirically that the novel aggregators can
generalize in the presence of identifier (and other high-dimensional)
attributes, and also explore the limitations of the applicability of the methods. |
| URI: | http://hdl.handle.net/2451/14118 |
| Appears in Collections: | CeDER Working Papers; IOMS: Information Systems Working Papers
|
|