Faculty Digital Archive


Please use this identifier to cite or link to this item: http://hdl.handle.net/2451/14130

Title: The Relational Vector-space Model
Authors: Bernstein, Abraham
Clearwater, Scott
Provost, Foster
Keywords: Relational Data Mining
Vector-space models
Industry Classification
Issue Date: 1-Jan-2003
Publisher: Stern School of Business, New York University
Series/Report no.: IS-03-02
Abstract: This paper addresses the classification of linked entities. We introduce a relational vector-space (VS) model (in analogy to the VS model used in information retrieval) that abstracts the linked structure, representing entities by vectors of weights. Given labeled data as background knowledge/training data, classification procedures can be defined for this model, including a straightforward, "direct" model using weighted adjacency vectors. Using a large set of tasks from the domain of company affiliation identification, we demonstrate that such classification procedures can be effective. We then examine the method in more detail, showing that, as expected, the classification performance correlates with the relational autocorrelation of the data set. We then turn the tables and use the relational VS scores as a way to analyze/visualize the relational autocorrelation present in a complex linked structure. The main contribution of the paper is to introduce the relational VS model as a potentially useful addition to the toolkit for relational data mining. It could provide useful constructed features for domains with low to moderate relational autocorrelation; it may be effective by itself for domains with high levels of relational autocorrelation; and it provides a useful abstraction for analyzing the properties of linked data.
URI: http://hdl.handle.net/2451/14130
Appears in Collections: IOMS: Information Systems Working Papers

Files in This Item:

File: IS-03-02.pdf
Size: 3.31 MB
Format: Adobe PDF

Items in Faculty Digital Archive are protected by copyright, with all rights reserved, unless otherwise indicated.


The contents of the FDA may be subject to copyright, be offered under a Creative Commons license, or be in the public domain.
Please check items for rights statements. For information about NYU’s copyright policy, see http://www.nyu.edu/footer/copyright-and-fair-use.html 