Please use this identifier to cite or link to this item:
http://hdl.handle.net/2451/14157
|
| Title: | Discovering Knowledge from Relational Data Extracted from Business News |
| Authors: | Bernstein, Abraham; Clearwater, Scott; Hill, Shawndra; Perlich, Claudia; Provost, Foster |
| Issue Date: | 2002 |
| Publisher: | Stern School of Business, New York University |
| Series/Report no.: | IS-02-03 |
| Abstract: | Thousands of business news stories (including press releases, earnings
reports, general business news, etc.) are released each day. Recently,
information technology advances have partially automated the processing
of documents, reducing the amount of text that must be read. Current
techniques (e.g., text classification and information extraction) for
full-text analysis for the most part are limited to discovering
information that can be found in single documents. Often, however,
important information does not reside in a single document, but in the
relationships between information distributed over multiple documents.
This paper reports on an investigation into whether knowledge can be
discovered automatically from relational data extracted from large
corpora of business news stories. We use a combination of information
extraction, network analysis, and statistical techniques. We show that
relationally interlinked patterns distributed over multiple documents
can indeed be extracted, and (specifically) that knowledge about
companies'
interrelationships can be discovered. We evaluate the extracted
relationships in several ways: we give a broad visualization of related
companies, showing intuitive industry clusters; we use network analysis
to ask who the central players are; and finally, we show that the
extracted interrelationships can be used for important tasks, such as
classifying companies by industry membership. |
| URI: | http://hdl.handle.net/2451/14157 |
| Appears in Collections: | IOMS: Information Systems Working Papers
|