Full metadata record
DC Field | Value | Language
dc.contributor.author | Bernstein, Abraham | -
dc.contributor.author | Clearwater, Scott | -
dc.contributor.author | Hill, Shawndra | -
dc.contributor.author | Perlich, Claudia | -
dc.contributor.author | Provost, Foster | -
dc.date.accessioned | 2005-11-29T20:37:37Z | -
dc.date.available | 2005-11-29T20:37:37Z | -
dc.date.issued | 2002 | -
dc.identifier.uri | http://hdl.handle.net/2451/14147 | -
dc.description.abstract | Thousands of business news stories (including press releases, earnings reports, general business news, etc.) are released each day. Recently, information technology advances have partially automated the processing of documents, reducing the amount of text that must be read. Current techniques (e.g., text classification and information extraction) for full-text analysis for the most part are limited to discovering information that can be found in single documents. Often, however, important information does not reside in a single document, but in the relationships between information distributed over multiple documents. This paper reports on an investigation into whether knowledge can be discovered automatically from relational data extracted from large corpora of business news stories. We use a combination of information extraction, network analysis, and statistical techniques. We show that relationally interlinked patterns distributed over multiple documents can indeed be extracted, and (specifically) that knowledge about companies’ interrelationships can be discovered. We evaluate the extracted relationships in several ways: we give a broad visualization of related companies, showing intuitive industry clusters; we use network analysis to ask who are the central players, and finally, we show that the extracted interrelationships can be used for important tasks, such as for classifying companies by industry membership. | en
dc.format.extent | 1026159 bytes | -
dc.format.mimetype | application/pdf | -
dc.language | English | EN
dc.language.iso | en_US | -
dc.publisher | Stern School of Business, New York University | en
dc.relation.ispartofseries | IS-02-03 | -
dc.title | Discovering Knowledge from Relational Data Extracted from Business News | en
dc.type | Working Paper | en
dc.description.series | Information Systems Working Papers Series | EN
Appears in Collections: IOMS: Information Systems Working Papers

Files in This Item:
File | Description | Size | Format
IS-02-03.pdf | | 1 MB | Adobe PDF


Items in FDA are protected by copyright, with all rights reserved, unless otherwise indicated.