Full metadata record
DC Field | Value | Language
dc.contributor.author | Ipeirotis, Panagiotis | -
dc.contributor.author | Ntoulas, Alexandros | -
dc.contributor.author | Cho, Junghoo | -
dc.contributor.author | Gravano, Luis | -
dc.date.accessioned | 2008-12-09T15:57:12Z | -
dc.date.available | 2008-12-09T15:57:12Z | -
dc.date.issued | 2007-08 | -
dc.identifier.citation | ACM Transactions on Database Systems (TODS), vol. 32, no. 3, article 14, August 2007 | en
dc.identifier.uri | http://hdl.handle.net/2451/27822 | -
dc.description.abstract | Large amounts of (often valuable) information are stored in web-accessible text databases. “Metasearchers” provide unified interfaces to query multiple such databases at once. For efficiency, metasearchers rely on succinct statistical summaries of the database contents to select the best databases for each query. So far, database selection research has largely assumed that databases are static, so the associated statistical summaries do not evolve over time. However, databases are rarely static and the statistical summaries that describe their contents need to be updated periodically to reflect content changes. In this article, we first report the results of a study showing how the content summaries of 152 real web databases evolved over a period of 52 weeks. Then, we show how to use “survival analysis” techniques in general, and Cox’s proportional hazards regression in particular, to model database changes over time and predict when we should update each content summary. Finally, we exploit our change model to devise update schedules that keep the summaries up to date by contacting databases only when needed, and then we evaluate the quality of our schedules experimentally over real web databases. | en
dc.description.sponsorship | NYU, Stern School of Business, IOMS Department, Center for Digital Economy Research | en
dc.format.extent | 622063 bytes | -
dc.format.mimetype | application/pdf | -
dc.language.iso | en_US | en
dc.publisher | ACM Transactions on Database Systems | en
dc.relation.ispartofseries | CeDER-PP-2007-14 | en
dc.subject | metasearching | en
dc.subject | text database selection | en
dc.subject | distributed information retrieval | en
dc.title | Modeling and Managing Changes in Text Databases | en
dc.type | Article | en
Appears in Collections: CeDER Published Papers

Files in This Item:
File | Description | Size | Format
CeDER-PP-2007-14.pdf | | 607.48 kB | Adobe PDF


Items in FDA are protected by copyright, with all rights reserved, unless otherwise indicated.