Title: Investigating Multilingual, Multi-script Support in Lucene/Solr Library Applications

Authors: Barnett, Jeffrey
Lovins, Daniel
Novak, Audrey
Riley, Charles
Suzuki, Keiko
Keywords: Text processing (Computer science); Character sets (Data processing); Cataloging of foreign language publications--Data processing
Issue Date: 3-Jun-2010
Citation: Barnett, Jeffrey and Daniel Lovins, Audrey Novak, Charles Riley, Keiko Suzuki. "Investigating Multilingual, Multi-script Support in Lucene/Solr Library Applications" (2010)
Abstract: Yale has developed over many years a highly-structured, high-quality multilingual catalog of bibliographic data. Almost 50% of the collection represents non-English materials in over 650 languages, and includes many different non-Roman scripts. Faculty, students, researchers, and staff would like to make full use of this original script content for resource discovery. While the underlying textual data are in place, effective indexing, retrieval and display functionality for the non-Roman script content is not available within our bibliographic discovery applications, Orbis and Yufind. Opportunities now exist in the Unicode, Lucene/Solr computing environment to bridge the functionality gap and achieve internationalization of the Yale Library catalog. While most parts of this study focus on the Yale environment, in the absence of other such studies it is hoped that the findings will be of interest to a much larger community.
URI: http://hdl.handle.net/2451/38726
Rights: This is a public version of an internal report, for which, to the best of our knowledge, no rights have been reserved
Appears in Collections: Library Collection

Files in This Item:
File: Yale_Arcadia_Report_on_Multilingual_Lucene.pdf
Size: 2.41 MB
Format: Adobe PDF

