Localization of the Open Source Full-text Retrieval Engine Based on Lucene
Wu Pengfei1, Ma Fengjuan2, Li Wenge1, Guo Peng1
1 (Library of Shijiazhuang University, Shijiazhuang 050035, China)
2 (School of Humanities and Social Sciences, Shijiazhuang University of Economics, Shijiazhuang 050031, China)
Abstract This paper introduces the system architecture, the indexing and retrieval process, and the language analyzers of Lucene. To address Lucene's limitation that it can segment Chinese text only into single-character and two-character tokens, the paper develops a Chinese-English language analyzer, ZH_CNAnalyzer. Finally, an indexing and retrieval example using ZH_CNAnalyzer is given.
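To make the limitation concrete, the following is a minimal sketch of single-character and two-character segmentation. It is plain Java written for illustration only: the class `NGramSegmenter` and the method `ngrams` are hypothetical names, and the code is neither Lucene's own analyzer code nor the ZH_CNAnalyzer described in the paper.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration class; not part of Lucene or ZH_CNAnalyzer.
class NGramSegmenter {
    // Split text into overlapping tokens of n characters each
    // (n = 1: single-character tokens, n = 2: two-character tokens).
    static List<String> ngrams(String text, int n) {
        List<String> tokens = new ArrayList<>();
        int[] codePoints = text.codePoints().toArray();
        for (int i = 0; i + n <= codePoints.length; i++) {
            tokens.add(new String(codePoints, i, n));
        }
        return tokens;
    }

    public static void main(String[] args) {
        String text = "全文检索"; // "full-text retrieval"
        System.out.println(ngrams(text, 1)); // [全, 文, 检, 索]
        System.out.println(ngrams(text, 2)); // [全文, 文检, 检索]
    }
}
```

In the two-character output, the token 文检 straddles the boundary between the words 全文 (full text) and 检索 (retrieval) and is not itself a word. Tokens of this kind are what a dictionary-aware analyzer would presumably aim to avoid.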
Received: 03 March 2009
Published: 25 April 2009
Corresponding Author:
Pengfei Wu
E-mail: wupengfei_2000@163.com
About authors: Wu Pengfei, Ma Fengjuan, Li Wenge, Guo Peng