Design and Implementation of a Chinese Word Dictionary Segmentation Module Based on Lucene
Xiang Hui 1, Guo Yiping 2, Wang Liang 1
1 (Department of Control Science and Engineering, Huazhong University of Science and Technology, Wuhan 430074, China)
2 (Huazhong University of Science and Technology Library, Wuhan 430074, China)
Abstract: This paper introduces the construction of the language analyzer in Lucene, and designs and implements a Chinese word segmentation module based on the forward maximum matching (FMM) algorithm. The module processes Chinese text correctly and efficiently in a search engine built on Lucene.
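The forward maximum matching algorithm named in the abstract can be sketched as follows. This is a minimal illustration only, assuming a plain in-memory `HashSet` dictionary; the paper's actual dictionary mechanism and its wiring into a Lucene `Analyzer` are not reproduced here, and the class and method names are hypothetical:

```java
import java.util.*;

// Minimal sketch of forward maximum matching (FMM) segmentation.
// Assumption: the dictionary fits in memory as a HashSet of words.
public class FmmSegmenter {
    private final Set<String> dict;
    private final int maxWordLen;

    public FmmSegmenter(Collection<String> words) {
        this.dict = new HashSet<>(words);
        int max = 1;
        for (String w : words) max = Math.max(max, w.length());
        this.maxWordLen = max;
    }

    // Scan left to right; at each position take the longest dictionary
    // word starting there, falling back to a single character.
    public List<String> segment(String text) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < text.length()) {
            int end = Math.min(text.length(), i + maxWordLen);
            String match = null;
            for (int j = end; j > i; j--) {
                String cand = text.substring(i, j);
                if (dict.contains(cand)) { match = cand; break; }
            }
            if (match == null) match = text.substring(i, i + 1);
            tokens.add(match);
            i += match.length();
        }
        return tokens;
    }
}
```

In a Lucene-based engine, a segmenter of this kind would typically be wrapped in a custom `Tokenizer` and exposed through an `Analyzer` subclass so that both indexing and query parsing use the same segmentation.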
|
Received: 19 May 2006
Published: 25 August 2006
|
|
Corresponding author:
Xiang Hui
E-mail: xcaids@126.com
|
About the authors: Xiang Hui, Guo Yiping, Wang Liang