New Technology of Library and Information Service  2006, Vol. 1 Issue (8): 46-50    DOI: 10.11925/infotech.1003-3513.2006.08.10
Design and Implementation of Chinese Words Dictionary Segmentation Module Based on Lucene
Xiang Hui¹, Guo Yiping², Wang Liang
¹ Department of Control Science and Engineering, Huazhong University of Science and Technology, Wuhan 430074, China
² Huazhong University of Science and Technology Library, Wuhan 430074, China
Abstract  

This paper introduces the structure of the language analyzer in Lucene, then designs and implements a dictionary-based Chinese word segmentation module that uses the forward maximum matching (FMM) algorithm. The module lets a search engine built on Lucene process Chinese text effectively and efficiently.
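
As an illustration of the forward maximum matching idea named in the abstract, the following is a minimal sketch, not the authors' module. The class name `FmmSegmenter`, the method `segment`, and the `maxWordLength` parameter are hypothetical names chosen for this example, and the dictionary is assumed to fit in an in-memory `Set`. At each position the sketch tries the longest candidate first and shrinks it one character at a time until a dictionary word is found, falling back to a single character.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical helper for illustration only; not taken from the paper.
final class FmmSegmenter {

    // Splits text into tokens by repeatedly taking the longest dictionary
    // entry that starts at the current position (forward maximum matching).
    // Assumes the text contains no surrogate pairs, so one char is one character.
    static List<String> segment(String text, Set<String> dictionary, int maxWordLength) {
        List<String> tokens = new ArrayList<>();
        int window = Math.max(1, maxWordLength); // guard against a non-positive limit
        int pos = 0;
        while (pos < text.length()) {
            int end = Math.min(pos + window, text.length());
            // Shrink the candidate from the right until it is a dictionary word;
            // a single character is always accepted as the fallback token.
            while (end > pos + 1 && !dictionary.contains(text.substring(pos, end))) {
                end--;
            }
            tokens.add(text.substring(pos, end));
            pos = end;
        }
        return tokens;
    }
}
```

With a dictionary containing 中文, 分词, 中文分词 and 模块, a call such as `FmmSegmenter.segment("中文分词模块", dictionary, 4)` yields the two tokens 中文分词 and 模块, because the longest match wins at each position. In a Lucene setting, logic of this kind would typically sit inside a custom `Tokenizer` that an `Analyzer` wires into indexing and querying; how the paper's module does this is not stated in the abstract.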

Key words: Search engine; Lucene; Chinese word segmentation; Forward maximum matching algorithm
Received: 19 May 2006      Published: 25 August 2006
CLC number: G254
Corresponding Authors: Xiang Hui     E-mail: xcaids@126.com
About authors: Xiang Hui, Guo Yiping, Wang Liang

Cite this article:

Xiang Hui, Guo Yiping, Wang Liang. Design and Implementation of Chinese Words Dictionary Segmentation Module Based on Lucene. New Technology of Library and Information Service, 2006, 1(8): 46-50.

URL:

https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.1003-3513.2006.08.10     OR     https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2006/V1/I8/46

