The Improvement in a Chinese Word Segmentation Based on Hash Algorithm
Yao Xingshan
(Department of Information Management, Nanjing University, Nanjing 210093, China)
Abstract: This paper introduces a new algorithm for Chinese word segmentation based on a new data structure for the Chinese dictionary. Theoretical analysis and experiments show that this data structure considerably improves efficiency.
Received: 28 November 2007
Published: 25 March 2008
Corresponding Author: Yao Xingshan
E-mail: ywhavoc@126.com
About the author: Yao Xingshan