New Technology of Library and Information Service  2008, Vol. 24 Issue (3): 78-81    DOI: 10.11925/infotech.1003-3513.2008.03.14
The Improvement in a Chinese Word Segmentation Based on Hash Algorithm
Yao Xingshan
(Department of Information Management, Nanjing University, Nanjing 210093, China)
Abstract  

This paper introduces a new algorithm for Chinese word segmentation, based on a new data structure for the Chinese dictionary. Theoretical analysis and experiments show that this data structure is considerably more efficient.

Key words: Chinese word segmentation; Chinese information processing; Data structure; Hash algorithm
Received: 28 November 2007      Published: 25 March 2008
CLC Number: TP393
Corresponding Author: Yao Xingshan, E-mail: ywhavoc@126.com
About author: Yao Xingshan

Cite this article:

Yao Xingshan. The Improvement in a Chinese Word Segmentation Based on Hash Algorithm. New Technology of Library and Information Service, 2008, 24(3): 78-81.

URL:

http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.1003-3513.2008.03.14     OR     http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2008/V24/I3/78

