New Technology of Library and Information Service  2008, Vol. 24 Issue (5): 39-43    DOI: 10.11925/infotech.1003-3513.2008.05.07
The Research of Character-Position-Based Chinese Word Segmentation
Zhang Jinzhu   Zhang Dong   Wang Huilin
(Institute of Scientific and Technical Information of China, Beijing 100038, China)
This paper reviews the current state of Chinese word segmentation and introduces several representative approaches. It then proposes a character-position-based segmentation method that takes the Chinese character as the smallest unit. The method expresses the probability distribution of a word through the probability distributions of its characters, so it performs much better than other approaches in unknown word recognition. The method is implemented with a machine-learning technique, maximum entropy, and two experiments are used to compare and analyze the results.
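To illustrate what treating the character as the smallest unit can look like, the following minimal sketch converts a segmented sentence into per-character position labels and back. The abstract does not state which label set the authors use; the four-label B/M/E/S scheme (begin, middle, end, single) and the function names `words_to_tags` and `tags_to_words` are assumptions made for illustration only. A maximum entropy classifier would predict these labels from character context; that step is not shown here.

```python
def words_to_tags(words):
    """Label each character with its assumed position in its word:
    B (begin), M (middle), E (end), or S (single-character word)."""
    tagged = []
    for word in words:
        if len(word) == 1:
            tagged.append((word, "S"))
        else:
            tagged.append((word[0], "B"))
            tagged.extend((ch, "M") for ch in word[1:-1])
            tagged.append((word[-1], "E"))
    return tagged


def tags_to_words(tagged):
    """Rebuild words from (character, label) pairs.
    A word is closed whenever an E or S label is seen."""
    words, current = [], ""
    for ch, tag in tagged:
        current += ch
        if tag in ("E", "S"):
            words.append(current)
            current = ""
    if current:  # tolerate a label sequence that ends mid-word
        words.append(current)
    return words


if __name__ == "__main__":
    sentence = ["中文", "分词", "的", "研究"]
    labels = words_to_tags(sentence)
    print(labels)
    print(tags_to_words(labels))
```

Because the labels attach to characters rather than to dictionary entries, a word never seen in training can still be recovered as long as its characters receive a consistent label sequence, which is the intuition behind the unknown-word claim in the abstract.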

Key words: Chinese word segmentation; Character-position; Maximum entropy; Unknown word recognition
Received: 28 December 2007      Published: 25 May 2008



About authors: Zhang Jinzhu, Zhang Dong, Wang Huilin

Cite this article:

Zhang Jinzhu, Zhang Dong, Wang Huilin. The Research of Character-Position-Based Chinese Word Segmentation. New Technology of Library and Information Service, 2008, 24(5): 39-43.

