New Technology of Library and Information Service  2012, Vol. Issue (11): 40-46    DOI: 10.11925/infotech.1003-3513.2012.11.07
Research of Mining the Category Knowledge Based on English-Chinese Humanities and Social Sciences Parallel Corpus in Phrase Level
Wang Dongbo1, Han Pu2, Shen Si2, Wei Xiangqing3
1. College of Information Science and Technology, Nanjing Agricultural University, Nanjing 210095, China;
2. School of Information Management, Nanjing University, Nanjing 210093, China;
3. Bilingual Dictionary Research Center, Nanjing University, Nanjing 210093, China
Abstract  Using an established clustering algorithm, this paper mines category knowledge from a phrase-level English-Chinese parallel corpus in the humanities and social sciences. The clustering and morphological conversion algorithms are chosen according to the experimental data and the specific research needs. A comparison of word-level knowledge clustering on Chinese, English, and English-Chinese features shows that bilingual word features perform better than monolingual ones. The mined category knowledge can be applied directly to knowledge bases and machine translation systems, and the study also explores how English and Chinese words are expressed in the course of mining category knowledge.
Key words  CSSCI; English-Chinese parallel corpus in phrase level; Bisecting K-means clustering algorithm; Category knowledge
Received: 09 October 2012      Published: 06 February 2013
Cite this article:

Wang Dongbo, Han Pu, Shen Si, Wei Xiangqing. Research of Mining the Category Knowledge Based on English-Chinese Humanities and Social Sciences Parallel Corpus in Phrase Level. New Technology of Library and Information Service, 2012, (11): 40-46.

