New Technology of Library and Information Service  2014, Vol. 30 Issue (1): 28-35    DOI: 10.11925/infotech.1003-3513.2014.01.05
Experimental Study of Multilingual Text Clustering
Deng Sanhong, Wan Jiexi, Wang Hao, Liu Xiwen
School of Information Management, Nanjing University, Nanjing 210093, China
Abstract  [Objective] To analyze the performance, key issues and future direction of the characteristics translation and LSI methods in cross-language text clustering. [Methods] A total of 2736 Chinese-English bilingual news texts were collected from several bilingual websites; clustering experiments were run with both methods, and the results were compared on recall rate, accuracy and F value. [Results] The characteristics translation method improves clustering, while the LSI method does not achieve good results because of its time and space complexity. [Limitations] The sample needs to be expanded, and the LSI experiment needs to be repeated in a high-performance computing environment. [Conclusions] The characteristics translation method needs a more effective translation system, and the LSI method needs to address its computational complexity and the selection of the K value, among other issues.
Key words: Cross-language text clustering; Characteristics translation; LSI
Received: 14 February 2014      Published: 14 February 2014
CLC Number: TP391

Deng Sanhong, Wan Jiexi, Wang Hao, Liu Xiwen. Experimental Study of Multilingual Text Clustering. New Technology of Library and Information Service, 2014, 30(1): 28-35.
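
The abstract compares the two methods on recall rate, accuracy and F value but does not give the formulas. The following is a minimal sketch of one common way to score a clustering against known categories, not the authors' evaluation code. It assumes that "accuracy" denotes precision (the share of a cluster that belongs to the category) and that each category is matched to the cluster giving the highest F value; the function name `cluster_scores` and the toy labels are hypothetical.

```python
from collections import Counter


def cluster_scores(labels_true, labels_pred):
    """Score a clustering against known categories.

    Returns (per_class, overall_f), where per_class maps each category to the
    (precision, recall, F) of its best-matching cluster and overall_f is the
    category-size-weighted mean of those best F values.
    Assumption: this best-match scheme is illustrative, not taken from the paper.
    """
    n = len(labels_true)
    class_sizes = Counter(labels_true)
    cluster_sizes = Counter(labels_pred)
    # Number of documents shared by each (category, cluster) pair.
    overlap = Counter(zip(labels_true, labels_pred))

    per_class = {}
    overall_f = 0.0
    for cls, cls_size in class_sizes.items():
        best = (0.0, 0.0, 0.0)
        for clu, clu_size in cluster_sizes.items():
            common = overlap[(cls, clu)]
            if common == 0:
                continue
            precision = common / clu_size
            recall = common / cls_size
            f_value = 2 * precision * recall / (precision + recall)
            if f_value > best[2]:
                best = (precision, recall, f_value)
        per_class[cls] = best
        overall_f += cls_size / n * best[2]
    return per_class, overall_f


# Toy data (hypothetical): six news texts, two categories, two clusters.
labels_true = ["sports", "sports", "sports", "finance", "finance", "finance"]
labels_pred = [0, 0, 1, 1, 1, 1]
per_class, overall_f = cluster_scores(labels_true, labels_pred)
for category, (p, r, f) in per_class.items():
    print(f"{category}: precision={p:.3f} recall={r:.3f} F={f:.3f}")
print(f"overall F={overall_f:.3f}")
```

Weighting each category's best F value by category size keeps large categories from being masked by small ones; other aggregation schemes (for example a plain macro average) would be equally valid under different assumptions.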

