New Technology of Library and Information Service  2008, Vol. 24 Issue (12): 43-47    DOI: 10.11925/infotech.1003-3513.2008.12.08
Text Clustering Research on the Max Term Contribution Dimension Reduction and Simulated Annealing Algorithm
Lu Guoli  Wang Xiaohua  Wang Rongbo
(Computer Application Technology Laboratory of Hangzhou Dianzi University, Hangzhou 310018, China)
This paper presents a new algorithm for text feature extraction and dimension reduction based on the Max Term Contribution (MTC). Its main idea is to compute the contribution of each term in the high-dimensional document base and to use a search algorithm to extract the terms with the maximum contribution, which are then used to construct a low-dimensional document base. A modified K-means clustering method based on Simulated Annealing (SA) is then presented to cluster the low-dimensional document data obtained by MTC. Experiments show that the new method can improve clustering precision.
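
The two stages outlined in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's formulas, its search algorithm, or its annealing schedule, so everything below is an assumption: term contribution is taken in its commonly used form (the sum, over pairs of distinct documents, of the products of a term's weights), the "search" is a plain top-k ranking, and the clustering step is a simple simulated annealing over cluster assignments that minimizes the K-means objective, not the authors' modified K-means. The names `term_contribution`, `reduce_dimensions`, `sse`, and `sa_cluster`, and the parameters `t0`, `cooling`, and `steps`, are hypothetical.

```python
import math
import random

def term_contribution(weights):
    """Assumed definition: TC(t) = sum over document pairs i != j of w[i][t] * w[j][t]."""
    scores = []
    for t in range(len(weights[0])):
        column = [row[t] for row in weights]
        total = sum(column)
        # (sum w)^2 - sum w^2 equals the sum over all ordered pairs i != j
        scores.append(total * total - sum(w * w for w in column))
    return scores

def reduce_dimensions(weights, k):
    """Keep the k terms with the largest contribution, in original column order."""
    scores = term_contribution(weights)
    ranked = sorted(range(len(scores)), key=lambda t: scores[t], reverse=True)
    keep = sorted(ranked[:k])
    return [[row[t] for t in keep] for row in weights], keep

def sse(points, labels, n_clusters):
    """K-means objective: squared distance of each point to its cluster centroid."""
    total = 0.0
    for c in range(n_clusters):
        members = [p for p, lab in zip(points, labels) if lab == c]
        if not members:
            continue
        dim = len(members[0])
        centroid = [sum(p[d] for p in members) / len(members) for d in range(dim)]
        total += sum((p[d] - centroid[d]) ** 2 for p in members for d in range(dim))
    return total

def sa_cluster(points, n_clusters, t0=1.0, cooling=0.95, steps=2000, seed=0):
    """Simulated annealing over assignments (an illustrative stand-in, not the paper's method)."""
    rng = random.Random(seed)
    labels = [rng.randrange(n_clusters) for _ in points]
    cost = sse(points, labels, n_clusters)
    best_labels, best_cost = labels[:], cost
    temp = t0
    for _ in range(steps):
        # Propose reassigning one randomly chosen document to a random cluster
        candidate = labels[:]
        candidate[rng.randrange(len(points))] = rng.randrange(n_clusters)
        cand_cost = sse(points, candidate, n_clusters)
        delta = cand_cost - cost
        # Metropolis rule: always accept improvements, sometimes accept worse moves
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            labels, cost = candidate, cand_cost
            if cost < best_cost:
                best_labels, best_cost = labels[:], cost
        temp = max(temp * cooling, 1e-12)  # geometric cooling with a small floor
    return best_labels, best_cost

# Toy usage: rows are documents, columns are term weights (e.g. tf-idf)
tfidf = [[1.0, 0.0, 2.0], [1.0, 0.0, 0.0], [0.0, 3.0, 2.0]]
reduced, kept = reduce_dimensions(tfidf, k=2)
labels, cost = sa_cluster(reduced, n_clusters=2)
```

Accepting occasional worse moves at high temperature is what lets an annealing-based variant escape the poor local optima that plain K-means can settle into; how the paper combines this with K-means updates is not stated in the abstract.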

Key words: Text clustering; Max term contribution; Character extraction; Simulated annealing
Received: 02 September 2008      Published: 25 December 2008


Corresponding author: Lu Guoli
About authors: Lu Guoli, Wang Xiaohua, Wang Rongbo

Cite this article:

Lu Guoli, Wang Xiaohua, Wang Rongbo. Text Clustering Research on the Max Term Contribution Dimension Reduction and Simulated Annealing Algorithm. New Technology of Library and Information Service, 2008, 24(12): 43-47.

