Algorithm and Experiment Research of Textual Document Clustering Based on Improved K-means
Cen Yonghua 1,2, Wang Xiaorong 2, Ji Yonghui 1
1(Department of Information Management, Nanjing University, Nanjing 210093, China) 2(Department of Information Management, Nanjing University of Science & Technology, Nanjing 210094, China)
After a concise introduction to the connotation, functions, and general process of textual document clustering, this paper explains the basic mechanism of an improved K-means clustering method in which the initial centroids are selected by the minimum-maximum principle, designs the algorithm, implements the clustering system, and conducts several experiments on 300 academic articles and their related characteristic words. The experiments demonstrate the good performance of the proposed algorithm.
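The abstract does not spell out the algorithm, so the following is only a minimal sketch of one common reading of minimum-maximum initial centroid selection: each new centroid is the point whose distance to its nearest already-chosen centroid is largest, and standard K-means iterations follow. The function names `maxmin_init` and `kmeans`, the use of Euclidean distance, and the choice of the first point as the first centroid are all assumptions made for illustration, not details taken from the paper.

```python
import math

def maxmin_init(points, k, first=0):
    """Choose k initial centroids by a max-min (farthest-first) rule.

    Assumption: the first centroid is simply points[first]; the paper
    may choose it differently.
    """
    chosen = [first]
    # Distance from each point to its nearest chosen centroid so far.
    min_dist = [math.dist(p, points[first]) for p in points]
    while len(chosen) < k:
        # Take the point with the largest "minimum distance".
        nxt = max(range(len(points)), key=lambda i: min_dist[i])
        chosen.append(nxt)
        min_dist = [min(d, math.dist(p, points[nxt]))
                    for d, p in zip(min_dist, points)]
    return [list(points[i]) for i in chosen]

def kmeans(points, k, n_iter=100):
    """Plain K-means seeded with maxmin_init (Euclidean distance assumed)."""
    centroids = maxmin_init(points, k)
    labels = []
    for _ in range(n_iter):
        # Assignment step: label each point with its nearest centroid.
        labels = [min(range(k), key=lambda j: math.dist(p, centroids[j]))
                  for p in points]
        # Update step: each centroid becomes the mean of its members.
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append(
                    [sum(c) / len(members) for c in zip(*members)])
            else:
                # Keep the old centroid if a cluster ends up empty.
                new_centroids.append(centroids[j])
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return labels, centroids

# Toy usage: six 2-D points standing in for document feature vectors.
docs = [[0, 0], [0, 1], [10, 10], [10, 11], [20, 0], [20, 1]]
labels, centroids = kmeans(docs, 3)
```

For real documents, the feature vectors would typically be term weights over the characteristic words, and a cosine-based distance might be a better fit than the Euclidean distance assumed here.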
岑咏华,王晓蓉,吉雍慧. 一种基于改进K-means的文档聚类算法的实现研究[J]. 现代图书情报技术, 2008, 24(12): 73-79.
Cen Yonghua, Wang Xiaorong, Ji Yonghui. Algorithm and Experiment Research of Textual Document Clustering Based on Improved K-means. New Technology of Library and Information Service, 2008, 24(12): 73-79.