Application of Improved Information Gain Feature Selection Method to Text Clustering
Chen Tao1, Song Yan2, Xie Yangqun1
1(Department of Management Science and Engineering, Ningbo, Zhejiang 315211, China)
2(Department of Business Administration, Nanjing, Jiangsu 210093, China)
This paper applies an improved information gain method to text clustering. A sample of 250 texts is retrieved from the corpus, text feature vectors are constructed using the Vector Space Model and the information gain feature selection method, and C-means is used to cluster the texts automatically. The resulting precision, recall, and F-measure are 0.82, 0.88, and 0.83, respectively.
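As a rough illustration of the feature selection step, the sketch below computes the standard (unimproved) information gain of a term, treating term presence as a binary feature. It is not the authors' improved method, which the abstract does not describe. The function names, the toy documents, and the class labels are all assumptions made for this example; in particular, standard information gain requires labeled texts, and how the paper obtains or replaces such labels in a clustering setting is not stated here.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(docs, labels, term):
    """Standard information gain of a binary term-presence feature.

    docs   -- list of sets of tokens (hypothetical toy representation)
    labels -- list of class labels, one per document (assumed available)
    """
    with_term = [y for d, y in zip(docs, labels) if term in d]
    without_term = [y for d, y in zip(docs, labels) if term not in d]
    n = len(labels)
    # Entropy of the classes after splitting on presence/absence of the term.
    conditional = ((len(with_term) / n) * entropy(with_term)
                   + (len(without_term) / n) * entropy(without_term))
    return entropy(labels) - conditional


def select_features(docs, labels, k):
    """Return the k terms with the highest information gain."""
    vocab = set().union(*docs)
    ranked = sorted(vocab,
                    key=lambda t: information_gain(docs, labels, t),
                    reverse=True)
    return ranked[:k]


# Toy data, invented for illustration only.
docs = [{"stock", "market"}, {"stock", "price"},
        {"match", "goal"}, {"goal", "team"}]
labels = ["finance", "finance", "sport", "sport"]

top_terms = select_features(docs, labels, 2)
```

On this toy data, "stock" and "goal" each separate the two classes perfectly, so they are the two terms selected. The retained terms would then define the dimensions of the Vector Space Model vectors that are passed to the clustering step.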
陈涛, 宋妍, 谢阳群. 改进的信息增益特征选择方法在文本聚类中的应用[J]. 现代图书情报技术, 2004, 20(12): 7-9.
Chen Tao, Song Yan, Xie Yangqun. Application of Improved Information Gain Feature Selection Method to Text Clustering[J]. New Technology of Library and Information Service, 2004, 20(12): 7-9.