New Technology of Library and Information Service  2004, Vol. 20 Issue (12): 7-9    DOI: 10.11925/infotech.1003-3513.2004.12.02
Application of Improved Information Gain Feature Selection Method to Text Clustering
Chen Tao1   Song Yan2   Xie Yangqun1
1(Department of Management Science and Engineering, Ningbo, Zhejiang 315211, China)
2(Department of Business Administration, Nanjing, Jiangsu 210093, China)
This paper applies an improved information gain method to text clustering. A sample of 250 texts is retrieved from the corpus, and text feature vectors are constructed using the Vector Space Model and the information gain feature selection method. C-means is then used for automatic clustering; the precision, recall and F-measure are 0.82, 0.88 and 0.83, respectively.
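
The feature selection step named in the abstract can be illustrated with a minimal sketch. The abstract does not say how the method was improved, so the code below shows only the standard information gain criterion, not the authors' variant. It assumes each text is a set of tokens and that a class label is available for each text; the function names (`entropy`, `information_gain`, `select_features`, `to_vector`) and the binary term-presence weighting are hypothetical choices made for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(docs, labels, term):
    """Standard IG(t) = H(C) - P(t) H(C|t) - P(not t) H(C|not t).

    Assumption: each document is a set of tokens, so only the presence
    or absence of the term is considered.
    """
    with_term = [c for d, c in zip(docs, labels) if term in d]
    without_term = [c for d, c in zip(docs, labels) if term not in d]
    n = len(labels)
    conditional = ((len(with_term) / n) * entropy(with_term)
                   + (len(without_term) / n) * entropy(without_term))
    return entropy(labels) - conditional


def select_features(docs, labels, k):
    """Return the k terms with the highest information gain."""
    vocab = sorted(set().union(*docs))
    ranked = sorted(vocab,
                    key=lambda t: information_gain(docs, labels, t),
                    reverse=True)
    return ranked[:k]


def to_vector(doc, features):
    """Binary Vector Space Model representation over the selected terms."""
    return [1 if t in doc else 0 for t in features]
```

The vectors returned by `to_vector` are the kind of input a clustering algorithm such as C-means would consume; the clustering step itself is not sketched here.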

Key words: Information gain; Feature selection; Clustering
Received: 07 July 2004      Published: 25 December 2004



Corresponding Author: Xie Yangqun
About authors: Chen Tao, Song Yan, Xie Yangqun

Cite this article:

Chen Tao, Song Yan, Xie Yangqun. Application of Improved Information Gain Feature Selection Method to Text Clustering. New Technology of Library and Information Service, 2004, 20(12): 7-9.

