New Technology of Library and Information Service  2013, Vol. Issue (5): 34-39    DOI: 10.11925/infotech.1003-3513.2013.05.04
A Feature Selection Based on Consideration of Multiple Factors
Lu Yonghe, Li Yanfeng
School of Information Management, Sun Yat-Sen University, Guangzhou 510006, China
Abstract  In feature selection, a term's weight determines whether the term is chosen as a feature. That weight is affected by many factors, chiefly the term's importance, its characteristics, and its representativeness. Taking these factors into account, this paper proposes a new function, TW (Term Weight), based on the importance of a feature and its ability to distinguish categories, as an improved feature selection method. Experiments comparing the CHI, IG, and TW values of terms show that TW increases the weight of features that are specific to a class and decreases the weight of unimportant features. Finally, classification experiments on a Chinese classification corpus with three classifiers confirm the validity of the new algorithm for feature selection.
Key words  Text categorization      Feature selection      Class discrimination      TF-IDF
Received: 16 April 2013      Published: 03 July 2013
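For readers unfamiliar with the two baselines named in the abstract, the following is a minimal sketch of the standard textbook forms of CHI (the chi-square statistic of a term with respect to a class) and IG (information gain of a term over all classes), computed from document counts. It is not the authors' code, and it does not implement TW, whose formula is not given in the abstract. The function names, the representation of documents as token sets, and the toy corpus are assumptions made for illustration only.

```python
import math
from collections import Counter


def chi_square(docs, labels, term, cls):
    """Chi-square statistic of `term` with respect to class `cls`.

    docs   : list of token sets, one per document (assumed representation)
    labels : list of class labels, aligned with docs
    """
    a = b = c = d = 0
    for tokens, label in zip(docs, labels):
        has_term = term in tokens
        if has_term and label == cls:
            a += 1  # in class, contains term
        elif has_term:
            b += 1  # not in class, contains term
        elif label == cls:
            c += 1  # in class, lacks term
        else:
            d += 1  # not in class, lacks term
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    if denom == 0:
        return 0.0  # degenerate table: term or class covers all or no documents
    return n * (a * d - c * b) ** 2 / denom


def entropy(counts):
    """Shannon entropy (base 2) of a collection of class counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((k / total) * math.log2(k / total) for k in counts if k > 0)


def information_gain(docs, labels, term):
    """Reduction in class entropy from knowing whether `term` is present."""
    with_term = Counter(l for toks, l in zip(docs, labels) if term in toks)
    without_term = Counter(l for toks, l in zip(docs, labels) if term not in toks)
    n = len(labels)
    n_with = sum(with_term.values())
    n_without = n - n_with
    prior = entropy(Counter(labels).values())
    conditional = (n_with / n) * entropy(with_term.values()) \
        + (n_without / n) * entropy(without_term.values())
    return prior - conditional


# Toy corpus (hypothetical data, for illustration only).
docs = [{"goal", "match"}, {"goal", "team"}, {"stock", "market"}, {"stock", "team"}]
labels = ["sports", "sports", "finance", "finance"]

for term in ("goal", "team"):
    print(term, chi_square(docs, labels, term, "sports"), information_gain(docs, labels, term))
```

In this toy corpus, "goal" occurs only in the sports documents, so both scores rank it above "team", which is spread evenly across the two classes. Feature selection with either baseline would then keep the top-scoring terms.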
Cite this article:

Lu Yonghe, Li Yanfeng. A Feature Selection Based on Consideration of Multiple Factors. New Technology of Library and Information Service, 2013, (5): 34-39.

