New Technology of Library and Information Service  2013, Vol. 29 Issue (10): 27-30    DOI: 10.11925/infotech.1003-3513.2013.10.05
Improved TF-IDF Method in Text Classification
Qin Shian, Li Fayun
School of Public Administration and Policy, Fuzhou University, Fuzhou 350108, China
Abstract  When one class is much larger than another, the IDF term in TF-IDF produces results that run contrary to its design intent. This paper addresses the problem by modifying the TF-IDF algorithm with a probability-based formulation. Experiments show that the modified method classifies webpage text effectively by simply summing the values of characteristic words, and that it is both faster and more accurate.
Key words: Probability; TF-IDF; Webpage; Text classification
Received: 17 June 2013      Published: 04 November 2013
CLC Number: TP391

Qin Shian, Li Fayun. Improved TF-IDF Method in Text Classification. New Technology of Library and Information Service, 2013, 29(10): 27-30.
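
The class-imbalance issue described in the abstract can be illustrated with a minimal Python sketch. It uses the standard IDF definition, log(N / df), on a hypothetical toy corpus, and then shows a classifier that sums the values of characteristic words per class. The corpus, the per-class weights, and the function names `idf` and `classify` are assumptions made for illustration only. The paper's probability-based weighting formula is not given in the abstract and is not reproduced here.

```python
import math

def idf(term, docs):
    """Standard IDF: log(N / df), where df is the number of documents containing term."""
    df = sum(1 for doc in docs if term in doc)
    return math.log(len(docs) / df) if df else 0.0

# Hypothetical toy corpus: 9 "sports" pages and 1 "finance" page (imbalanced classes).
sports = [{"match", "team"} for _ in range(9)]
finance = [{"stock", "market"}]
corpus = sports + finance

# "match" occurs in every sports page and nowhere else, so it is a strong class
# indicator, yet standard IDF gives it a low weight because the class is large.
idf_match = idf("match", corpus)  # log(10 / 9)
idf_stock = idf("stock", corpus)  # log(10 / 1)

def classify(tokens, class_weights):
    """Sum the weights of each class's characteristic words found in tokens
    and return the highest-scoring class together with all scores."""
    scores = {
        label: sum(w for term, w in weights.items() if term in tokens)
        for label, weights in class_weights.items()
    }
    return max(scores, key=scores.get), scores

# Hypothetical per-class word weights (placeholders, not values from the paper).
weights = {
    "sports": {"match": 1.0, "team": 0.8},
    "finance": {"stock": 1.0, "market": 0.9},
}
label, scores = classify({"team", "match", "news"}, weights)
```

In this sketch the word that perfectly identifies the large class receives a much smaller IDF than the word from the small class, which is the kind of behavior the abstract describes as contrary to the design intent of IDF. The summation classifier only shows the scoring step; how the weights are derived is left open.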

