New Technology of Library and Information Service  2014, Vol. 30 Issue (3): 80-87    DOI: 10.11925/infotech.1003-3513.2014.03.12
The Application of Machine-Learning in the Research on Automatic Categorization of Chinese Periodical Articles
Wang Hao, Ye Peng, Deng Sanhong
School of Information Management, Nanjing University, Nanjing 210093, China
[Objective] To show that, under a machine-learning framework, feature weighting combined with shallow-hierarchical classification can effectively assign Chinese Library Classification (CLC) categories to periodical articles. [Context] Manual classification is reaching its limits in the era of "Big Data", and as periodicals move to electronic form, automatic classification techniques can relieve the workload of manual classification. [Methods] This paper applies machine learning to the automatic classification of periodical articles. It compares the performance of the Support Vector Machine (SVM) and the BP Neural Network (BPNN) on this task, converts the CLC into a three-level classification scheme following the idea of hierarchical classification, and assigns feature weights according to the source of each feature. [Results] The classification experiments show that SVM is better suited than BPNN to large-scale sparse data; the accuracy rates at the three levels reach 95.05%, 92.89% and 89.02%, with an integrated accuracy rate close to 80%; and feature weights drawn from multiple sources yield better classification results than weights from a single source. [Conclusions] The study shows that a machine-learning model with feature weighting and shallow-hierarchical classification is feasible, sound and effective for the automatic classification of periodical articles, and it offers a new approach to the task.
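The source-based feature weighting named in the Methods can be illustrated with a minimal sketch. The abstract does not say which sources are used or what weights they receive, so the source names (title, keywords, abstract), the weight values, and the function name `weighted_features` below are assumptions made only for illustration; they are not the authors' settings.

```python
from collections import Counter

# Hypothetical source weights; the abstract does not report the actual values.
SOURCE_WEIGHTS = {"title": 3.0, "keywords": 2.0, "abstract": 1.0}


def weighted_features(record, weights=SOURCE_WEIGHTS):
    """Build a sparse {term: weight} vector for one article record.

    `record` maps a source name to a list of already-segmented terms.
    Every occurrence of a term adds the weight of the source it came from,
    so a term seen in the title counts more than one seen in the abstract.
    """
    vector = Counter()
    for source, terms in record.items():
        w = weights.get(source, 0.0)  # unknown sources contribute nothing
        for term in terms:
            vector[term] += w
    return dict(vector)


# Toy article record (illustrative data, not taken from the paper).
article = {
    "title": ["automatic", "classification"],
    "keywords": ["classification", "svm"],
    "abstract": ["svm", "periodical", "classification"],
}
features = weighted_features(article)
```

A vector built this way could then be passed to a classifier such as an SVM; a single-source baseline corresponds to keeping only one key in the record.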

Key words: Machine-Learning; Periodical article; Automatic text categorization; Feature weighting; Hierarchy classification
Received: 02 September 2013      Published: 15 April 2014
CLC number: TP391

Cite this article:

Wang Hao, Ye Peng, Deng Sanhong. The Application of Machine-Learning in the Research on Automatic Categorization of Chinese Periodical Articles. New Technology of Library and Information Service, 2014, 30(3): 80-87.



