Data Analysis and Knowledge Discovery  0, Vol. Issue (): 1-    DOI: 10.11925/infotech.2096-3467.2020.0238
Automatic Classification based on Multi-factor Algorithm
Li Jiao, Huang Yongwen, Luo Tingting, Zhao Ruixue, Xian Guojian
(Agricultural Information Institute of CAAS, Beijing 100081, China)
(Key Laboratory of Agricultural Big Data, Ministry of Agriculture and Rural Affairs, Beijing 100081, China)
[Objective] This paper develops a labor-saving, widely applicable method for automatic classification indexing, aiming to support the classified management of massive information resources and to reveal their subject areas.

[Methods] By analyzing the correspondence between classification numbers and the terms, concepts, and other keywords that express subject concepts in the literature, we designed a multi-factor weighted algorithm and proposed a full-process automatic classification indexing scheme.
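
The abstract does not give the factors, weights, or mapping tables, so the following is only a minimal sketch of what a multi-factor weighted vote over classification numbers could look like. The factor names, the weights in `FACTOR_WEIGHTS`, the entries in `MAPPING`, the placeholder codes `C1` to `C3`, and the function `classify` are all assumptions made for illustration, not the authors' implementation.

```python
from collections import defaultdict

# Hypothetical factor weights (not taken from the paper).
FACTOR_WEIGHTS = {"term": 0.5, "concept": 0.3, "keyword": 0.2}

# Hypothetical correspondence between (factor, subject expression) and
# classification numbers. "C1".."C3" are placeholder codes.
MAPPING = {
    ("term", "wheat"): ["C1"],
    ("concept", "crop breeding"): ["C2", "C1"],
    ("keyword", "neural network"): ["C3"],
}


def classify(features, top_k=1):
    """Return the top_k classification numbers by summed factor weight."""
    scores = defaultdict(float)
    for factor, value in features:
        # Each matched expression votes for its codes with the factor's weight.
        for code in MAPPING.get((factor, value), []):
            scores[code] += FACTOR_WEIGHTS[factor]
    # Sort by descending score, breaking ties by code for determinism.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [code for code, _ in ranked[:top_k]]


doc_features = [
    ("term", "wheat"),
    ("concept", "crop breeding"),
    ("keyword", "neural network"),
]
top_two = classify(doc_features, top_k=2)
```

Under these assumed weights, `C1` collects votes from two factors and ranks first. Setting `top_k=2` mirrors the case of literature carrying two subject classification numbers.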

[Results] Experiments on authoritative multi-domain annotated corpora and standards sets show that, for literature with a single subject classification number, the precision, recall, and F value were 84.1%, 79.8%, and 81.9%, respectively. For literature with two subject classification numbers, the precision, recall, and F value were 83.4%, 78.8%, and 81.0%, respectively.
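
The abstract does not define the F value. Assuming it is the balanced F-measure (the harmonic mean of precision and recall), the reported figures can be checked with a short sketch. The function name `f_measure` is an illustrative choice.

```python
def f_measure(precision, recall):
    """Balanced F-measure: harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract.
single = f_measure(0.841, 0.798)  # single classification number
double = f_measure(0.834, 0.788)  # two classification numbers

print(f"{single:.1%}, {double:.1%}")  # prints "81.9%, 81.0%"
```

Both results agree with the reported F values to one decimal place, which is consistent with the balanced F-measure assumption.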

[Limitations] The accuracy and completeness of subject classification indexing depend on high-quality annotated corpora, and the indexing of interdisciplinary literature needs to be improved.

[Conclusions] The proposed automatic classification indexing method based on the multi-factor algorithm is highly operable and has practical application value.

Key words: Automatic Classification; Subject Classification; Multi-factor Algorithm
Published: 02 September 2020
ZTFLH:  TP393,G250  

Cite this article:

Li Jiao, Huang Yongwen, Luo Tingting, Zhao Ruixue, Xian Guojian. Automatic Classification based on Multi-factor Algorithm . Data Analysis and Knowledge Discovery, 0, (): 1-.


