Automatic Classification based on Multi-factor Algorithm
Li Jiao, Huang Yongwen, Luo Tingting, Zhao Ruixue, Xian Guojian
(Agricultural Information Institute of CAAS, Beijing 100081, China)
(Key Laboratory of Agricultural Big Data, Ministry of Agriculture and Rural Affairs, Beijing 100081, China)
Abstract
[Objective] This paper develops a labor-saving, widely applicable method for automatic classification indexing, aiming to support the classified management of massive information resources and the disclosure of their subject areas.
[Methods] By analyzing the correspondence between classification numbers and the terms, concepts, and other keywords that represent subject concepts in the literature, we designed a multi-factor weighted algorithm and proposed a full-process scheme for automatic classification indexing.
[Results] Experiments on authoritative multi-domain annotated corpora and standards sets show that, for literature with a single subject classification number, the precision, recall, and F values were 84.1%, 79.8%, and 81.9%, respectively. For literature with two subject classification numbers, the precision, recall, and F values were 83.4%, 78.8%, and 81%, respectively.
[Limitations] The accuracy and completeness of subject classification indexing depend on high-quality annotated corpora, and the indexing of interdisciplinary literature needs to be improved.
[Conclusions] The proposed automatic classification indexing method, based on a multi-factor algorithm, is highly operable and has practical application value.
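
The abstract gives no formulas, so the following is only a minimal sketch of what a multi-factor weighted scoring step could look like, not the authors' implementation. The factor names, the weights, the association table, and the class labels C1 to C3 are all hypothetical placeholders. The F value is assumed here to be the balanced harmonic mean of precision and recall, an assumption that is consistent with the figures reported in the abstract.

```python
from collections import defaultdict

# Hypothetical factor weights; the abstract does not state the real factors or weights.
FACTOR_WEIGHTS = {"term": 0.5, "concept": 0.3, "keyword": 0.2}

# Hypothetical association table, as might be derived from an annotated corpus:
# (factor, expression) -> {classification number: association strength}.
ASSOCIATIONS = {
    ("term", "wheat"): {"C1": 0.9, "C2": 0.1},
    ("concept", "crop disease"): {"C2": 0.8, "C1": 0.2},
    ("keyword", "yield"): {"C1": 0.6, "C3": 0.4},
}


def score_classes(features):
    """Sum weight * association strength per classification number."""
    scores = defaultdict(float)
    for factor, expression in features:
        weight = FACTOR_WEIGHTS.get(factor, 0.0)
        strengths = ASSOCIATIONS.get((factor, expression), {})
        for class_number, strength in strengths.items():
            scores[class_number] += weight * strength
    return dict(scores)


def top_classes(features, n=1):
    """Return the n highest-scoring classification numbers."""
    scores = score_classes(features)
    return sorted(scores, key=scores.get, reverse=True)[:n]


def f_value(precision, recall):
    """Balanced F measure (assumed): harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Features extracted from one hypothetical document.
features = [("term", "wheat"), ("concept", "crop disease"), ("keyword", "yield")]
single_label = top_classes(features, n=1)  # one classification number
double_label = top_classes(features, n=2)  # two classification numbers
```

Under these assumptions, `f_value(0.841, 0.798)` rounds to 81.9% and `f_value(0.834, 0.788)` rounds to 81.0%, matching the reported single-number and two-number results.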
Published: 02 September 2020