Data Analysis and Knowledge Discovery  0, Vol. Issue (): 1-    DOI: 10.11925/infotech.2096-3467. 2020.0238
Automatic Classification based on Multi-factor Algorithm
Li Jiao, Huang Yongwen, Luo Tingting, Zhao Ruixue, Xian Guojian
(Agricultural Information Institute of CAAS, Beijing 100081, China)
(Key Laboratory of Agricultural Big Data, Ministry of Agriculture and Rural Affairs, Beijing 100081, China)
Abstract  

[Objective] This paper develops a labor-saving, widely applicable method for automatic classification indexing, intended to support the classified management of massive information resources and to reveal their subject areas.

[Methods] We analyzed the correspondence between classification numbers and the terms, concepts, and other keywords that express subject concepts in the literature. Based on this analysis, we designed a multi-factor weighted algorithm and proposed a full-process scheme for automatic classification indexing.
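
The abstract does not specify the factors, their weights, or the scoring rule, so the following is only a minimal sketch of what a multi-factor weighted score over classification numbers could look like. The factor names, the weights in `FACTOR_WEIGHTS`, the `CORRESPONDENCE` table, and the classification numbers in it are all hypothetical placeholders, not the authors' values.

```python
from collections import defaultdict

# Hypothetical weights per factor type (assumption: not given in the abstract).
FACTOR_WEIGHTS = {"term": 0.5, "concept": 0.3, "keyword": 0.2}

# Hypothetical correspondence table: (factor, expression) -> {class number: strength}.
# In practice such a table would be derived from an annotated corpus.
CORRESPONDENCE = {
    ("term", "wheat"): {"S512": 1.0},
    ("concept", "crop disease"): {"S435": 0.8, "S512": 0.2},
    ("keyword", "neural network"): {"TP183": 1.0},
}


def score_classes(features, weights=FACTOR_WEIGHTS, table=CORRESPONDENCE):
    """Sum weight * strength over every factor that votes for a class number."""
    scores = defaultdict(float)
    for factor, expression in features:
        # Expressions missing from the table contribute nothing.
        for class_no, strength in table.get((factor, expression), {}).items():
            scores[class_no] += weights[factor] * strength
    return dict(scores)


def top_classes(features, k=1):
    """Return the k highest-scoring class numbers (ties broken alphabetically)."""
    scores = score_classes(features)
    return sorted(scores, key=lambda c: (-scores[c], c))[:k]


doc = [
    ("term", "wheat"),
    ("concept", "crop disease"),
    ("keyword", "neural network"),
]
print(top_classes(doc, k=1))
print(top_classes(doc, k=2))
```

Choosing `k=1` or `k=2` mirrors the single-number and two-number settings reported in the results; how the paper actually decides the number of classification numbers per document is not stated in the abstract.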

[Results] Experiments on authoritative, multi-domain annotated corpora and standards sets show that, for literature with a single subject classification number, the precision, recall, and F value were 84.1%, 79.8%, and 81.9%, respectively. For literature with two subject classification numbers, they were 83.4%, 78.8%, and 81%, respectively.
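
The abstract does not say which F measure is used. Assuming it is the balanced F1 score, F = 2PR / (P + R), the reported values can be checked from the stated precision and recall. The function name `f_measure` is introduced here for illustration only.

```python
def f_measure(precision, recall):
    """Balanced F1 score: harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract, in percent.
single = f_measure(84.1, 79.8)  # one subject classification number
double = f_measure(83.4, 78.8)  # two subject classification numbers

print(round(single, 1), round(double, 1))  # prints: 81.9 81.0
```

Under that assumption both reported F values (81.9% and 81%) are consistent with the stated precision and recall.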

[Limitations] The accuracy and completeness of subject classification indexing depend on high-quality annotated corpora, and the indexing of interdisciplinary literature needs improvement.

[Conclusions] The proposed automatic classification indexing method based on a multi-factor algorithm is highly operable and has practical application value.

Key words: Automatic Classification; Subject Classification; Multi-factor Algorithm
Published: 02 September 2020
ZTFLH:  TP393,G250  

Cite this article:

Li Jiao, Huang Yongwen, Luo Tingting, Zhao Ruixue, Xian Guojian. Automatic Classification based on Multi-factor Algorithm. Data Analysis and Knowledge Discovery, 0, (): 1-.

URL:

http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.2096-3467. 2020.0238     OR     http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y0/V/I/1
