Identifying Interdisciplinary Sci-Tech Literature Based on Multi-Label Classification
Wang Weijun1,2, Ning Zhiyuan1,2, Du Yi1,2, Zhou Yuanchun1,2
1Computer Network Information Center, Chinese Academy of Sciences, Beijing 100190, China
2University of Chinese Academy of Sciences, Beijing 100049, China
[Objective] This paper identifies interdisciplinary sci-tech literature in order to find emerging interdisciplinary issues. [Methods] We combined the discipline labels that specialists assigned to sci-tech literature with the labels predicted by text classification algorithms to find interdisciplinary studies. [Results] The F1 value of the proposed method reached 0.45, which was 0.22 higher than that of the model-based predictions. [Limitations] The model had low recall when identifying interdisciplinary sci-tech research. [Conclusions] The proposed method effectively addresses the classification of interdisciplinary sci-tech literature, a problem that merits further study.
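The abstract does not spell out how the specialist labels and the predicted labels are combined, so the following is only a minimal sketch under stated assumptions: labels are merged by set union, a predicted label counts when its score reaches a hypothetical threshold of 0.5, and a document is flagged as interdisciplinary when the merged set spans at least two disciplines. The function names, the threshold, and the toy data are all illustrative and are not taken from the paper.

```python
from typing import Dict, List, Set


def combine_labels(expert: Set[str], predicted: Dict[str, float],
                   threshold: float = 0.5) -> Set[str]:
    """Merge expert labels with predicted labels scoring at or above threshold.

    Assumption: the merge is a plain set union; the paper may use another rule.
    """
    return set(expert) | {d for d, score in predicted.items() if score >= threshold}


def is_interdisciplinary(labels: Set[str]) -> bool:
    """Assumption: two or more distinct disciplines means interdisciplinary."""
    return len(labels) >= 2


def f1_score(y_true: List[bool], y_pred: List[bool]) -> float:
    """Binary F1 for the interdisciplinary class, computed from raw counts."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy data (hypothetical): expert labels, classifier scores, gold annotation.
docs = [
    ({"computer science"}, {"computer science": 0.9, "biology": 0.7}, True),
    ({"physics"}, {"physics": 0.8, "chemistry": 0.2}, False),
    ({"chemistry"}, {"chemistry": 0.6, "materials science": 0.4}, True),
    ({"mathematics"}, {"mathematics": 0.9, "economics": 0.55}, False),
]

y_true = [gold for _, _, gold in docs]
y_pred = [is_interdisciplinary(combine_labels(expert, scores))
          for expert, scores, _ in docs]
f1 = f1_score(y_true, y_pred)
print(y_pred, round(f1, 2))
```

In this toy run the third document is missed because its second discipline scores below the threshold, which lowers recall; this illustrates the kind of limitation the abstract reports without reproducing the paper's figures.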
Wang Weijun, Ning Zhiyuan, Du Yi, Zhou Yuanchun. Identifying Interdisciplinary Sci-Tech Literature Based on Multi-Label Classification[J]. Data Analysis and Knowledge Discovery, 2023, 7(1): 102-112.