1 School of Management and Economics, North China University of Water Resources and Electric Power, Zhengzhou 450046, China
2 School of Information Management, Wuhan University, Wuhan 430072, China
3 Center for Science, Technology & Education Assessment (CSTEA), Wuhan University, Wuhan 430072, China
4 Department of MSI & ECOOM, KU Leuven, Leuven B-3000, Belgium
[Objective] This paper identifies the discipline of each paper from its content, so that interdisciplinarity can be measured from the discipline classification of individual papers. [Methods] Using the Leuven-Budapest subject classification system, we applied machine learning, deep learning, and pre-trained language models to classify abstracts from 15 primary disciplines. We then used the improved SciBERT model to measure and analyze interdisciplinarity. [Results] The improved SciBERT model achieved the best automatic classification performance, with an average F1 score of 81.45%. Some individual categories achieved classification performance above 90%. Among the 15 primary disciplines, biomedical research had the highest interdisciplinary degree (0.38) and physics the lowest (0.08). [Limitations] This paper measures interdisciplinarity only from the perspective of text content and does not consider multi-dimensional measurement methods. [Conclusions] Pre-trained models perform best in automatically classifying journal articles, followed by deep learning models, while machine learning models perform worst. Measuring interdisciplinarity through content-based automatic classification extends the existing research framework and supports a deeper, multi-angle understanding of interdisciplinary research.
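The two quantities reported above, an average F1 score and a per-discipline interdisciplinary degree, can be illustrated with a minimal sketch. The abstract does not say whether the F1 average is macro or weighted, nor how the interdisciplinary degree is defined, so both choices below are assumptions: an unweighted (macro) F1, and the mean Simpson diversity of each paper's predicted discipline probabilities. The function names are hypothetical and this is not the authors' implementation.

```python
from collections import defaultdict


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (assumed averaging scheme)."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def simpson_diversity(probs):
    """1 - sum(p_i^2): 0 when all probability mass sits on one discipline."""
    return 1.0 - sum(p * p for p in probs)


def mean_diversity_by_discipline(labels, prob_rows):
    """Average per-paper diversity within each assigned discipline.

    Assumption: prob_rows holds one classifier probability vector per paper,
    and labels holds the discipline each paper is assigned to.
    """
    groups = defaultdict(list)
    for label, probs in zip(labels, prob_rows):
        groups[label].append(simpson_diversity(probs))
    return {label: sum(vals) / len(vals) for label, vals in groups.items()}
```

Under these assumptions, a discipline whose papers are classified with near-certainty scores close to 0, and a discipline whose papers spread probability across several categories scores higher.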