Identifying Structural Function of Scientific Literature Abstracts Based on Deep Active Learning
Mao Jin1,2, Chen Ziyang1,2
1 Center for Studies of Information Resources, Wuhan University, Wuhan 430072, China
2 School of Information Management, Wuhan University, Wuhan 430072, China
[Objective] This paper explores deep active learning (DeepAL) methods for identifying the structural function of scientific literature abstracts and compares their labeling costs. [Methods] First, we constructed a SciBERT-BiLSTM-CRF model for abstracts (SBCA) that utilizes the contextual sequence information between sentences. Then, we developed uncertainty-based active learning strategies at both the single-sentence and full-abstract levels. Finally, we conducted experiments on the PubMed 20K dataset. [Results] The SBCA model achieved the best recognition performance, increasing the F1 value by 11.93% over the SciBERT model without sequence information. With the Least Confidence strategy applied at the abstract level, the SBCA model reached its optimal F1 value using 60% of the experimental data; applied at the sentence level, it reached its optimal F1 value with 65% of the data. [Limitations] The proposed active learning strategies still need to be examined on datasets from more fields and in more languages. [Conclusions] The new model based on deep active learning can identify the structural function of scientific literature at a lower annotation cost.
Mao Jin, Chen Ziyang. Identifying Structural Function of Scientific Literature Abstracts Based on Deep Active Learning. Data Analysis and Knowledge Discovery, 2024, 8(6): 44-55.
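The Least Confidence strategy mentioned in the abstract selects for annotation the samples whose most probable predicted label has the lowest probability. The sketch below is an illustrative implementation only, not the authors' code; the function names, the toy probabilities, and the abstract-level scoring (averaging sentence uncertainty over an abstract) are our assumptions about how such a strategy is typically realized.

```python
import numpy as np

def least_confidence_scores(probs):
    # Least Confidence score: 1 - P(most probable class).
    # Higher score = model is less confident = more worth labeling.
    return 1.0 - np.asarray(probs).max(axis=-1)

def select_batch(probs, k):
    # Pick the k most uncertain samples (highest LC score).
    scores = least_confidence_scores(probs)
    return np.argsort(-scores)[:k]

def abstract_lc_score(sentence_probs):
    # Hypothetical abstract-level variant: average the per-sentence
    # LC scores so whole abstracts can be ranked for annotation.
    return float(least_confidence_scores(sentence_probs).mean())

# Toy example: 4 sentences, 3 rhetorical-move classes.
probs = np.array([
    [0.90, 0.05, 0.05],   # confident prediction
    [0.40, 0.35, 0.25],   # uncertain
    [0.50, 0.30, 0.20],   # fairly uncertain
    [0.98, 0.01, 0.01],   # very confident
])
picked = select_batch(probs, 2)   # indices of the 2 most uncertain sentences
```

Here `picked` contains the indices of the two least confidently classified sentences, which would be sent to a human annotator before retraining the model on the enlarged labeled set.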