Identifying Structural Function of Scientific Literature Abstracts Based on Deep Active Learning
Mao Jin1,2, Chen Ziyang1,2
1Center for Studies of Information Resources, Wuhan University, Wuhan 430072, China
2School of Information Management, Wuhan University, Wuhan 430072, China
Abstract [Objective] This paper explores how different deep active learning (DeepAL) methods perform in identifying the structural function of scientific literature abstracts, and what labeling costs they incur. [Methods] First, we constructed a SciBERT-BiLSTM-CRF model for abstracts (SBCA), which uses the contextual sequence information between sentences. Then, we developed uncertainty-based active learning strategies that operate on either single sentences or the full text of abstracts. Finally, we conducted experiments on the PubMed 20K dataset. [Results] The SBCA model showed the best recognition performance, increasing the F1 value by 11.93% over the SciBERT model without sequence information. With the abstract-based Least Confidence strategy, the SBCA model reached its optimal F1 value using 60% of the experimental data; with the sentence-based Least Confidence strategy, it reached its optimal F1 value using 65%. [Limitations] Different active learning strategies still need to be examined on datasets from more fields and in more languages. [Conclusions] The proposed model based on deep active learning can identify the structural function of scientific literature abstracts at a lower annotation cost.
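The two Least Confidence variants described in the abstract can be illustrated with a minimal sketch. It assumes a classifier that already returns per-sentence class probabilities; the function names, the toy pool, and in particular the use of the mean sentence score as the abstract-level uncertainty are assumptions made for illustration, not the aggregation the authors necessarily used.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical type: abstract id -> one probability vector per sentence.
Pool = Dict[str, List[Sequence[float]]]


def least_confidence(probs: Sequence[float]) -> float:
    """Uncertainty of one sentence: 1 minus the highest class probability."""
    return 1.0 - max(probs)


def abstract_uncertainty(sentence_probs: List[Sequence[float]]) -> float:
    """Assumed aggregation: mean least-confidence score over the sentences."""
    scores = [least_confidence(p) for p in sentence_probs]
    return sum(scores) / len(scores)


def select_abstracts(pool: Pool, k: int) -> List[str]:
    """Ids of the k most uncertain abstracts in the unlabeled pool."""
    ranked = sorted(pool, key=lambda aid: abstract_uncertainty(pool[aid]), reverse=True)
    return ranked[:k]


def select_sentences(pool: Pool, k: int) -> List[Tuple[str, int]]:
    """(abstract id, sentence index) pairs of the k most uncertain sentences."""
    scored = [
        ((aid, i), least_confidence(p))
        for aid, sents in pool.items()
        for i, p in enumerate(sents)
    ]
    scored.sort(key=lambda item: item[1], reverse=True)
    return [key for key, _ in scored[:k]]


# Toy unlabeled pool with made-up probabilities over three classes.
pool: Pool = {
    "a1": [[0.90, 0.05, 0.05], [0.80, 0.10, 0.10]],
    "a2": [[0.40, 0.35, 0.25], [0.50, 0.30, 0.20]],
    "a3": [[0.70, 0.20, 0.10], [0.34, 0.33, 0.33]],
}

print(select_abstracts(pool, 2))
print(select_sentences(pool, 2))
```

In an active learning loop, the selected items would be sent for annotation, added to the training set, and the model retrained before the next selection round. Selecting whole abstracts keeps each labeled sequence complete, which matters for a model that relies on the order of sentences.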
Received: 12 May 2023
Published: 08 January 2024
Fund: National Natural Science Foundation of China (72174154); Major Projects of Education Ministry’s Key Research Base for Humanities and Social Sciences (22JJD870005)
Corresponding Author: Mao Jin, ORCID: 0000-0001-9572-6709, E-mail: danveno@163.com