Data Analysis and Knowledge Discovery (数据分析与知识发现)  0, Vol. Issue (): 1-     https://doi.org/10.11925/infotech.2020.0364
Using Hierarchical Attention Network Model to Recognize Structure Functions of Academic Articles
Chenglei Qin (秦成磊), Chengzhi Zhang (章成志)
Department of Information Management, School of Economics and Management, Nanjing University of Science and Technology, Nanjing 210094
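Abstract (translated from the Chinese original)

[Objective] To automatically determine the functional type of sections in academic texts. [Methods] First, we build hierarchical attention network models at different granularities that capture section structure information. We compare traditional machine learning models using different text feature vectors, the BERT model, and the hierarchical attention network models on a standardized dataset drawn from four PLOS journals, in order to select the best model for recognizing the structure functions of academic text. Next, we apply the best model to sections of articles in Atmospheric Chemistry and Physics (ACP, IF 5.6) whose headings do not follow a standard naming convention and whose manually annotated structure functions show low agreement, and we propose evaluating the recognition results by the similarity of reference distributions and the similarity of verb cue-word distributions. Finally, we analyze the domain adaptability of the hierarchical attention network model. [Results] The sentence-level hierarchical attention network model with Bi-LSTM+Attention as the encoder outperforms the other models, with a Macro-F1 of 0.8661. The model also has a domain adaptation problem: recognition performance drops markedly in domains that differ substantially, with Macro-F1 falling as low as 0.4554. [Limitations] The model cannot recognize the function of sections with a mixed structure, and it does not take the logical relationships between article structures into account. [Conclusions] The sentence-level hierarchical attention network model recognizes the structure function of sections well, and introducing academic text structure information can enrich and extend the content and scope of research based on the full text of academic papers.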
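The sentence-level hierarchical attention idea can be illustrated with a minimal sketch: word vectors are pooled into sentence vectors by attention, and sentence vectors are pooled into a single section vector in the same way. This is not the authors' implementation. It assumes plain dot-product scoring against a context vector in place of the learned projection used in hierarchical attention networks, omits the Bi-LSTM encoder entirely, and the names `attention_pool`, `encode_section`, `word_context`, and `sentence_context` are hypothetical.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that sum to 1 (numerically stable)."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(vectors, context):
    """Score each vector by its dot product with `context`, then return
    the attention-weighted sum together with the weights.

    Assumption: dot-product scoring stands in for a learned scoring layer.
    """
    scores = [sum(v_i * c_i for v_i, c_i in zip(v, context)) for v in vectors]
    weights = softmax(scores)
    dim = len(vectors[0])
    pooled = [sum(w * v[d] for w, v in zip(weights, vectors)) for d in range(dim)]
    return pooled, weights


def encode_section(section, word_context, sentence_context):
    """Encode a section given as a list of sentences, each a list of word vectors.

    Level 1 pools words into one vector per sentence; level 2 pools the
    sentence vectors into one section vector. The sentence-level weights
    are returned so the most influential sentences can be inspected.
    """
    sentence_vectors = [attention_pool(words, word_context)[0] for words in section]
    return attention_pool(sentence_vectors, sentence_context)


# Toy section: two sentences with 2-dimensional word vectors.
toy_section = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[1.0, 1.0]],
]
section_vector, sentence_weights = encode_section(toy_section, [0.0, 0.0], [0.0, 0.0])
```

In a trained model the section vector would feed a classifier over the structure-function labels, and the context vectors would be learned parameters rather than fixed inputs.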

Keywords: structure function recognition of academic text; hierarchical attention network; IMRaD; domain adaptability analysis
Abstract

[Objective] To automatically recognize the function of each section of an academic text. [Methods] We construct hierarchical attention network (HAN) models at different granularities, using multiple deep learning models as encoders, to automatically identify the structure functions of academic text. We also analyze how traditional machine learning models with different text feature vectors and the BERT model perform on this task. We then use the similarity of reference distributions and the similarity of cue-word distributions to evaluate the model on real data. Finally, we analyze the domain adaptability of the HAN model. [Results] The sentence-level HAN model with Bi-LSTM+Attention as the encoder outperforms the other methods, with a Macro-F1 of 0.8661. Classification performance drops significantly in fields that differ greatly, with Macro-F1 reaching a minimum of 0.4554. [Limitations] The function of sections with a mixed structure cannot be recognized, and the logical relationships among article structures are not used in the HAN model. [Conclusions] The sentence-level HAN model recognizes structure function well, and incorporating academic text structure information can enrich and expand the content and scope of research based on the full text of academic papers.
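For readers unfamiliar with the metric reported above, Macro-F1 is the unweighted mean of the per-class F1 scores, so rare section types count as much as common ones. The sketch below computes it with the Python standard library. The label names and the toy predictions are illustrative assumptions and are not data from the paper.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over every label seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    f1_scores = []
    for label in labels:
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision + recall:
            f1_scores.append(2 * precision * recall / (precision + recall))
        else:
            # A class with no correct predictions contributes an F1 of 0.
            f1_scores.append(0.0)
    return sum(f1_scores) / len(f1_scores)


# Hypothetical section labels; one "Discussion" section is misread as "Results".
gold = ["Introduction", "Methods", "Results", "Discussion"]
predicted = ["Introduction", "Methods", "Results", "Results"]
score = macro_f1(gold, predicted)
```

Because every class carries equal weight, a single badly recognized section type lowers the score noticeably, which is one reason a drop such as the reported 0.8661 to 0.4554 can occur when a model is moved to a dissimilar domain.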

Key words: Function recognition of academic text structure; Hierarchical attention network; IMRaD; Domain adaptability analysis
Published: 2020-08-03
CLC number (ZTFLH): TP393, G250
Cite this article:
Chenglei Qin, Chengzhi Zhang. Using Hierarchical Attention Network Model to Recognize Structure Functions of Academic Articles. Data Analysis and Knowledge Discovery, 0, (): 1-.
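Link to this article:
http://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/10.11925/infotech.2020.0364      or      http://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/Y0/V/I/1
[1] Chenglei Qin, Chengzhi Zhang. Using Hierarchical Attention Network Model to Recognize Structure Functions of Academic Articles[J]. Data Analysis and Knowledge Discovery, 2020, 4(11): 26-42.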