Data Analysis and Knowledge Discovery, 2022, 6(8): 75-83     https://doi.org/10.11925/infotech.2096-3467.2021.1162
Research Article
Algorithm for Entity Coreference Resolution with Neural Network and Global Reasoning
Zhou Ning, Jin Gaoya, Shi Wenqian
School of Electronic and Information Engineering, Lanzhou Jiaotong University, Lanzhou 730070, China
Abstract

[Objective] This paper proposes an entity coreference resolution model that integrates a neural network with global reasoning. It addresses the complexity of entity information in text, as well as the ambiguity and sparse distribution of referential information. [Methods] First, we used a neural network model to extract entities and their antecedents from the documents. Then, we performed global reasoning over the sentence-level context. Finally, we fed the reasoning results back into the neural network model to improve the accuracy of entity coreference resolution. [Results] We evaluated the new model on the OntoNotes 5.0 dataset; its F1 score reached 74.76% under the CoNLL evaluation standard. [Limitations] More precise knowledge reasoning still needs to be incorporated. [Conclusions] Compared with existing models, the proposed algorithm improves coreference resolution performance and better captures the semantic information of the text.
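The method combines a local neural antecedent scorer with a document-level reasoning signal. As a hedged illustration only (the paper's exact combination rule is not reproduced on this page; the function names, the linear interpolation, and the zero "new entity" dummy score are assumptions, with λ = 0.4 taken from the best value in Table 5), the decision step might look like this:

```python
# Hedged sketch of the decision step: a local neural antecedent score
# is interpolated with a global-reasoning score. The function names,
# the linear interpolation, and the zero "new entity" dummy score are
# assumptions; lambda = 0.4 is the best threshold found in Table 5.

def combined_score(neural: float, global_: float, lam: float = 0.4) -> float:
    return (1 - lam) * neural + lam * global_

def best_antecedent(candidates, neural_scores, global_scores, lam=0.4):
    """Return the highest-scoring candidate antecedent, or None when no
    candidate beats the dummy score (the mention starts a new entity)."""
    best, best_score = None, 0.0
    for cand, ns, gs in zip(candidates, neural_scores, global_scores):
        score = combined_score(ns, gs, lam)
        if score > best_score:
            best, best_score = cand, score
    return best
```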

Keywords: Neural Network; Coreference Resolution; Entity Disambiguation; Global Reasoning
Received: 2021-10-14      Published: 2022-09-23
CLC Number: TP391
Funding: National Natural Science Foundation of China (61650207); Tianyou Innovation Team of Lanzhou Jiaotong University (TY202003)
Corresponding author: Zhou Ning, ORCID: 0000-0001-7466-8925, E-mail: zhouning@mail.lzjtu.cn
Cite this article:
Zhou Ning, Jin Gaoya, Shi Wenqian. Algorithm for Entity Coreference Resolution with Neural Network and Global Reasoning. Data Analysis and Knowledge Discovery, 2022, 6(8): 75-83.
Link to this article:
https://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/10.11925/infotech.2096-3467.2021.1162   or   https://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/Y2022/V6/I8/75
Fig. 1  Model framework
Fig. 2  Structure of the Bi-LSTM model
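The encoder of Figure 2 can be approximated in a few lines of PyTorch. This is a sketch under stated assumptions, not the authors' implementation; the dimensions follow Table 2 (300-d word vectors, 200-d hidden states, dropout 0.2) and the 3-layer depth that Table 3 finds best:

```python
import torch
import torch.nn as nn

class BiLSTMEncoder(nn.Module):
    """Illustrative Bi-LSTM context encoder, a sketch rather than the
    authors' implementation. Dimensions follow Table 2 (300-d word
    embeddings, 200-d hidden states, dropout 0.2) and the 3-layer
    depth that Table 3 finds best."""

    def __init__(self, emb_dim: int = 300, hidden: int = 200,
                 layers: int = 3, dropout: float = 0.2):
        super().__init__()
        self.lstm = nn.LSTM(emb_dim, hidden, num_layers=layers,
                            bidirectional=True, batch_first=True,
                            dropout=dropout)

    def forward(self, embeddings: torch.Tensor) -> torch.Tensor:
        # embeddings: (batch, seq_len, emb_dim)
        outputs, _ = self.lstm(embeddings)
        return outputs  # (batch, seq_len, 2 * hidden): forward and backward states

# Example: encode a batch of 2 sentences of 12 tokens each
encoder = BiLSTMEncoder()
tokens = torch.randn(2, 12, 300)   # stand-in for GloVe/ELMo embeddings [29,30]
states = encoder(tokens)           # shape: (2, 12, 400)
```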
OntoNotes 5.0                English                 Chinese                 Arabic
Data type                    Train / Dev / Test      Train / Dev / Test      Train / Dev / Test
Words (×10³)                 1,300 / 160 / 170       750 / 110 / 90          240 / 30 / 30
Documents                    2,802 / 343 / 348       1,810 / 252 / 218       359 / 44 / 44
Entities (×10³)              35.1 / 4.5 / 4.5        28.2 / 3.8 / 3.5        8.3 / 0.9 / 0.9
Mentions (×10³)              155.5 / 19.1 / 19.7     102.8 / 14.1 / 12.8     27.5 / 3.3 / 3.2
Coreference chains (×10³)    120.4 / 4.6 / 15.2      74.5 / 10.3 / 9.2       19.2 / 2.3 / 2.2
Table 1  Size of the OntoNotes 5.0 dataset
Parameter                         Value
Learning rate                     0.001
Word embedding dimension          300
Character embedding dimension     8
Maximum number of antecedents     50
Maximum number of sentences       50
Minimum number of antecedents     30
Bi-LSTM hidden layer dimension    200
FFNN hidden layer dimension       150
Dropout                           0.2
Table 2  Model parameter settings
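Table 2 maps naturally onto a training configuration. The sketch below transcribes these values into Python and wires the learning rate into the Adam optimizer cited as [40]; the dict layout and the placeholder module are illustrative assumptions, not the authors' released configuration:

```python
import torch

# Hyperparameters transcribed from Table 2. The dict layout and the
# placeholder module below are illustrative assumptions, not the
# authors' released configuration.
CONFIG = {
    "learning_rate": 0.001,
    "word_emb_dim": 300,
    "char_emb_dim": 8,
    "max_antecedents": 50,
    "max_sentences": 50,
    "min_antecedents": 30,
    "lstm_hidden": 200,
    "ffnn_hidden": 150,
    "dropout": 0.2,
}

model = torch.nn.Linear(10, 1)  # placeholder; stands in for the full coreference model
# Reference [40] is the Adam optimizer used for training.
optimizer = torch.optim.Adam(model.parameters(), lr=CONFIG["learning_rate"])
```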
Bi-LSTM layers    Precision/%    Recall/%    F1/%
1                 71.72          54.57       61.89
2                 71.62          47.15       56.70
3                 73.07          73.95       73.69
4                 68.52          57.19       63.32
Table 3  Comparison of results for different numbers of Bi-LSTM layers
Activation function    Precision/%    Recall/%    F1/%
tanh                   70.62          57.62       63.43
Sigmoid                73.07          73.95       73.69
ReLU                   72.35          52.51       60.79
Table 4  Comparison of results for different activation functions
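Table 4 compares activations inside the scoring feed-forward network (FFNN). Below is a minimal sketch of such a scorer with a swappable activation, assuming PyTorch, the 150-dimensional hidden layer from Table 2, and a hypothetical 600-dimensional input feature vector:

```python
import torch.nn as nn

def make_scorer(in_dim: int, hidden: int = 150, activation=nn.Sigmoid) -> nn.Sequential:
    """Hypothetical pairwise scoring FFNN: features of a (mention,
    antecedent) pair -> one hidden layer -> scalar score. The 150-d
    hidden size and 0.2 dropout follow Table 2; the architecture and
    input size are assumptions."""
    return nn.Sequential(
        nn.Linear(in_dim, hidden),
        activation(),
        nn.Dropout(0.2),
        nn.Linear(hidden, 1),
    )

# The three variants compared in Table 4 (Sigmoid scored best, F1 = 73.69%):
scorers = {
    "tanh": make_scorer(600, activation=nn.Tanh),
    "Sigmoid": make_scorer(600, activation=nn.Sigmoid),
    "ReLU": make_scorer(600, activation=nn.ReLU),
}
```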
λ value    Precision/%    Recall/%    F1/%
0.1        69.86          33.50       45.17
0.2        68.77          45.49       54.68
0.3        68.29          55.58       61.26
0.4        73.07          73.95       73.69
0.5        72.19          50.87       59.61
Table 5  Comparison of results for different values of the threshold λ
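Table 5 reads as a grid search over the threshold λ, with the best F1 (73.69%) at λ = 0.4. A sketch of that sweep follows, where `evaluate` is a placeholder for a full train-and-score run, not a function from the paper:

```python
# Hypothetical sweep over the threshold lambda, mirroring Table 5.
# `evaluate` is a placeholder for a full train-and-score run returning
# (precision, recall, f1); it is not a function from the paper.

def pick_lambda(evaluate, grid=(0.1, 0.2, 0.3, 0.4, 0.5)):
    results = {lam: evaluate(lam) for lam in grid}
    best = max(results, key=lambda lam: results[lam][2])  # maximize F1
    return best, results

# Table 5 reports the best F1 (73.69%) at lambda = 0.4.
```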
Model                               MUC P / R / F1           B-CUBED P / R / F1       CEAF P / R / F1          Avg F1/%
Wiseman et al.[12]                  76.20 / 69.30 / 72.60    66.20 / 55.80 / 60.50    59.40 / 54.90 / 57.10    63.40
Clark et al.[18]                    76.10 / 69.40 / 72.60    65.60 / 56.00 / 60.40    59.40 / 53.00 / 56.00    63.00
Wiseman et al.[15]                  77.50 / 69.80 / 73.40    66.80 / 57.00 / 61.50    62.10 / 53.90 / 57.70    64.20
Clark et al.[19]                    79.20 / 70.40 / 74.60    69.90 / 58.00 / 63.40    63.50 / 55.50 / 59.20    65.70
Clark et al.[20]                    79.90 / 69.30 / 74.20    71.00 / 56.50 / 63.00    63.80 / 54.30 / 58.70    65.30
Lee et al.[13]                      78.40 / 73.40 / 75.80    68.60 / 61.80 / 65.00    62.70 / 59.00 / 60.80    67.20
+Feature                            80.43 / 76.00 / 78.17    72.15 / 64.81 / 68.59    64.82 / 63.01 / 63.90    70.22
+Global reasoning                   80.65 / 81.66 / 81.20    70.81 / 72.38 / 72.08    67.75 / 67.82 / 67.80    73.69
Joshi et al.[24]                    80.20 / 82.40 / 81.30    69.60 / 73.80 / 71.60    69.00 / 68.60 / 68.80    73.90
Joshi et al.[24] +Global reasoning  84.48 / 78.60 / 81.42    69.86 / 76.74 / 73.14    72.08 / 67.35 / 69.73    74.76
Table 6  Comparison of experimental results (P = precision, R = recall; all values in %)
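The final column is the CoNLL average: the unweighted mean of the MUC, B-CUBED, and CEAF F1 scores. For the last row, (81.42 + 73.14 + 69.73) / 3 ≈ 74.76, the headline figure of the paper. A quick check in Python:

```python
def conll_f1(muc_f1: float, bcubed_f1: float, ceaf_f1: float) -> float:
    """CoNLL average: unweighted mean of the three metric F1 scores."""
    return (muc_f1 + bcubed_f1 + ceaf_f1) / 3

# Last row of Table 6 (Joshi et al.[24] + global reasoning):
assert round(conll_f1(81.42, 73.14, 69.73), 2) == 74.76
```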
[1] Hobbs J R. Resolving Pronoun References[J]. Lingua, 1978, 44(4): 311-338. DOI: 10.1016/0024-3841(78)90006-2.
[2] Brennan S E, Friedman M W, Pollard C J. A Centering Approach to Pronouns[C]// Proceedings of the 25th Annual Meeting on Association for Computational Linguistics. 1987: 155-162.
[3] Lappin S, Leass H J. An Algorithm for Pronominal Anaphora Resolution[J]. Computational Linguistics, 1994, 20(4): 535-561.
[4] Lee H, Chang A, Peirsman Y, et al. Deterministic Coreference Resolution Based on Entity-Centric, Precision-Ranked Rules[J]. Computational Linguistics, 2013, 39(4): 885-916. DOI: 10.1162/COLI_a_00152.
[5] Grosz B J, Weinstein S, Joshi A K. Centering: A Framework for Modeling the Local Coherence of Discourse[J]. Computational Linguistics, 1995, 21(2): 203-225.
[6] Soon W M, Ng H T, Lim D C Y. A Machine Learning Approach to Coreference Resolution of Noun Phrases[J]. Computational Linguistics, 2001, 27(4): 521-544. DOI: 10.1162/089120101753342653.
[7] Ng V, Cardie C. Improving Machine Learning Approaches to Coreference Resolution[C]// Proceedings of the 40th Annual Meeting on Association for Computational Linguistics. 2002: 104-111.
[8] Lee H, Surdeanu M, Jurafsky D. A Scaffolding Approach to Coreference Resolution Integrating Statistical and Rule Based Models[J]. Natural Language Engineering, 2017, 23(5): 733-762. DOI: 10.1017/S1351324917000109.
[9] Qian Wei, Guo Yikun, Zhou Yaqian, et al. English Noun Phrase Coreference Resolution via a Maximum Entropy Model[J]. Journal of Computer Research and Development, 2003, 40(9): 1337-1343. (in Chinese)
[10] Mitkov R, Evans R, Orasan C. A New, Fully Automatic Version of Mitkov’s Knowledge Poor Pronoun Resolution Method[C]// Proceedings of the 3rd International Conference on Intelligent Text Processing and Computational Linguistics. 2002: 168-186.
[11] Ge N, Hale J, Charniak E. A Statistical Approach to Anaphora Resolution[C]// Proceedings of the 6th Workshop on Very Large Corpora. 1998: 161-170.
[12] Wiseman S, Rush A M, Shieber S, et al. Learning Anaphoricity and Antecedent Ranking Features for Coreference Resolution[C]// Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing. 2015: 1416-1426.
[13] Lee K, He L H, Lewis M, et al. End-to-End Neural Coreference Resolution[C]// Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing. 2017: 188-197.
[14] Zhang R, dos Santos C N, Yasunaga M, et al. Neural Coreference Resolution with Deep Biaffine Attention by Joint Mention Detection and Mention Clustering[C]// Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers). 2018: 102-107.
[15] Wiseman S, Rush A M, Shieber S M. Learning Global Features for Coreference Resolution[OL]. arXiv Preprint, arXiv:1604.03035.
[16] Teng Jiayue, Li Peifeng, Zhu Qiaoming, et al. Global Inference for Co-reference Resolution Between Chinese Events[J]. Acta Scientiarum Naturalium Universitatis Pekinensis, 2016, 52(1): 97-103. (in Chinese)
[17] Zhong Weifeng, Yang Hang, Chen Yubo, et al. Document-Level Event Extraction Based on Joint Labeling and Global Reasoning[J]. Journal of Chinese Information Processing, 2019, 33(9): 88-95. (in Chinese)
[18] Clark K, Manning C D. Entity-Centric Coreference Resolution with Model Stacking[C]// Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers). 2015: 1405-1415.
[19] Clark K, Manning C D. Deep Reinforcement Learning for Mention-Ranking Coreference Models[C]// Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing. 2016: 2256-2262.
[20] Clark K, Manning C D. Improving Coreference Resolution by Learning Entity Level Distributed Representations[C]// Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics. 2016: 643-653.
[21] Zhang Z Y, Han X, Liu Z Y, et al. ERNIE: Enhanced Language Representation with Informative Entities[C]// Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. 2019: 1441-1451.
[22] Lee K, He L H, Zettlemoyer L. Higher-Order Coreference Resolution with Coarse-to-fine Inference[C]// Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics:Human Language Technologies, Volume 2 (Short Papers). 2018: 687-692.
[23] Khosla S, Rose C. Using Type Information to Improve Entity Coreference Resolution[OL]. arXiv Preprint, arXiv:2010.05738.
[24] Joshi M, Levy O, Zettlemoyer L, et al. BERT for Coreference Resolution: Baselines and Analysis[C]// Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing. 2019: 5803-5808.
[25] Joshi M, Chen D Q, Liu Y H, et al. SpanBERT: Improving Pre-training by Representing and Predicting Spans[J]. Transactions of the Association for Computational Linguistics, 2020, 8: 64-77. DOI: 10.1162/tacl_a_00300.
[26] Aralikatte R, Lamm M, Hardt D, et al. Ellipsis and Coreference Resolution as Question Answering[OL]. arXiv Preprint, arXiv:1908.11141.
[27] Wu W, Wang F, Yuan A, et al. CorefQA: Coreference Resolution as Query-Based Span Prediction[C]// Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. 2020: 6953-6963.
[28] Dobrovolskii V. Word-Level Coreference Resolution[C]// Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing. 2021: 7670-7675.
[29] Pennington J, Socher R, Manning C. GloVe: Global Vectors for Word Representation[C]// Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing. 2014: 1532-1543.
[30] Peters M E, Neumann M, Iyyer M, et al. Deep Contextualized Word Representations[OL]. arXiv Preprint, arXiv:1802.05365.
[31] Zhou Feiyan, Jin Linpeng, Dong Jun. Review of Convolutional Neural Network[J]. Chinese Journal of Computers, 2017, 40(6): 1229-1251. (in Chinese)
[32] Jin Chen, Li Weihua, Ji Chen, et al. Bi-directional Long Short-Term Memory Neural Networks for Chinese Word Segmentation[J]. Journal of Chinese Information Processing, 2018, 32(2): 29-37. (in Chinese)
[33] Vaswani A, Shazeer N, Parmar N, et al. Attention is All You Need[OL]. arXiv Preprint, arXiv:1706.03762.
[34] Hirschman L, Robinson P, Burger J D, et al. Automating Coreference: The Role of Annotated Training Data[C]// Proceedings of the AAAI Spring Symposium on Applying Machine Learning to Discourse Processing. 1997: 118-121.
[35] Vilain M, Burger J, Aberdeen J, et al. A Model-Theoretic Coreference Scoring Scheme[C]// Proceedings of the 6th Message Understanding Conference (MUC-6). 1995: 45-52.
[36] Doddington G R, Mitchell A, Przybocki M A, et al. The Automatic Content Extraction (ACE) Program - Tasks, Data, and Evaluation[C]// Proceedings of the 4th International Conference on Language Resources and Evaluation. 2004: 837-840.
[37] Weischedel R, Palmer M, Marcus M, et al. OntoNotes Release 5.0 LDC2013T19[DS/OL]. Philadelphia: Linguistic Data Consortium, 2013. https://catalog.ldc.upenn.edu/LDC2013T19.
[38] Bagga A, Baldwin B. Algorithms for Scoring Coreference Chains[C]// Proceedings of the 1st International Conference on Language Resources and Evaluation Workshop on Linguistic Coreference. 1998: 563-566.
[39] Luo X Q. On Coreference Resolution Performance Metrics[C]// Proceedings of the Conference on Human Language Technology and Empirical Methods in Natural Language Processing. 2005: 25-32.
[40] Kingma D P, Ba J. Adam: A Method for Stochastic Optimization[C]// Proceedings of the 3rd International Conference on Learning Representations. 2015: 1-13.