Data Analysis and Knowledge Discovery, 2022, Vol. 6, Issue 8: 75-83. DOI: 10.11925/infotech.2096-3467.2021.1162
Algorithm for Entity Coreference Resolution with Neural Network and Global Reasoning
Zhou Ning, Jin Gaoya, Shi Wenqian
School of Electronic and Information Engineering, Lanzhou Jiaotong University, Lanzhou 730070, China
Abstract  

[Objective] This paper proposes an entity coreference resolution model that integrates a neural network with global reasoning. It addresses the complexity of entity information in text as well as the ambiguity and sparse distribution of referential information. [Methods] First, we used a neural network model to extract entities and their antecedents from documents. Then, we performed global reasoning over the contextual information of the sentences. Finally, we fed the reasoning results back into the neural network model to improve the accuracy of entity coreference resolution. [Results] We evaluated the new model on the OntoNotes 5.0 dataset, where its F1 score reached 74.76% under the CoNLL evaluation standard. [Limitations] More precise knowledge reasoning still needs to be incorporated. [Conclusions] Compared with existing models, the proposed algorithm improves coreference resolution performance and better captures the semantic information of the text.
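To make the three stages concrete, the toy sketch below (plain Python; the scoring rules, function names, and mention list are ours, not the paper's) mirrors the flow the abstract describes: neural antecedent scoring, global reasoning over shared context, and fusing the reasoning results back into the neural scores before linking.

    # Toy sketch of the three-stage pipeline from the abstract.
    # All scoring rules here are illustrative stand-ins, not the paper's model.
    from itertools import combinations

    def neural_antecedent_scores(mentions):
        # Stand-in for the neural mention-pair scorer: exact-match evidence.
        return {(i, j): 1.0 if mentions[i].lower() == mentions[j].lower() else 0.1
                for i, j in combinations(range(len(mentions)), 2)}

    def global_reasoning_scores(mentions):
        # Stand-in for global reasoning: evidence from head words shared
        # across the whole document rather than isolated pairs.
        return {(i, j): 1.0 if mentions[i].split()[-1].lower()
                == mentions[j].split()[-1].lower() else 0.0
                for i, j in combinations(range(len(mentions)), 2)}

    def resolve(mentions, lam=0.4):
        neural = neural_antecedent_scores(mentions)
        reasoning = global_reasoning_scores(mentions)
        # Fold the reasoning results back into the neural scores, then keep
        # pairs whose fused score clears the threshold.
        return [p for p in neural
                if (1 - lam) * neural[p] + lam * reasoning[p] > lam]

    print(resolve(["Barack Obama", "the senator", "Obama", "she"]))
    # -> [(0, 2)]: "Barack Obama" and "Obama" are linked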

Key words: Neural Network; Coreference Resolution; Entity Disambiguation; Global Reasoning
Received: 14 October 2021      Published: 23 September 2022
CLC Number: TP391
Fund: National Natural Science Foundation of China (61650207); Tianyou Innovation Team of Lanzhou Jiaotong University (TY202003)
Corresponding Author: Zhou Ning, ORCID: 0000-0001-7466-8925, E-mail: zhouning@mail.lzjtu.cn

Cite this article:

Zhou Ning, Jin Gaoya, Shi Wenqian. Algorithm for Entity Coreference Resolution with Neural Network and Global Reasoning. Data Analysis and Knowledge Discovery, 2022, 6(8): 75-83.

URL:

https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.2096-3467.2021.1162     OR     https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2022/V6/I8/75

[Figure] The Framework of the Model
[Figure] Bi-LSTM Cell Structure
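For reference, here is a minimal PyTorch sketch of a Bi-LSTM encoder like the one in the figure, using the dimensions from the parameter table below (300-dim word embeddings, 200-dim hidden layer, dropout 0.2) and the 3-layer depth that performs best in the experiments; it is our reconstruction, not the authors' released code.

    import torch
    import torch.nn as nn

    class BiLSTMEncoder(nn.Module):
        """Illustrative Bi-LSTM encoder with the reported dimensions:
        300-dim word embeddings, 200-dim hidden layer, 3 layers, dropout 0.2."""
        def __init__(self, vocab_size, embed_dim=300, hidden_dim=200,
                     num_layers=3, dropout=0.2):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, embed_dim)
            self.bilstm = nn.LSTM(embed_dim, hidden_dim, num_layers=num_layers,
                                  dropout=dropout, bidirectional=True,
                                  batch_first=True)

        def forward(self, token_ids):
            # Each token gets 2 * hidden_dim features (forward + backward pass).
            embedded = self.embed(token_ids)
            outputs, _ = self.bilstm(embedded)
            return outputs

    encoder = BiLSTMEncoder(vocab_size=10000)
    tokens = torch.randint(0, 10000, (1, 12))   # one 12-token sentence
    print(encoder(tokens).shape)                # torch.Size([1, 12, 400])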
OntoNotes 5.0                 English                  Chinese                  Arabic
Data Type                Train   Dev    Test      Train   Dev    Test      Train  Dev   Test
Words (×10³)             1,300   160    170         750   110     90         240   30    30
Documents                2,802   343    348       1,810   252    218         359   44    44
Entities (×10³)           35.1   4.5    4.5        28.2   3.8    3.5         8.3  0.9   0.9
Mentions (×10³)          155.5  19.1   19.7       102.8  14.1   12.8        27.5  3.3   3.2
Coreference chains (×10³) 120.4  4.6   15.2        74.5  10.3    9.2        19.2  2.3   2.2
OntoNotes 5.0 Dataset Size
Parameter                         Value
Learning rate                     0.001
Word embedding dimension          300
Character embedding dimension     8
Maximum number of antecedents     50
Maximum number of sentences       50
Minimum number of antecedents     30
Bi-LSTM hidden layer dimension    200
FFNN hidden layer dimension       150
Dropout rate                      0.2
Parameter Settings
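Collected as a configuration object, these settings would look roughly as follows; the field names, and the pairing of the learning rate with the Adam optimizer [40], are our assumptions.

    from dataclasses import dataclass

    @dataclass
    class Config:
        # Hyperparameters as reported in the parameter-setting table;
        # the field names are ours, not the authors'.
        learning_rate: float = 0.001    # assumed to drive Adam [40]
        word_embed_dim: int = 300
        char_embed_dim: int = 8
        max_antecedents: int = 50
        max_sentences: int = 50
        min_antecedents: int = 30
        bilstm_hidden_dim: int = 200
        ffnn_hidden_dim: int = 150
        dropout: float = 0.2

    cfg = Config()
    print(cfg.learning_rate, cfg.bilstm_hidden_dim)   # 0.001 200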
Bi-LSTM Layers    Precision/%    Recall/%    F1/%
1                 71.72          54.57       61.89
2                 71.62          47.15       56.70
3                 73.07          73.95       73.69
4                 68.52          57.19       63.32
Experimental Results for Different Numbers of Bi-LSTM Layers
Activation Function    Precision/%    Recall/%    F1/%
tanh                   70.62          57.62       63.43
Sigmoid                73.07          73.95       73.69
ReLU                   72.35          52.51       60.79
Experimental Results for Different Activation Functions
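The Sigmoid row above refers to the activation inside the FFNN scorer. Below is a sketch of such a mention-pair scorer with the 150-dim hidden layer from the parameter table; the 400-dim span inputs (2 × the Bi-LSTM hidden size) and the overall wiring are our assumptions, not the paper's exact architecture.

    import torch
    import torch.nn as nn

    class PairScorer(nn.Module):
        """Illustrative mention-pair FFNN with the Sigmoid activation that
        scored best above and a 150-dim hidden layer; the 800-dim input
        (two assumed 400-dim Bi-LSTM span vectors) is our assumption."""
        def __init__(self, span_dim=400, hidden_dim=150, dropout=0.2):
            super().__init__()
            self.ffnn = nn.Sequential(
                nn.Linear(2 * span_dim, hidden_dim),
                nn.Sigmoid(),              # best of tanh / Sigmoid / ReLU here
                nn.Dropout(dropout),
                nn.Linear(hidden_dim, 1),
            )

        def forward(self, mention_repr, antecedent_repr):
            pair = torch.cat([mention_repr, antecedent_repr], dim=-1)
            return self.ffnn(pair).squeeze(-1)   # one score per pair

    scorer = PairScorer()
    m, a = torch.randn(5, 400), torch.randn(5, 400)
    print(scorer(m, a).shape)   # torch.Size([5])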
λ Value    Precision/%    Recall/%    F1/%
0.1        69.86          33.50       45.17
0.2        68.77          45.49       54.68
0.3        68.29          55.58       61.26
0.4        73.07          73.95       73.69
0.5        72.19          50.87       59.61
Experimental Results for Different Threshold λ Values
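This table reads as a threshold sweep: raising λ prunes candidate links, trading recall against precision until the optimum at 0.4. The toy sweep below, with made-up scores and gold pairs, shows the mechanics only; the numbers are not the paper's.

    # Toy threshold sweep: how a cutoff lambda turns fused scores into links,
    # and how precision/recall/F1 respond. Scores and gold pairs are made up.
    scores = {("m1", "m2"): 0.46, ("m1", "m3"): 0.31,
              ("m2", "m3"): 0.55, ("m3", "m4"): 0.12}
    gold = {("m1", "m2"), ("m2", "m3"), ("m3", "m4")}

    for lam in (0.1, 0.2, 0.3, 0.4, 0.5):
        predicted = {pair for pair, s in scores.items() if s > lam}
        tp = len(predicted & gold)
        p = tp / len(predicted) if predicted else 0.0
        r = tp / len(gold)
        f1 = 2 * p * r / (p + r) if p + r else 0.0
        print(f"lambda={lam}: P={p:.2f} R={r:.2f} F1={f1:.2f}")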
Model                                   MUC                      B-CUBED                   CEAF               Avg. F1/%
                                  P/%    R/%    F1/%       P/%    R/%    F1/%       P/%    R/%    F1/%
Wiseman et al.[12]               76.20  69.30  72.60      66.20  55.80  60.50      59.40  54.90  57.10      63.40
Clark et al.[18]                 76.10  69.40  72.60      65.60  56.00  60.40      59.40  53.00  56.00      63.00
Wiseman et al.[15]               77.50  69.80  73.40      66.80  57.00  61.50      62.10  53.90  57.70      64.20
Clark et al.[19]                 79.20  70.40  74.60      69.90  58.00  63.40      63.50  55.50  59.20      65.70
Clark et al.[20]                 79.90  69.30  74.20      71.00  56.50  63.00      63.80  54.30  58.70      65.30
Lee et al.[13]                   78.40  73.40  75.80      68.60  61.80  65.00      62.70  59.00  60.80      67.20
+Feature                         80.43  76.00  78.17      72.15  64.81  68.59      64.82  63.01  63.90      70.22
+Global reasoning                80.65  81.66  81.20      70.81  72.38  72.08      67.75  67.82  67.80      73.69
Joshi et al.[24]                 80.20  82.40  81.30      69.60  73.80  71.60      69.00  68.60  68.80      73.90
Joshi et al.[24]+Global reasoning 84.48 78.60  81.42      69.86  76.74  73.14      72.08  67.35  69.73      74.76
The Comparison of Experimental Results
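The final column is the CoNLL score, i.e., the unweighted mean of the MUC, B-CUBED, and CEAF F1 values; checking the last row:

    # CoNLL score = unweighted mean of the MUC, B-CUBED, and CEAF F1 values;
    # verifying the last row of the comparison table above.
    muc_f1, b3_f1, ceaf_f1 = 81.42, 73.14, 69.73
    print(round((muc_f1 + b3_f1 + ceaf_f1) / 3, 2))   # 74.76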
[1] Hobbs J R. Resolving Pronoun References[J]. Lingua, 1978, 44(4): 311-338.
doi: 10.1016/0024-3841(78)90006-2
[2] Brennan S E, Friedman M W, Pollard C J. A Centering Approach to Pronouns[C]// Proceedings of the 25th Annual Meeting on Association for Computational Linguistics. 1987: 155-162.
[3] Lappin S, Leass H J. An Algorithm for Pronominal Anaphora Resolution[J]. Computational Linguistics, 1994, 20(4): 535-561.
[4] Lee H, Chang A, Peirsman Y, et al. Deterministic Coreference Resolution Based on Entity-Centric, Precision-Ranked Rules[J]. Computational Linguistics, 2013, 39(4): 885-916.
doi: 10.1162/COLI_a_00152
[5] Grosz B J, Weinstein S, Joshi A K. Centering: A Framework for Modeling the Local Coherence of Discourse[J]. Computational Linguistics, 1995, 21(2): 203-225.
[6] Soon W M, Ng H T, Lim D C Y. A Machine Learning Approach to Coreference Resolution of Noun Phrases[J]. Computational Linguistics, 2001, 27(4): 521-544.
doi: 10.1162/089120101753342653
[7] Ng V, Cardie C. Improving Machine Learning Approaches to Coreference Resolution[C]// Proceedings of the 40th Annual Meeting on Association for Computational Linguistics. 2002: 104-111.
[8] Lee H, Surdeanu M, Jurafsky D. A Scaffolding Approach to Coreference Resolution Integrating Statistical and Rule Based Models[J]. Natural Language Engineering, 2017, 23(5): 733-762.
doi: 10.1017/S1351324917000109
[9] Qian Wei, Guo Yikun, Zhou Yaqian, et al. English Noun Phrase Coreference Resolution via a Maximum Entropy Model[J]. Journal of Computer Research and Development, 2003, 40(9): 1337-1343. (in Chinese)
[10] Mitkov R, Evans R, Orasan C. A New, Fully Automatic Version of Mitkov’s Knowledge Poor Pronoun Resolution Method[C]// Proceedings of the 3rd International Conference on Intelligent Text Processing and Computational Linguistics. 2002:168-186.
[11] Ge N, Hale J, Charniak E. A Statistical Approach to Anaphora Resolution[C]// Proceedings of the 6th Workshop on Very Large Corpora. 1998: 161-170.
[12] Wiseman S, Rush A M, Shieber S, et al. Learning Anaphoricity and Antecedent Ranking Features for Coreference Resolution[C]// Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing. 2015: 1416-1426.
[13] Lee K, He L H, Lewis M, et al. End-to-End Neural Coreference Resolution[C]// Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing. 2017:188-197.
[14] Zhang R, dos Santos C N, Yasunaga M, et al. Neural Coreference Resolution with Deep Biaffine Attention by Joint Mention Detection and Mention Clustering[C]// Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics(Volume 2:Short Papers). 2018: 102-107.
[15] Wiseman S, Rush A M, Shieber S M. Learning Global Features for Coreference Resolution[OL]. arXiv Preprint, arXiv:1604.03035.
[16] Teng Jiayue, Li Peifeng, Zhu Qiaoming, et al. Global Inference for Co-reference Resolution Between Chinese Events[J]. Acta Scientiarum Naturalium Universitatis Pekinensis, 2016, 52(1): 97-103. (in Chinese)
[17] Zhong Weifeng, Yang Hang, Chen Yubo, et al. Document-Level Event Extraction Based on Joint Labeling and Global Reasoning[J]. Journal of Chinese Information Processing, 2019, 33(9): 88-95. (in Chinese)
[18] Clark K, Manning C D. Entity-Centric Coreference Resolution with Model Stacking[C]// Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1:Long Papers). 2015: 1405-1415.
[19] Clark K, Manning C D. Deep Reinforcement Learning for Mention-Ranking Coreference Models[C]// Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing. 2016: 2256-2262.
[20] Clark K, Manning C D. Improving Coreference Resolution by Learning Entity Level Distributed Representations[C]// Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics. 2016: 643-653.
[21] Zhang Z Y, Han X, Liu Z Y, et al. ERNIE: Enhanced Language Representation with Informative Entities[C]// Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. 2019: 1441-1451.
[22] Lee K, He L H, Zettlemoyer L. Higher-Order Coreference Resolution with Coarse-to-fine Inference[C]// Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics:Human Language Technologies, Volume 2 (Short Papers). 2018: 687-692.
[23] Khosla S, Rose C. Using Type Information to Improve Entity Coreference Resolution[OL]. arXiv Preprint, arXiv:2010.05738.
[24] Joshi M, Levy O, Zettlemoyer L, et al. BERT for Coreference Resolution: Baselines and Analysis[C]// Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing. 2019: 5803-5808.
[25] Joshi M, Chen D Q, Liu Y H, et al. SpanBERT: Improving Pre-training by Representing and Predicting Spans[J]. Transactions of the Association for Computational Linguistics, 2020, 8: 64-77.
doi: 10.1162/tacl_a_00300
[26] Aralikatte R, Lamm M, Hardt D, et al. Ellipsis and Coreference Resolution as Question Answering[OL]. arXiv Preprint, arXiv:1908.11141.
[27] Wu W, Wang F, Yuan A, et al. CorefQA: Coreference Resolution as Query-Based Span Prediction[C]// Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. 2020: 6953-6963.
[28] Dobrovolskii V. Word-Level Coreference Resolution[C]// Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing. 2021:7670-7675.
[29] Pennington J, Socher R, Manning C. GloVe: Global Vectors for Word Representation[C]// Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing. 2014: 1532-1543.
[30] Peters M E, Neumann M, Iyyer M, et al. Deep Contextualized Word Representations[OL]. arXiv Preprint, arXiv:1802.05365.
[31] Zhou Feiyan, Jin Linpeng, Dong Jun. Review of Convolutional Neural Network[J]. Chinese Journal of Computers, 2017, 40(6): 1229-1251. (in Chinese)
[32] Jin Chen, Li Weihua, Ji Chen, et al. Bi-directional Long Short-Term Memory Neural Networks for Chinese Word Segmentation[J]. Journal of Chinese Information Processing, 2018, 32(2): 29-37. (in Chinese)
[33] Vaswani A, Shazeer N, Parmar N, et al. Attention is All You Need[OL]. arXiv Preprint, arXiv:1706.03762.
[34] Hirschman L, Robinson P, Burger J D, et al. Automating Coreference: The Role of Annotated Training Data[C]// Proceedings of the AAAI Spring Symposium on Applying Machine Learning to Discourse Processing. 1997: 118-121.
[35] Vilain M, Burger J, Aberdeen J, et al. A Model-Theoretic Coreference Scoring Scheme[C]// Proceedings of the 6th Conference on Message Understanding Coreference. 1995: 45-52.
[36] Doddington G R, Mitchell A, Przybocki M A, et al. The Automatic Content Extraction (ACE) Program: Tasks, Data, and Evaluation[C]// Proceedings of the 4th International Conference on Language Resources and Evaluation. 2004: 837-840.
[37] Weischedel R, Palmer M, Marcus M, et al. OntoNotes Release 5.0 LDC2013T19[DS/OL]. Linguistic Data Consortium, 2013. https://catalog.ldc.upenn.edu/LDC2013T19.
[38] Bagga A, Baldwin B. Algorithms for Scoring Coreference Chains[C]// Proceedings of the 1st International Conference on Language Resources and Evaluation Workshop on Linguistics Coreference. 1998: 563-566.
[39] Luo X Q. On Coreference Resolution Performance Metrics[C]// Proceedings of the Conference on Human Language Technology and Conference on Empirical Methods in Natural Language. 2005: 25-32.
[40] Kingma D P, Ba J. Adam: A Method for Stochastic Optimization[C]// Proceedings of the 3rd International Conference on Learning Representations. 2015: 1-13.