Data Analysis and Knowledge Discovery  2020, Vol. 4 Issue (5): 46-53     https://doi.org/10.11925/infotech.2096-3467.2019.1321
Research Article
Coreference Resolution Based on Dynamic Semantic Attention
Deng Siyi,Le Xiaoqiu()
National Science Library, Chinese Academy of Sciences, Beijing 100190, China
Department of Library, Information and Archives Management, School of Economics and Management, University of Chinese Academy of Sciences, Beijing 100190, China
Abstract

[Objective] To explore a more effective coreference resolution method for cases where the antecedent expression is complex and the anaphor's meaning is ambiguous. [Methods] We adopt an end-to-end framework and identify coreference relations by score ranking. First, every contiguous word sequence (span) in a text passage is scored to judge whether it is a "mention"; the surviving candidate mentions are then scored pairwise for coreference. Span modeling uses a dynamic semantic attention mechanism, which injects external word semantics better matched to the current coreference relation, and internal attention encoding, which highlights the part of the antecedent expression associated with the anaphor. The two scores are combined and ranked to produce the final result. [Results] In experiments on the English data of the CoNLL-2012 shared task, built on the OntoNotes 5.0 corpus, precision, recall, and F1 were 2.02%, 0.42%, and 1.14% higher than the baseline model under the same parameter settings. [Limitations] The source corpus for the external semantic representations is limited and should be enriched. The training corpora are all general-purpose texts such as news, talk shows, and weblogs; adding sci-tech literature would create richer coreference contexts and allow evaluating the model across them. [Conclusions] When building span representations, the dynamic semantic attention module injects semantic features that benefit the current coreference decision; dynamic, selective injection of external semantics improves coreference identification.
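The score-and-rank pipeline described above (score every candidate span as a mention, prune, then score candidate mention pairs, summing the two kinds of scores and ranking) can be sketched as follows. This is a minimal illustration of the general scheme, not the paper's implementation: the linear scorers, the dimensions, and the top-4 pruning threshold are all placeholder assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def mention_score(span_vecs, w):
    # s_m(i): one "is this span a mention?" score per candidate span.
    # A single linear map stands in for the paper's feed-forward scorer.
    return span_vecs @ w

def pair_score(cand, w_pair):
    # s_a(i, j): score for "span j is an antecedent of span i", computed
    # from the concatenated representations of the two spans.
    n = len(cand)
    scores = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            scores[i, j] = np.concatenate([cand[i], cand[j]]) @ w_pair
    return scores

span_vecs = rng.normal(size=(6, 8))   # 6 candidate spans, each a dim-8 vector
w_m = rng.normal(size=8)
w_p = rng.normal(size=16)

s_m = mention_score(span_vecs, w_m)
keep = np.argsort(s_m)[-4:]           # prune: keep only the top-4 scoring mentions
cand = span_vecs[keep]
s_a = pair_score(cand, w_p)
# final pair score = s_m(i) + s_m(j) + s_a(i, j); rank to pick each antecedent
total = s_m[keep][:, None] + s_m[keep][None, :] + s_a
antecedent = total.argmax(axis=1)
print(total.shape, antecedent.shape)
```

In the real model the span vectors would come from a contextual encoder rather than random draws, and training would compare ranked antecedents against gold coreference clusters.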

Abstract

[Objective] This paper tries to identify coreference more effectively, aiming to address the issues of ambiguous anaphor meaning and complex antecedent structure. [Methods] We established an end-to-end framework and used score ranking to identify coreference relationships. Firstly, we calculated scores for all spans to retrieve the "mentions". Then, we used scores of the candidate mention pairs to determine coreference relationships. We also built span representations with multiple external semantic representations. Finally, we combined the scores of the two parts to generate the final list. [Results] We examined our model on the OntoNotes benchmark dataset. The precision, recall and F1 values of our model were 2.02%, 0.42% and 1.14% higher than those of the baseline model. [Limitations] The training data sets only cover news, talk shows, and weblogs. More sci-tech literature is needed to further improve the model's performance. [Conclusions] The proposed model could identify coreferences more effectively.
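The dynamic semantic attention idea (attend over several external semantic vectors of a word, conditioned on its current context, and additionally apply internal attention over the words of an antecedent span) can be sketched roughly as follows. The bilinear scoring form and all dimensions are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def dynamic_semantic_embedding(context_vec, external_vecs, W):
    # Attention over k external embeddings of the same word: the contextual
    # encoding queries each external source, and the weighted sum injects
    # the sense most compatible with the current coreference context.
    scores = external_vecs @ (W @ context_vec)   # (k,) compatibility scores
    alpha = softmax(scores)
    return alpha @ external_vecs                 # (d,) dynamic word semantics

def span_representation(word_vecs, v):
    # Internal (head-finding) attention: weight the words inside a candidate
    # antecedent span so the part related to the anaphor dominates.
    alpha = softmax(word_vecs @ v)
    return alpha @ word_vecs

rng = np.random.default_rng(1)
d, k = 8, 3
ctx = rng.normal(size=d)              # contextual encoding of the word
ext = rng.normal(size=(k, d))         # k external semantic vectors for the word
W = rng.normal(size=(d, d))
word = dynamic_semantic_embedding(ctx, ext, W)

span_words = rng.normal(size=(5, d))  # 5 words inside a candidate span
head = span_representation(span_words, rng.normal(size=d))
print(word.shape, head.shape)
```

The design point is that the mixture weights depend on the current context vector, so the same word can receive different external semantics in different coreference decisions, which is what "dynamic" refers to in the abstract.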

Key words: Coreference Resolution; Dynamic Semantic Attention; Ranking Model; Deep Learning
Received: 2019-11-20      Published: 2020-06-15
CLC Number: G35
Corresponding author: Le Xiaoqiu, E-mail: lexq@mail.las.ac.cn
Cite this article:
Deng Siyi,Le Xiaoqiu. Coreference Resolution Based on Dynamic Semantic Attention. Data Analysis and Knowledge Discovery, 2020, 4(5): 46-53.
Link to this article:
http://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/10.11925/infotech.2096-3467.2019.1321      或      http://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/Y2020/V4/I5/46
Fig. 1  Coreference resolution model based on dynamic semantic attention
Model           Avg. precision (%)   Avg. recall (%)   Avg. F1 (%)
E2E model [5]   72.58                65.12             68.64
Our model       74.60                65.54             69.78
Δ               +2.02                +0.42             +1.14
Table 1  Performance comparison of the models
[1] Steinberger J, Poesio M, Kabadjov M A, et al. Two Uses of Anaphora Resolution in Summarization[J]. Information Processing and Management, 2007, 43(6): 1663-1680.
[2] Gabbard R, Freedman M, Weischedel R. Coreference for Learning to Extract Relations: Yes, Virginia, Coreference Matters[C]// Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies. 2011.
[3] Mitkov R, Evans R, Orăsan C, et al. Coreference Resolution: To What Extent Does It Help NLP Applications?[C]// Proceedings of the 2012 International Conference on Text, Speech and Dialogue. 2012.
[4] Kilicoglu H, Fiszman M, Demner-Fushman D. Interpreting Consumer Health Questions: The Role of Anaphora and Ellipsis[C]// Proceedings of the 2013 Workshop on Biomedical Natural Language Processing. 2013.
[5] Lee K, He L, Lewis M, et al. End-to-End Neural Coreference Resolution[OL]. arXiv Preprint, arXiv:1707.07045, 2017.
[6] Lappin S, Leass H J. An Algorithm for Pronominal Anaphora Resolution[J]. Computational Linguistics, 1994, 20(4): 535-561.
[7] Ng V. Machine Learning for Coreference Resolution: From Local Classification to Global Ranking[C]// Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics (ACL'05). 2005.
[8] Ng V. Supervised Ranking for Pronoun Resolution: Some Recent Improvements[C]// Proceedings of the 20th National Conference on Artificial Intelligence. 2005.
[9] Li D, Miller T, Schuler W. A Pronoun Anaphora Resolution System Based on Factorial Hidden Markov Models[C]// Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies. 2011.
[10] Zhang H, Song Y, Song Y Q. Incorporating Context and External Knowledge for Pronoun Coreference Resolution[OL]. arXiv Preprint, arXiv:1905.10238, 2019.
[11] Subramanian S, Roth D. Improving Generalization in Coreference Resolution via Adversarial Training[OL]. arXiv Preprint, arXiv:1908.04728, 2019.
[12] Zhang R, Santos C N D, Yasunaga M, et al. Neural Coreference Resolution with Deep Biaffine Attention by Joint Mention Detection and Mention Clustering[OL]. arXiv Preprint, arXiv:1805.04893, 2018.
[13] Peng H, Khashabi D, Roth D. Solving Hard Coreference Problems[OL]. arXiv Preprint, arXiv:1907.05524, 2019.
[14] Jindal P, Roth D. End-to-End Coreference Resolution for Clinical Narratives[C]// Proceedings of the 23rd International Joint Conference on Artificial Intelligence. 2013.
[15] Trieu L, Nguyen N, Miwa M, et al. Investigating Domain-Specific Information for Neural Coreference Resolution on Biomedical Texts[C]// Proceedings of the 2018 Workshop on Biomedical Natural Language Processing. 2018.
[16] Rahman A, Ng V. Coreference Resolution with World Knowledge[C]// Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies. 2011.
[17] Zhang H, Song Y, Song Y, et al. Knowledge-Aware Pronoun Coreference Resolution[OL]. arXiv Preprint, arXiv:1907.03663, 2019.
[18] Joshi M, Chen D, Liu Y, et al. SpanBERT: Improving Pre-training by Representing and Predicting Spans[OL]. arXiv Preprint, arXiv:1907.10529, 2019.
[19] Song Y, Shi S. Complementary Learning of Word Embeddings[C]// Proceedings of the 27th International Joint Conference on Artificial Intelligence. 2018.
[20] Song Y, Shi S, Li J, et al. Directional Skip-Gram: Explicitly Distinguishing Left and Right Context for Word Embeddings[C]// Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics. 2018.
[21] Hochreiter S, Schmidhuber J. Long Short-Term Memory[J]. Neural Computation, 1997, 9(8): 1735-1780.
[22] Bahdanau D, Cho K, Bengio Y. Neural Machine Translation by Jointly Learning to Align and Translate[OL]. arXiv Preprint, arXiv:1409.0473, 2014.
[23] Pennington J, Socher R, Manning C. GloVe: Global Vectors for Word Representation[C]// Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing. 2014.
[24] Peters M E, Neumann M, Iyyer M, et al. Deep Contextualized Word Representations[OL]. arXiv Preprint, arXiv:1802.05365, 2018.
[25] Zhang X, Zhao J, LeCun Y. Character-level Convolutional Networks for Text Classification[OL]. arXiv Preprint, arXiv:1509.01626, 2015.
[26] Pradhan S, Moschitti A, Xue N, et al. CoNLL-2012 Shared Task: Modeling Multilingual Unrestricted Coreference in OntoNotes[C]// Proceedings of the Joint Conference on EMNLP and CoNLL - Shared Task. 2012.
[27] Vilain M B, Burger J D, Aberdeen J S, et al. A Model-Theoretic Coreference Scoring Scheme[C]// Proceedings of the 6th Conference on Message Understanding. 1995.
[28] Bagga A, Baldwin B. Algorithms for Scoring Coreference Chains[C]// Proceedings of the 1st International Conference on Language Resources and Evaluation Workshop on Linguistics Coreference. 1998.
[29] Luo X. On Coreference Resolution Performance Metrics[C]// Proceedings of the 2005 Conference on Human Language Technology and Empirical Methods in Natural Language Processing. 2005.
[30] Nair V, Hinton G E. Rectified Linear Units Improve Restricted Boltzmann Machines[C]// Proceedings of the 27th International Conference on Machine Learning (ICML-10). 2010.
[31] Kingma D P, Ba J. Adam: A Method for Stochastic Optimization[OL]. arXiv Preprint, arXiv:1412.6980, 2014.