Data Analysis and Knowledge Discovery  2022, Vol. 6 Issue (2/3): 117-128     https://doi.org/10.11925/infotech.2096-3467.2021.0965
Joint Extraction Model for Entities and Events with Multi-task Deep Learning*
Yu Chuanming, Lin Hongjun, Zhang Zhengang
School of Information and Safety Engineering, Zhongnan University of Economics and Law, Wuhan 430073, China
Abstract

[Objective] This study exploits the correlation between named entity recognition and event detection to improve the performance of both tasks. [Methods] We propose MDL-J3E, a joint entity and event extraction model based on multi-task learning, which consists of a shared layer, a private layer, and a decoding layer. The shared layer generates common features. The private layer comprises a named entity recognition module and an event detection module, each of which extracts task-specific features on top of the common features. The decoding layer decodes the features of each subtask into a tag sequence that satisfies the constraint rules. [Results] In experiments on the ACE2005 dataset, the proposed model achieved an F1 of 84.15% on named entity recognition and 70.96% on event detection. [Limitations] The multi-task model was not applied to other information extraction scenarios. [Conclusions] Compared with single-task models, the multi-task model performs better on both named entity recognition and event detection.

Key words: Named Entity Recognition; Event Detection; Multi-task Learning; Deep Learning
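Received: 2021-08-31      Published: 2022-04-14
CLC number: TP393
Funding: *This work was supported by the General Program of the National Natural Science Foundation of China (71974202) and is one of the research outputs of a Major Program of the National Natural Science Foundation of China (71790612)
Corresponding author: Yu Chuanming, ORCID: 0000-0001-7099-0853     E-mail: yucm@zuel.edu.cn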
Cite this article:
Yu Chuanming, Lin Hongjun, Zhang Zhengang. Joint Extraction Model for Entities and Events with Multi-task Deep Learning[J]. Data Analysis and Knowledge Discovery, 2022, 6(2/3): 117-128.
Link to this article:
https://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/10.11925/infotech.2096-3467.2021.0965      or      https://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/Y2022/V6/I2/3/117
Fig.1  Joint entity and event extraction model based on multi-task deep learning
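The abstract says the decoding layer outputs a tag sequence that satisfies constraint rules, and the model names in the tables end in CRF. The sketch below illustrates constrained decoding in general terms and is not the authors' implementation. It assumes a BIO tag scheme and a hypothetical three-tag set, and it uses emission scores with hard transition constraints only, whereas a trained CRF would also add learned transition scores.

```python
from typing import Dict, List

# Hypothetical tag set; the page does not state the tagging scheme used.
TAGS = ["O", "B-PER", "I-PER"]


def allowed(prev: str, cur: str) -> bool:
    """BIO rule: I-X may only follow B-X or I-X of the same type."""
    if cur.startswith("I-"):
        return prev in ("B-" + cur[2:], "I-" + cur[2:])
    return True


def viterbi(emissions: List[Dict[str, float]]) -> List[str]:
    """Return the best-scoring tag sequence that never breaks allowed()."""
    neg_inf = float("-inf")
    # A sequence may not start with an I- tag.
    score = {t: (neg_inf if t.startswith("I-") else emissions[0][t]) for t in TAGS}
    back = []
    for em in emissions[1:]:
        new_score, pointers = {}, {}
        for cur in TAGS:
            best_prev, best = None, neg_inf
            for prev in TAGS:
                if allowed(prev, cur) and score[prev] > best:
                    best_prev, best = prev, score[prev]
            new_score[cur] = best + em[cur] if best_prev is not None else neg_inf
            pointers[cur] = best_prev
        score = new_score
        back.append(pointers)
    # Follow back-pointers from the best final tag.
    path = [max(TAGS, key=lambda t: score[t])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]


# Greedy per-token argmax would give I-PER, I-PER, O, which starts illegally.
emissions = [
    {"O": 0.1, "B-PER": 0.2, "I-PER": 0.9},
    {"O": 0.1, "B-PER": 0.1, "I-PER": 0.8},
    {"O": 0.9, "B-PER": 0.0, "I-PER": 0.1},
]
print(viterbi(emissions))
```

With these made-up scores the constrained decoder returns B-PER, I-PER, O, because the illegal I-PER start is excluded before the search begins.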
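Hyperparameter group                  Parameter           Value
Training                              batch_size          16
Training                              learning_rate       1e-5
Training                              crf_lr_multiplier   1000
Training                              optimizer           Adam
Shared layer module                   L                   12
Shared layer module                   A                   12
Named entity recognition submodule    N                   4
Named entity recognition submodule    M                   3
Named entity recognition submodule    dilation_rate       1, 2, 4
Event detection submodule             lstm_dim            128
Event detection submodule             dropout             0.5
Event detection submodule             kernel_size         1, 2, 3
Table 1  Parameter settings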
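As a rough illustration, the named settings in Table 1 could be collected in a small configuration object. The class name, field names, and both helper computations are assumptions made for this sketch. In particular, it assumes that `crf_lr_multiplier` scales the base learning rate for the CRF parameters, and the kernel width of 3 in the receptive-field example is hypothetical because Table 1 gives no kernel width for the named entity recognition submodule. L, A, N, and M are left out because this page does not define them.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class TrainConfig:
    # Values copied from Table 1; field names are illustrative.
    batch_size: int = 16
    learning_rate: float = 1e-5
    crf_lr_multiplier: int = 1000
    optimizer: str = "Adam"
    dilation_rates: Tuple[int, ...] = (1, 2, 4)   # NER submodule
    lstm_dim: int = 128                           # ED submodule
    dropout: float = 0.5                          # ED submodule
    kernel_sizes: Tuple[int, ...] = (1, 2, 3)     # ED submodule

    @property
    def crf_learning_rate(self) -> float:
        # Assumption: the multiplier scales the base rate for CRF parameters.
        return self.learning_rate * self.crf_lr_multiplier


def receptive_field(kernel_size: int, dilations: Tuple[int, ...]) -> int:
    """Receptive field of stacked stride-1 dilated convolutions."""
    return 1 + (kernel_size - 1) * sum(dilations)


cfg = TrainConfig()
print(cfg.crf_learning_rate)
# Kernel width 3 is a hypothetical choice for illustration only.
print(receptive_field(3, cfg.dilation_rates))
```

Under these assumptions the CRF parameters would train at roughly 1e-2, and three stacked width-3 convolutions with dilation rates 1, 2, 4 would cover 15 tokens.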
Model                       NER P/%   NER R/%   NER F1/%  ED P/%    ED R/%    ED F1/%
Layered-BiLSTM-CRF          74.20     70.30     72.20     -         -         -
GEANN                       77.10     73.30     75.20     -         -         -
BiFlaG                      75.00     75.20     75.10     -         -         -
Merge and Label [ELMO]      79.70     78.00     78.90     -         -         -
Merge and Label [BERT]      82.70     82.10     82.40     -         -         -
JOINTEVENTENTITY            -         -         -         75.10     63.30     68.70
DMCNN                       -         -         -         75.60     63.60     69.10
FN-ANN                      -         -         -         79.50     60.70     68.80
BDLSTM-TNNs                 -         -         -         75.30     63.40     68.90
JRNN                        -         -         -         66.00     73.00     69.30
TD-DMN                      -         -         -         65.80     65.90     65.60
RNN_AL                      -         -         -         77.40     61.30     67.80
GAIL                        -         -         -         74.20     65.30     69.50
Conv-BiLSTM                 -         -         -         74.70     64.90     69.50
ANN-Gold2                   -         -         -         81.40     66.90     73.40
HNN-EE                      84.00     82.50     83.20     74.40     67.30     70.60
Single-task NER (MDL-J3E)   83.86     84.10     83.98     -         -         -
Single-task ED (MDL-J3E)    -         -         -         66.67     74.25     70.25
MDL-J3E                     83.48     84.83     84.15     69.16     72.85     70.96
Table 2  Comparison of results for the multi-task learning model (NER: named entity recognition; ED: event detection)
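The F1 columns in Table 2 can be related to the P and R columns through the standard harmonic-mean definition, F1 = 2PR / (P + R). The short check below recomputes the two MDL-J3E F1 values from the reported precision and recall. The function name is illustrative, and small differences in other rows can arise because P and R are themselves rounded.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (same unit as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall (in %) for MDL-J3E, copied from Table 2.
ner_f1 = f1_score(83.48, 84.83)
ed_f1 = f1_score(69.16, 72.85)
print(round(ner_f1, 2), round(ed_f1, 2))
```

Both recomputed values agree with the reported 84.15 and 70.96 to two decimal places.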
Model                                 NER P/%   NER R/%   NER F1/%  ED P/%    ED R/%    ED F1/%
CRF&LSTM-CRF                          83.52     84.56     84.04     69.04     71.93     70.46
CRF&LSTM-ATT-CRF                      83.90     83.71     83.81     68.93     68.45     70.70
CRF&RACNN-CRF                         82.43     84.89     83.64     69.41     70.53     69.97
DGCNN(1)-CRF&RACNN-CRF                83.83     83.94     83.89     72.93     67.52     70.12
IDCNN-CRF&RACNN-CRF                   84.10     83.19     83.64     68.89     69.37     69.13
LSTM-CRF&RACNN-CRF                    83.88     83.74     83.81     73.43     67.98     70.60
IDGCNN-CRF&RACNN-CRF (independent)    83.70     84.46     84.08     66.46     74.48     70.24
IDGCNN-CRF&RACNN-CRF (MDL-J3E)        83.48     84.83     84.15     69.16     72.85     70.96
IDGCNN-CRF&RACNN-CRF (dropout=0.2)    81.42     81.84     81.63     69.50     70.30     69.90
IDGCNN-CRF&RACNN-CRF (dropout=0.5)    83.52     84.07     83.80     69.66     71.93     70.78
Table 3  Effect of model architecture on the multi-task learning model
Loss ratio    P/%       R/%       F1/%
1:1           69.16     72.85     70.96
1:3           67.17     72.62     69.79
1:5           68.07     71.23     69.62
1:7           70.53     70.53     70.53
1:9           67.74     73.09     70.31
1:13          71.08     67.29     69.13
Table 4  Effect of the loss-function hyperparameter on the model
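Table 4 varies the ratio between the two task losses. A common way to combine losses in multi-task learning is a weighted sum, sketched below. The page does not say which side of the ratio belongs to which task or whether the weights are normalised, so the argument order (named entity recognition first, event detection second) and the unnormalised sum are assumptions, and all names are illustrative.

```python
from typing import Tuple


def parse_ratio(text: str) -> Tuple[float, float]:
    """Turn a ratio string such as '1:3' into a pair of weights."""
    left, right = text.split(":")
    return float(left), float(right)


def joint_loss(loss_ner: float, loss_ed: float, ratio: str = "1:1") -> float:
    """Weighted sum of the two task losses.

    Assumption: the first number weights the NER loss and the second
    weights the ED loss; the source does not state the order.
    """
    w_ner, w_ed = parse_ratio(ratio)
    return w_ner * loss_ner + w_ed * loss_ed


# Made-up loss values, for illustration only.
print(joint_loss(0.5, 0.25, "1:1"))
print(joint_loss(0.5, 0.25, "1:3"))
```

Under this reading, a larger second number makes the second task's gradient dominate the shared layer. In Table 4 the 1:1 setting has the highest F1.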
Model                               NER P/%   NER R/%   NER F1/%  ED P/%    ED R/%    ED F1/%
LSTM-CRF1-CRF2                      83.12     84.43     83.77     69.20     69.84     69.52
IDGCNN-CRF1-CRF2                    84.16     84.10     84.13     68.55     73.32     70.85
RACNN-CRF1-CRF2                     82.79     84.66     83.71     70.00     69.84     69.92
IDGCNN-CRF1&RACNN-CRF2 (MDL-J3E)    83.48     84.83     84.15     69.16     72.85     70.96
Table 5  Effect of parameter sharing in the private layer on the multi-task learning model