Data Analysis and Knowledge Discovery  2022, Vol. 6 Issue (2/3): 117-128    DOI: 10.11925/infotech.2096-3467.2021.0965
Joint Extraction Model for Entities and Events with Multi-task Deep Learning
Yu Chuanming, Lin Hongjun, Zhang Zhengang
School of Information and Safety Engineering, Zhongnan University of Economics and Law, Wuhan 430073, China
Abstract  

[Objective] This study aims to improve entity and event extraction by exploiting the correlation between the two tasks. [Methods] Building on multi-task deep learning, we propose a joint entity and event extraction model (MDL-J3E) that consists of a shared layer, a private layer, and a decoding layer. The shared layer generates features common to both tasks. The private layer contains a named entity recognition module and an event detection module, each of which extracts task-specific features from the shared features. The decoding layer takes the features of each task and generates a tag sequence that satisfies the constraint rules. [Results] We evaluated the model on the ACE2005 dataset. It achieved F1 scores of 84.15% on named entity recognition and 70.96% on event detection. [Limitations] We did not evaluate the proposed model in other information extraction scenarios. [Conclusions] The multi-task model outperforms its single-task counterparts on both named entity recognition and event detection.
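The decoding step described in the abstract can be illustrated with a small constrained decoder. This is a minimal sketch, not the authors' implementation: it assumes the "constraint rules" are the usual BIO validity rules (an `I-X` tag may only follow `B-X` or `I-X`), it uses plain per-token scores instead of a trained CRF, and the function names and example scores are hypothetical.

```python
from typing import Dict, List

NEG_INF = float("-inf")


def allowed(prev: str, curr: str) -> bool:
    """Assumed BIO rule: I-X may only follow B-X or I-X of the same type."""
    if curr.startswith("I-"):
        return prev in ("B-" + curr[2:], "I-" + curr[2:])
    return True


def constrained_viterbi(emissions: List[Dict[str, float]]) -> List[str]:
    """Return the highest-scoring tag sequence that never breaks `allowed`."""
    tags = list(emissions[0])
    # An I- tag cannot open a sequence, so it starts with score -inf.
    score = {t: NEG_INF if t.startswith("I-") else emissions[0][t] for t in tags}
    backptrs = []
    for emit in emissions[1:]:
        new_score, ptr = {}, {}
        for curr in tags:
            # Best predecessor among the tags permitted before `curr`.
            best_prev = max(
                (p for p in tags if allowed(p, curr)),
                key=lambda p: score[p],
            )
            new_score[curr] = score[best_prev] + emit[curr]
            ptr[curr] = best_prev
        score = new_score
        backptrs.append(ptr)
    # Follow the back-pointers from the best final tag.
    path = [max(tags, key=lambda t: score[t])]
    for ptr in reversed(backptrs):
        path.append(ptr[path[-1]])
    return path[::-1]


# Hypothetical per-token scores; a greedy argmax would start with I-PER.
example = [
    {"O": 0.1, "B-PER": 0.2, "I-PER": 0.9},
    {"O": 0.1, "B-PER": 0.1, "I-PER": 0.8},
    {"O": 0.9, "B-PER": 0.05, "I-PER": 0.05},
]
decoded = constrained_viterbi(example)
print(decoded)
```

A per-token argmax over the same scores would produce a sequence that opens with `I-PER`, which is not a valid BIO sequence; the constrained search trades a little score for a well-formed output.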

Key words: Named Entity Recognition; Event Detection; Multi-task Learning; Deep Learning
Received: 31 August 2021      Published: 14 April 2022
ZTFLH:  TP393  
Fund: National Natural Science Foundation of China (71974202); National Natural Science Foundation of China (71790612)
Corresponding Author: Yu Chuanming, ORCID: 0000-0001-7099-0853, E-mail: yucm@zuel.edu.cn

Cite this article:

Yu Chuanming, Lin Hongjun, Zhang Zhengang. Joint Extraction Model for Entities and Events with Multi-task Deep Learning. Data Analysis and Knowledge Discovery, 2022, 6(2/3): 117-128.

URL:

https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.2096-3467.2021.0965     OR     https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2022/V6/I2/3/117

Figure: The Architecture of the Multi-task Deep Learning for Joint Extraction of Entity and Event
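| Hyperparameter group | Parameter | Value |
|---|---|---|
| Training settings | batch_size | 16 |
| Training settings | learning_rate | 1e-5 |
| Training settings | crf_lr_multiplier | 1000 |
| Training settings | optimizer | Adam |
| Shared layer module | L | 12 |
| Shared layer module | A | 12 |
| Named entity recognition submodule | N | 4 |
| Named entity recognition submodule | M | 3 |
| Named entity recognition submodule | dilation_rate | 1, 2, 4 |
| Event detection submodule | lstm_dim | 128 |
| Event detection submodule | dropout | 0.5 |
| Event detection submodule | kernel_size | 1, 2, 3 |

Table: Parameter Settings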
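The training settings can be written down as a plain configuration, which also makes the role of `crf_lr_multiplier` concrete. This is a sketch under an assumption the page does not confirm: that the multiplier scales the base learning rate for the CRF layer's parameters only. The dictionary layout and the helper name `effective_crf_lr` are hypothetical.

```python
# Values copied from the Parameter Settings table; the nesting is illustrative.
CONFIG = {
    "training": {
        "batch_size": 16,
        "learning_rate": 1e-5,
        "crf_lr_multiplier": 1000,
        "optimizer": "Adam",
    },
    "shared_layer": {"L": 12, "A": 12},
    "ner_submodule": {"N": 4, "M": 3, "dilation_rate": [1, 2, 4]},
    "ed_submodule": {"lstm_dim": 128, "dropout": 0.5, "kernel_size": [1, 2, 3]},
}


def effective_crf_lr(training: dict) -> float:
    """Assumed meaning: CRF parameters train at learning_rate * crf_lr_multiplier."""
    return training["learning_rate"] * training["crf_lr_multiplier"]


crf_lr = effective_crf_lr(CONFIG["training"])
print(f"base lr = {CONFIG['training']['learning_rate']:.0e}, CRF lr = {crf_lr:.0e}")
```

Under that reading, the CRF layer would train at roughly 1e-2 while the rest of the network trains at 1e-5, a common arrangement when a randomly initialised CRF sits on top of a pretrained encoder.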
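| Model | NER P/% | NER R/% | NER F1/% | ED P/% | ED R/% | ED F1/% |
|---|---|---|---|---|---|---|
| Layered-BiLSTM-CRF | 74.20 | 70.30 | 72.20 | - | - | - |
| GEANN | 77.10 | 73.30 | 75.20 | - | - | - |
| BiFlaG | 75.00 | 75.20 | 75.10 | - | - | - |
| Merge and Label [ELMO] | 79.70 | 78.00 | 78.90 | - | - | - |
| Merge and Label [BERT] | 82.70 | 82.10 | 82.40 | - | - | - |
| JOINTEVENTENTITY | - | - | - | 75.10 | 63.30 | 68.70 |
| DMCNN | - | - | - | 75.60 | 63.60 | 69.10 |
| FN-ANN | - | - | - | 79.50 | 60.70 | 68.80 |
| BDLSTM-TNNs | - | - | - | 75.30 | 63.40 | 68.90 |
| JRNN | - | - | - | 66.00 | 73.00 | 69.30 |
| TD-DMN | - | - | - | 65.80 | 65.90 | 65.60 |
| RNN_AL | - | - | - | 77.40 | 61.30 | 67.80 |
| GAIL | - | - | - | 74.20 | 65.30 | 69.50 |
| Conv-BiLSTM | - | - | - | 74.70 | 64.90 | 69.50 |
| ANN-Gold2 | - | - | - | 81.40 | 66.90 | 73.40 |
| HNN-EE | 84.00 | 82.50 | 83.20 | 74.40 | 67.30 | 70.60 |
| Single-task NER (MDL-J3E) | 83.86 | 84.10 | 83.98 | - | - | - |
| Single-task ED (MDL-J3E) | - | - | - | 66.67 | 74.25 | 70.25 |
| MDL-J3E | 83.48 | 84.83 | 84.15 | 69.16 | 72.85 | 70.96 |

NER: named entity recognition; ED: event detection.

Table: Comparison of Multi-task Learning Model Results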
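The conclusion that the multi-task model beats its single-task counterparts rests on simple differences between rows of the comparison table. The sketch below only redoes that arithmetic on the reported F1 values; the variable names are illustrative.

```python
# F1 scores (%) copied from the comparison table.
single_task_f1 = {"NER": 83.98, "ED": 70.25}   # single-task variants of MDL-J3E
multi_task_f1 = {"NER": 84.15, "ED": 70.96}    # full MDL-J3E
hnn_ee_f1 = {"NER": 83.20, "ED": 70.60}        # strongest joint baseline listed


def gains(ours: dict, baseline: dict) -> dict:
    """Absolute F1 difference in percentage points, rounded to two decimals."""
    return {task: round(ours[task] - baseline[task], 2) for task in ours}


gain_vs_single = gains(multi_task_f1, single_task_f1)
gain_vs_hnn_ee = gains(multi_task_f1, hnn_ee_f1)
print(gain_vs_single)
print(gain_vs_hnn_ee)
```

The gains over the single-task variants are 0.17 points on named entity recognition and 0.71 points on event detection, so the benefit of joint training is larger for event detection.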
| Model | NER P/% | NER R/% | NER F1/% | ED P/% | ED R/% | ED F1/% |
|---|---|---|---|---|---|---|
| CRF&LSTM-CRF | 83.52 | 84.56 | 84.04 | 69.04 | 71.93 | 70.46 |
| CRF&LSTM-ATT-CRF | 83.90 | 83.71 | 83.81 | 68.93 | 68.45 | 70.70 |
| CRF&RACNN-CRF | 82.43 | 84.89 | 83.64 | 69.41 | 70.53 | 69.97 |
| DGCNN(1)-CRF&RACNN-CRF | 83.83 | 83.94 | 83.89 | 72.93 | 67.52 | 70.12 |
| IDCNN-CRF&RACNN-CRF | 84.10 | 83.19 | 83.64 | 68.89 | 69.37 | 69.13 |
| LSTM-CRF&RACNN-CRF | 83.88 | 83.74 | 83.81 | 73.43 | 67.98 | 70.60 |
| IDGCNN-CRF&RACNN-CRF (independent) | 83.70 | 84.46 | 84.08 | 66.46 | 74.48 | 70.24 |
| IDGCNN-CRF&RACNN-CRF (MDL-J3E) | 83.48 | 84.83 | 84.15 | 69.16 | 72.85 | 70.96 |
| IDGCNN-CRF&RACNN-CRF (dropout=0.2) | 81.42 | 81.84 | 81.63 | 69.50 | 70.30 | 69.90 |
| IDGCNN-CRF&RACNN-CRF (dropout=0.5) | 83.52 | 84.07 | 83.80 | 69.66 | 71.93 | 70.78 |

NER: named entity recognition; ED: event detection.

Table: The Influence of Architecture on Multi-task Learning Model
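Since F1 is the harmonic mean of precision and recall, each reported F1 in the architecture table can be recomputed from its P and R columns. The sketch below does this for the event detection columns; the 0.05-point tolerance is an arbitrary allowance for rounding, and the helper names are illustrative.

```python
def f1(p: float, r: float) -> float:
    """Harmonic mean of precision and recall (same unit as the inputs)."""
    return 2 * p * r / (p + r) if p + r else 0.0


# (model, P/%, R/%, reported F1/%) for event detection, copied from the table.
ED_ROWS = [
    ("CRF&LSTM-CRF", 69.04, 71.93, 70.46),
    ("CRF&LSTM-ATT-CRF", 68.93, 68.45, 70.70),
    ("CRF&RACNN-CRF", 69.41, 70.53, 69.97),
    ("DGCNN(1)-CRF&RACNN-CRF", 72.93, 67.52, 70.12),
    ("IDCNN-CRF&RACNN-CRF", 68.89, 69.37, 69.13),
    ("LSTM-CRF&RACNN-CRF", 73.43, 67.98, 70.60),
    ("IDGCNN-CRF&RACNN-CRF (independent)", 66.46, 74.48, 70.24),
    ("IDGCNN-CRF&RACNN-CRF (MDL-J3E)", 69.16, 72.85, 70.96),
    ("IDGCNN-CRF&RACNN-CRF (dropout=0.2)", 69.50, 70.30, 69.90),
    ("IDGCNN-CRF&RACNN-CRF (dropout=0.5)", 69.66, 71.93, 70.78),
]


def inconsistent_rows(rows, tol: float = 0.05):
    """Names of rows whose reported F1 differs from 2PR/(P+R) by more than tol."""
    return [name for name, p, r, reported in rows if abs(f1(p, r) - reported) > tol]


flagged = inconsistent_rows(ED_ROWS)
print(flagged)
```

On the values as they appear on this page, the only row flagged is CRF&LSTM-ATT-CRF, whose reported event detection F1 (70.70) is higher than both its precision (68.93) and its recall (68.45). A harmonic mean cannot exceed its inputs, so one of the three figures in that row is probably a typesetting or transcription error; the table values are reproduced unchanged.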
| Loss function ratio | P/% | R/% | F1/% |
|---|---|---|---|
| 1:1 | 69.16 | 72.85 | 70.96 |
| 1:3 | 67.17 | 72.62 | 69.79 |
| 1:5 | 68.07 | 71.23 | 69.62 |
| 1:7 | 70.53 | 70.53 | 70.53 |
| 1:9 | 67.74 | 73.09 | 70.31 |
| 1:13 | 71.08 | 67.29 | 69.13 |

Table: The Influence of Hyperparameters in the Loss Function on the Model
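A loss ratio in a two-task model is usually applied as a weighted sum of the per-task losses. The sketch below assumes that form and assumes the ratio is written as NER weight to event detection weight; the page states neither, so both are labelled assumptions, and `joint_loss` is a hypothetical helper. The scores in the table match the event detection row of MDL-J3E at 1:1, so the table appears to report event detection.

```python
from typing import Tuple


def joint_loss(ner_loss: float, ed_loss: float,
               ratio: Tuple[float, float] = (1.0, 1.0)) -> float:
    """Weighted sum of task losses; ratio = (NER weight, ED weight) is an assumption."""
    w_ner, w_ed = ratio
    return w_ner * ner_loss + w_ed * ed_loss


# F1 (%) per loss ratio, copied from the table above.
F1_BY_RATIO = {
    "1:1": 70.96,
    "1:3": 69.79,
    "1:5": 69.62,
    "1:7": 70.53,
    "1:9": 70.31,
    "1:13": 69.13,
}

best_ratio = max(F1_BY_RATIO, key=F1_BY_RATIO.get)
print(best_ratio, F1_BY_RATIO[best_ratio])

# Example with made-up loss values: a 1:3 ratio triples the second task's weight.
print(joint_loss(0.5, 0.25, ratio=(1.0, 3.0)))
```

Among the ratios tested, equal weighting gives the highest F1, and the scores do not change monotonically as the ratio grows, which suggests that up-weighting one task does not reliably help on this dataset.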
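| Model | NER P/% | NER R/% | NER F1/% | ED P/% | ED R/% | ED F1/% |
|---|---|---|---|---|---|---|
| LSTM-CRF1-CRF2 | 83.12 | 84.43 | 83.77 | 69.20 | 69.84 | 69.52 |
| IDGCNN-CRF1-CRF2 | 84.16 | 84.10 | 84.13 | 68.55 | 73.32 | 70.85 |
| RACNN-CRF1-CRF2 | 82.79 | 84.66 | 83.71 | 70.00 | 69.84 | 69.92 |
| IDGCNN-CRF1&RACNN-CRF2 (MDL-J3E) | 83.48 | 84.83 | 84.15 | 69.16 | 72.85 | 70.96 |

NER: named entity recognition; ED: event detection.

Table: The Influence of Private Layer Shared Parameters on the Multi-task Learning Model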