Data Analysis and Knowledge Discovery, 2023, Vol. 7, Issue (5): 92-104     https://doi.org/10.11925/infotech.2096-3467.2022.0602
Research Paper
Linguistic Knowledge-Enhanced Self-Supervised Graph Convolutional Network for Event Relation Extraction*
Xu Kang, Yu Shengnan, Chen Lei, Wang Chuandong
School of Computer Science, Software and Network Security, Nanjing University of Posts and Telecommunications, Nanjing 210003, China
Abstract

[Objective] This paper addresses the difficulty existing event relation extraction methods have in capturing structured event knowledge, caused by the lack of large-scale, high-quality labeled data and by the complex linguistic patterns in which event relations are expressed. [Methods] We propose a Linguistic Knowledge-enhanced Self-Supervised Graph Convolutional Network (LKS-GCN). A pre-trained BERT model encodes the input texts, and a graph convolutional network learns the syntactic dependencies between words to enhance the text representations. A multi-head attention mechanism distinguishes different dependency features, and a segment-level max pooling operation extracts structural information; the pooled results of multiple segments are combined as the relation features of an event pair. The relation features are adaptively clustered to generate pseudo-labels, which serve as self-supervision, and the features are then optimized through iterative self-supervised training. [Results] On the TACRED and FewRel datasets, the model improves B3-F1 over the best baseline by 2.1 and 1.2 percentage points, respectively. [Limitations] The model treats the syntactic dependency tree as an undirected graph, ignoring edge directions and dependency-edge labels. [Conclusions] LKS-GCN effectively enhances text representation and provides a self-supervised learning framework for event relation extraction with scarce labeled data.
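The encoding pipeline described above (BERT token features refined by graph convolutions over the sentence's dependency graph) can be outlined as a minimal sketch. The class and variable names below are illustrative, not the authors' released code; it assumes PyTorch and the Hugging Face transformers library, with the layer sizes taken from Table 1:

```python
import torch
import torch.nn as nn
from transformers import BertModel

class DependencyGCNEncoder(nn.Module):
    """Sketch: BERT features propagated over a dependency-tree adjacency."""
    def __init__(self, hidden_dim=300, num_layers=3):
        super().__init__()
        self.bert = BertModel.from_pretrained("bert-base-uncased")
        dims = [self.bert.config.hidden_size] + [hidden_dim] * num_layers
        self.gcn_layers = nn.ModuleList(
            nn.Linear(dims[i], dims[i + 1]) for i in range(num_layers)
        )

    def forward(self, input_ids, attention_mask, adj):
        # adj: (batch, seq, seq) 0/1 adjacency from the dependency tree,
        # treated as an undirected graph with self-loops (as the paper's
        # limitations section notes, edge direction and labels are ignored).
        h = self.bert(input_ids, attention_mask=attention_mask).last_hidden_state
        deg = adj.sum(dim=-1, keepdim=True).clamp(min=1)  # degree normalization
        for layer in self.gcn_layers:
            h = torch.relu(layer(adj @ h / deg))  # aggregate neighbors, transform
        return h
```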

Keywords: Event Relation Extraction; BERT; Self-Supervised Model; Graph Convolutional Network; Multi-Head Attention
Received: 2022-06-12      Online: 2022-11-09
Chinese Library Classification: TP391
Funding: *National Natural Science Foundation of China (62202240, 61872190); Talent Introduction Project of Nanjing University of Posts and Telecommunications (NY218118, NY219104)
Corresponding author: Chen Lei, ORCID: 0000-0002-6071-8888, E-mail: chenlei@njupt.edu.cn.
Cite this article:
Xu Kang, Yu Shengnan, Chen Lei, Wang Chuandong. Linguistic Knowledge-Enhanced Self-Supervised Graph Convolutional Network for Event Relation Extraction. Data Analysis and Knowledge Discovery, 2023, 7(5): 92-104.
Link to this article:
https://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/10.11925/infotech.2096-3467.2022.0602      or      https://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/Y2023/V7/I5/92
Fig. 1  Architecture of the LKS-GCN model
Fig. 2  An example dependency parse tree
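A dependency parse like the one in Fig. 2 can be turned into the adjacency matrix the GCN consumes. The sketch below uses spaCy purely for illustration (any parser exposing head indices would do, and it assumes the en_core_web_sm model is installed):

```python
import numpy as np
import spacy

nlp = spacy.load("en_core_web_sm")  # assumption: this pipeline is available

def dependency_adjacency(sentence: str) -> np.ndarray:
    """Undirected 0/1 adjacency over tokens, with self-loops."""
    doc = nlp(sentence)
    n = len(doc)
    adj = np.eye(n, dtype=np.float32)      # self-loops
    for token in doc:
        if token.i != token.head.i:        # skip the root's self-reference
            adj[token.i, token.head.i] = 1.0
            adj[token.head.i, token.i] = 1.0  # undirected, labels dropped
    return adj
```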
Parameter             Description                          Value
emb_dim               word embedding dimension             768
max_length            maximum text length                  128
pos_dim               part-of-speech embedding dimension   30
hidden layer units    number of hidden-layer units         300
layer of gcn          number of GCN layers                 3
number of head        number of attention heads            2
dropout               dropout rate                         0.5
activation function   activation function                  ReLU
optimizer             optimizer                            Adam
learning_rate         learning rate                        0.0001
batch_size            training batch size                  100
epoch                 number of training epochs            10
Table 1  Hyperparameter settings
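The iterative self-supervised loop from the abstract (cluster relation features, use cluster IDs as pseudo-labels, retrain, repeat) can be sketched as below. Plain KMeans stands in for the paper's adaptive clustering step, and the classifier head and sizes are hypothetical, chosen to match Table 1's settings:

```python
import torch
import torch.nn as nn
from sklearn.cluster import KMeans

def self_training_epoch(head, feats, num_classes, optimizer):
    """One round: cluster features -> pseudo-labels -> supervised update."""
    # 1) Pseudo-labels from clustering the current relation features
    #    (KMeans is a stand-in for the adaptive clustering in the paper).
    pseudo = KMeans(n_clusters=num_classes, n_init=10).fit_predict(
        feats.detach().cpu().numpy()
    )
    targets = torch.as_tensor(pseudo, dtype=torch.long)
    # 2) Treat cluster IDs as self-supervision for a classification head.
    loss = nn.functional.cross_entropy(head(feats), targets)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# Illustrative usage: 300-d relation features, 10 pseudo-classes,
# Adam with lr = 0.0001 and 10 epochs as in Table 1.
feats = torch.randn(100, 300)
head = nn.Linear(300, 10)
opt = torch.optim.Adam(head.parameters(), lr=1e-4)
for epoch in range(10):
    self_training_epoch(head, feats, num_classes=10, optimizer=opt)
```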
Dataset   Model      B3                        V-measure                 ARI
                     Prec.    Rec.     F1      Hom.     Comp.    F1
TACRED    VAE        0.247    0.564    0.343   0.208    0.362    0.264   0.159
          RW-HAC     0.426    0.633    0.509   0.469    0.597    0.526   0.281
          EType+     0.302    0.803    0.439   0.260    0.607    0.364   0.143
          SelfORE    0.576    0.510    0.541   0.630    0.608    0.619   0.447
          Ours       0.526    0.602    0.562   0.619    0.660    0.639   0.419
Table 2  Results of different models on the TACRED dataset
Dataset   Model      B3                        V-measure                 ARI
                     Prec.    Rec.     F1      Hom.     Comp.    F1
FewRel    VAE        0.309    0.446    0.365   0.448    0.500    0.473   0.291
          RW-HAC     0.256    0.492    0.337   0.391    0.485    0.433   0.250
          EType+     0.238    0.485    0.319   0.364    0.463    0.408   0.249
          SelfORE    0.672    0.685    0.678   0.779    0.788    0.783   0.647
          Ours       0.677    0.705    0.690   0.778    0.803    0.790   0.613
Table 3  Results of different models on the FewRel dataset
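Tables 2 and 3 report B-cubed, V-measure, and ARI over the induced relation clusters. The latter two are available in scikit-learn; B-cubed precision and recall follow the standard per-item definition and can be computed directly from gold and predicted cluster labels, as in this sketch (not the authors' evaluation script):

```python
import numpy as np
from sklearn.metrics import v_measure_score, adjusted_rand_score

def b_cubed(gold, pred):
    """Standard B-cubed precision/recall/F1 over cluster label arrays."""
    gold, pred = np.asarray(gold), np.asarray(pred)
    prec, rec = [], []
    for i in range(len(gold)):
        same_pred = pred == pred[i]          # items sharing i's predicted cluster
        same_gold = gold == gold[i]          # items sharing i's gold class
        both = np.logical_and(same_pred, same_gold).sum()
        prec.append(both / same_pred.sum())
        rec.append(both / same_gold.sum())
    p, r = np.mean(prec), np.mean(rec)
    return p, r, 2 * p * r / (p + r)

gold = [0, 0, 1, 1, 2]
pred = [1, 1, 0, 0, 0]
print(b_cubed(gold, pred))                   # B3 precision, recall, F1
print(v_measure_score(gold, pred), adjusted_rand_score(gold, pred))
```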
Fig. 3  Results with different numbers of GCN layers
Dataset   Model                         B3                        V-measure                 ARI
                                        Prec.    Rec.     F1      Hom.     Comp.    F1
TACRED    LKS-GCN                       0.526    0.602    0.562   0.619    0.660    0.639   0.419
          (w/o) Multi-Head Attention    0.508    0.586    0.544   0.602    0.649    0.624   0.406
          (w/o) Piecewise max pooling   0.523    0.597    0.557   0.611    0.656    0.633   0.414
FewRel    LKS-GCN                       0.677    0.705    0.690   0.778    0.803    0.790   0.613
          (w/o) Multi-Head Attention    0.664    0.683    0.673   0.765    0.779    0.772   0.595
          (w/o) Piecewise max pooling   0.670    0.702    0.686   0.775    0.800    0.787   0.609
Table 4  Ablation study
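The "piecewise max pooling" ablated in Table 4 splits the token representations around the two event mentions and max-pools each segment separately before concatenating, so the relative position structure survives pooling. A minimal sketch, with illustrative segment boundaries:

```python
import torch

def piecewise_max_pool(h, e1_end, e2_end):
    """h: (seq, dim) token features; e1_end/e2_end: indices after each mention."""
    segments = [h[:e1_end], h[e1_end:e2_end], h[e2_end:]]
    pooled = [seg.max(dim=0).values for seg in segments if seg.numel() > 0]
    return torch.cat(pooled)  # up to 3*dim relation feature for the event pair

h = torch.randn(12, 300)
feat = piecewise_max_pool(h, e1_end=4, e2_end=9)  # -> shape (900,)
```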
Number of heads   TACRED B3-F1   FewRel B3-F1
1                 0.556          0.683
2                 0.562          0.690
4                 0.558          0.685
Table 5  Effect of the number of attention heads on B3-F1
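Table 5 indicates that two heads work best on both datasets. Applying multi-head self-attention over the GCN outputs to weight different dependency features can be sketched with PyTorch's built-in module (the paper's own attention layer may be implemented differently):

```python
import torch
import torch.nn as nn

attn = nn.MultiheadAttention(embed_dim=300, num_heads=2, batch_first=True)
h = torch.randn(8, 128, 300)    # (batch, seq, dim) GCN outputs
out, weights = attn(h, h, h)    # self-attention over tokens
print(out.shape, weights.shape) # (8, 128, 300) and (8, 128, 128)
```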
Fig. 4  Visualization of attention weights on a TACRED text