[Objective] This paper proposes a Linguistic Knowledge-enhanced Self-Supervised Graph Convolutional Network (LKS-GCN) model, aiming to improve existing methods for event relation extraction. [Methods] First, we encoded the input texts with the BERT model and learned the syntactic relationships between words with a graph convolutional network to enhance the text representations. Then, we introduced a multi-head attention mechanism to distinguish different dependency features and used a segment-level max-pooling operation to extract structural information. Next, the pooled results of multiple segments were combined as the relation features of event pairs. We conducted adaptive clustering on these relation features and generated pseudo-labels as self-supervision signals. Finally, we optimized the event relation features through iterative self-supervised training. [Results] We evaluated the new model on the TACRED and FewRel datasets, where its B3-F1 scores were 2.1% and 1.2% higher than those of the best baseline methods, respectively. [Limitations] The model treats the syntactic dependency tree as an undirected graph and does not consider edge directions or dependency-edge labels. [Conclusions] The LKS-GCN model can effectively enhance text representations and provides a self-supervised learning framework for event relation extraction with limited labeled data.
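The graph-convolution step described above — aggregating BERT token representations over dependency edges treated as undirected, as noted in the Limitations — can be sketched as follows. This is a minimal illustrative sketch, not the paper's actual implementation: the function name `gcn_layer`, the toy dependency edges, and the random feature matrices are all assumptions for demonstration.

```python
import numpy as np

def gcn_layer(H, edges, W):
    """One graph-convolution layer over a dependency tree.

    H:     (n, d_in) token representations (e.g. BERT outputs).
    edges: list of (head, dependent) index pairs from a dependency parse;
           treated as undirected, matching the paper's stated setup.
    W:     (d_in, d_out) learnable weight matrix.
    """
    n = H.shape[0]
    A = np.eye(n)                        # self-loops keep each token's own features
    for i, j in edges:
        A[i, j] = A[j, i] = 1.0          # undirected dependency edges
    A_hat = A / A.sum(axis=1, keepdims=True)   # row-normalize: mean over neighbors
    return np.maximum(A_hat @ H @ W, 0.0)      # ReLU activation

# Toy example: 4 tokens with dependency edges (1->0), (1->2), (2->3).
rng = np.random.default_rng(0)
H = rng.normal(size=(4, 8))
W = rng.normal(size=(8, 16))
out = gcn_layer(H, [(1, 0), (1, 2), (2, 3)], W)
print(out.shape)  # (4, 16)
```

Stacking such layers lets each token's representation absorb information from syntactically related words several hops away, which is the linguistic-knowledge enhancement the model builds on before attention and segment-level pooling.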
[1]
(Zhuang Chuanzhi, Jin Xiaolong, Zhu Weijian, et al. Deep Learning Based Relation Extraction: A Survey[J]. Journal of Chinese Information Processing, 2019, 33(12): 1-18.)
[2]
Zhou G D, Su J, Zhang J, et al. Exploring Various Knowledge in Relation Extraction[C]// Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics. New York: ACM, 2005: 427-434.
[3]
Culotta A, McCallum A, Betz J. Integrating Probabilistic Extraction Models and Data Mining to Discover Relations and Patterns in Text[C]// Proceedings of the 2006 Main Conference on Human Language Technology Conference of the North American Chapter of the Association of Computational Linguistics. New York: ACM, 2006: 296-303.
[4]
Zelenko D, Aone C, Richardella A. Kernel Methods for Relation Extraction[J]. The Journal of Machine Learning Research, 2003, 3(6): 1083-1106.
[5]
Bunescu R C, Mooney R J. A Shortest Path Dependency Kernel for Relation Extraction[C]// Proceedings of the 2005 Conference on Human Language Technology and Empirical Methods in Natural Language Processing. New York: ACM, 2005: 724-731.
[6]
(Wan Jing, Li Haoming, Yan Huanchun, et al. Relation Extraction Based on Recurrent Convolutional Neural Network[J]. Application Research of Computers, 2020, 37(3): 699-703.)
[7]
Zeng D, Liu K, Lai S, et al. Relation Classification via Convolutional Deep Neural Network[C]// Proceedings of the 25th International Conference on Computational Linguistics. New York: ACM, 2014: 2335-2344.
[8]
Socher R, Huval B, Manning C D, et al. Semantic Compositionality Through Recursive Matrix-Vector Spaces[C]// Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning. New York: ACM, 2012: 1201-1211.
[9]
Zhang Y H, Qi P, Manning C D. Graph Convolution over Pruned Dependency Trees Improves Relation Extraction[C]// Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. 2018: 2205-2215.
[10]
Han X, Zhu H, Yu P F, et al. FewRel: A Large-Scale Supervised Few-Shot Relation Classification Dataset with State-of-the-Art Evaluation[C]// Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. 2018: 4803-4809.
[11]
Zhang Y H, Zhong V, Chen D Q, et al. Position-Aware Attention and Supervised Data Improve Slot Filling[C]// Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing. 2017: 35-45.
[12]
Xiao M, Liu C. Semantic Relation Classification via Hierarchical Recurrent Neural Network with Attention[C]// Proceedings of the 26th International Conference on Computational Linguistics. New York: ACM, 2016: 1254-1263.
[13]
Miwa M, Bansal M. End-to-End Relation Extraction Using LSTMs on Sequences and Tree Structures[C]// Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics. 2016: 1105-1116.
[14]
Zhou P, Shi W, Tian J, et al. Attention-Based Bidirectional Long Short-Term Memory Networks for Relation Classification[C]// Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics. 2016: 207-212.
[15]
张宁. 面向特定领域的命名实体识别技术研究[D]. 杭州: 浙江大学, 2018.
(Zhang Ning. Researches on Domain-Specific Named Entity Recognition[D]. Hangzhou: Zhejiang University, 2018.)
[16]
Mintz M, Bills S, Snow R, et al. Distant Supervision for Relation Extraction Without Labeled Data[C]// Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP. 2009: 1003-1011.
[17]
Bollacker K, Evans C, Paritosh P, et al. Freebase: A Collaboratively Created Graph Database for Structuring Human Knowledge[C]// Proceedings of the 2008 ACM SIGMOD International Conference on Management of Data. New York: ACM, 2008: 1247-1250.
[18]
(Yang Suizhu, Liu Yanxia, Zhang Kaiwen, et al. Survey on Distantly-Supervised Relation Extraction[J]. Chinese Journal of Computers, 2021, 44(8): 1636-1660.)
[19]
Zeng D J, Liu K, Chen Y B, et al. Distant Supervision for Relation Extraction via Piecewise Convolutional Neural Networks[C]// Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing. 2015: 1753-1762.
[20]
Jiang X T, Wang Q, Li P, et al. Relation Extraction with Multi-Instance Multi-Label Convolutional Neural Networks[C]// Proceedings of the 26th International Conference on Computational Linguistics. New York: ACM, 2016: 1471-1480.
[21]
Lin Y K, Shen S Q, Liu Z Y, et al. Neural Relation Extraction with Selective Attention over Instances[C]// Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics. 2016: 2124-2133.
[22]
Beltagy I, Lo K, Ammar W. Combining Distant and Direct Supervision for Neural Relation Extraction[C]// Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2019). 2019: 1858-1867.
[23]
Elsahar H, Demidova E, Gottschalk S, et al. Unsupervised Open Relation Extraction[C]// Proceedings of the 2017 European Semantic Web Conference. Springer, 2017: 12-16.
[24]
Zhang M, Su J, Wang D M, et al. Discovering Relations Between Named Entities from a Large Raw Corpus Using Tree Similarity-Based Clustering[C]// Proceedings of the 2nd International Joint Conference on Natural Language Processing. Springer, 2005: 378-389.
[25]
Jing L L, Tian Y L. Self-Supervised Visual Feature Learning with Deep Neural Networks: A Survey[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2021, 43(11): 4037-4058.
doi: 10.1109/TPAMI.2020.2992393
[26]
Wang H, Wang X, Xiong W H, et al. Self-Supervised Learning for Contextualized Extractive Summarization[C]// Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. 2019: 2221-2227.
[27]
Hu X M, Wen L J, Xu Y S, et al. SelfORE: Self-Supervised Relational Feature Learning for Open Relation Extraction[C]// Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing. 2020: 3673-3682.
[28]
Devlin J, Chang M W, Lee K, et al. BERT: Pre-Training of Deep Bidirectional Transformers for Language Understanding[C]// Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 2019: 4171-4186.
[29]
Goyal P, Mahajan D, Gupta A, et al. Scaling and Benchmarking Self-Supervised Visual Representation Learning[C]// Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision. 2019: 6391-6400.
[30]
Sachan D, Zhang Y H, Qi P, et al. Do Syntax Trees Help Pre-Trained Transformers Extract Information?[C]// Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics. 2021: 2647-2661.
[31]
Kipf T N, Welling M. Semi-Supervised Classification with Graph Convolutional Networks[OL]. arXiv Preprint, arXiv: 1609.02907.
[32]
Jin H L, Hou L, Li J Z, et al. Fine-Grained Entity Typing via Hierarchical Multi Graph Convolutional Networks[C]// Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing. 2019: 4969-4978.
[33]
(Wu Ting, Kong Fang. Document-Level Relation Extraction Based on Graph Attention Convolutional Neural Network[J]. Journal of Chinese Information Processing, 2021, 35(10): 73-80.)
[34]
Mandya A, Bollegala D, Coenen F. Graph Convolution over Multiple Dependency Sub-Graphs for Relation Extraction[C]// Proceedings of the 28th International Conference on Computational Linguistics. 2020: 6424-6435.
[35]
Chen G M, Tian Y H, Song Y. Joint Aspect Extraction and Sentiment Analysis with Directional Graph Convolutional Networks[C]// Proceedings of the 28th International Conference on Computational Linguistics. 2020: 272-279.
[36]
(Wang Guang, Li Hongyu, Qiu Yunfei, et al. Aspect-Based Sentiment Classification via Memory Graph Convolutional Network[J]. Journal of Chinese Information Processing, 2021, 35(8): 98-106.)
[37]
(Ren Qiutong, Wang Hao, Xiong Xin, et al. Extracting Drama Terms with GCN Long-Distance Constrain[J]. Data Analysis and Knowledge Discovery, 2021, 5(12): 123-136.)
[38]
Xie J Y, Girshick R, Farhadi A. Unsupervised Deep Embedding for Clustering Analysis[C]// Proceedings of the 33rd International Conference on Machine Learning. New York: ACM, 2016: 478-487.
[39]
Bagga A, Baldwin B. Algorithms for Scoring Coreference Chains[C]// Proceedings of the 1st International Conference on Language Resources and Evaluation Workshop on Linguistic Coreference. 1998: 563-566.
[40]
Rosenberg A, Hirschberg J. V-Measure: A Conditional Entropy-Based External Cluster Evaluation Measure[C]// Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning. 2007: 410-420.
[41]
Hubert L, Arabie P. Comparing Partitions[J]. Journal of Classification, 1985, 2(1): 193-218.
doi: 10.1007/BF01908075
[42]
Marcheggiani D, Titov I. Discrete-State Variational Autoencoders for Joint Discovery and Factorization of Relations[J]. Transactions of the Association for Computational Linguistics, 2016, 4: 231-244.
doi: 10.1162/tacl_a_00095
[43]
Tran T T, Le P, Ananiadou S. Revisiting Unsupervised Relation Extraction[C]// Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. 2020: 7498-7505.
[44]
Zhao J, Gui T, Zhang Q, et al. A Relation-Oriented Clustering Method for Open Relation Extraction[C]// Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing. 2021: 9707-9718.