Data Analysis and Knowledge Discovery  2021, Vol. 5 Issue (8): 45-53    DOI: 10.11925/infotech.2096-3467.2020.1302
Continual Learning for One-to-many Entity Relationship Generation with Small Samples
Jiang Yaren, Le Xiaoqiu
National Science Library, Chinese Academy of Sciences, Beijing 100190, China
Department of Library, Information and Archives Management, School of Economics and Management, University of Chinese Academy of Sciences, Beijing 100190, China
Abstract  

[Objective] This paper aims to recognize one-to-many entity relationship instances (such as inclusion and coordination relations) in sentences using a small number of samples, and to support continual learning as new data arrives. [Methods] First, we generated one-to-many inclusion and coordination entity relationship instances from sentences using LaserTagger. Then, position embedding and a weighted loss helped the model capture more features from limited data. Finally, the model achieved continual learning through model compression and expansion. [Results] Our approach’s SARI was 1% higher than that of the baseline models in all tests. Model compression and expansion effectively retained the knowledge learned from previous data, with SARI up to 16.92% higher than that of the baseline models. [Limitations] More research is needed to examine the proposed method on more complex data sets. [Conclusions] The proposed method can effectively identify entity relationships of different categories from a small amount of training data.

Key words: Entity Relation; Text Generation; Small Samples; Continual Learning
Received: 28 December 2020      Published: 15 September 2021
CLC Number (Chinese Library Classification): TP391
Corresponding Author: Le Xiaoqiu     ORCID: 0000-0002-7114-5544     E-mail: lexq@mail.las.ac.cn

Cite this article:

Jiang Yaren, Le Xiaoqiu. Continual Learning for One-to-many Entity Relationship Generation with Small Samples. Data Analysis and Knowledge Discovery, 2021, 5(8): 45-53.

URL:

http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.2096-3467.2020.1302     OR     http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2021/V5/I8/45

[Figure: The Model Framework]
[Figure: The Compression and Expansion of the Model]

| Category | Core sentence pattern | Example sentence | One-to-many entity relationship instance |
|---|---|---|---|
| 1 | E1 include E2, E3 | Fuzzy-logic system includes of a fuzzifier, membership functions, fuzzy rule base, inference engine and defuzzifier. | Fuzzy-logic system (fuzzifier; membership functions; fuzzy rule base; inference engine; defuzzifier) |
| 2 | E1 consist of E2, E3 | In general, soft computing methods consist of three essential paradigms: neural networks, fuzzy logic, and GAs | soft computing methods (neural networks; fuzzy logic; GAs) |
| 3 | E1 be composed of E2, E3 | The ADM model is composed of an Eulerian model, a Lagrangian particle dispersion model, and a probabilistic Lagrangian puff model. | The ADM model (an Eulerian model; a Lagrangian particle dispersion model; a probabilistic Lagrangian puff model) |
| 4 | E1 be comprised of E2, E3 | The WPD is comprised of low-pass filter and high-pass filter. | The WPD (low-pass filter; high-pass filter) |
| 5 | E1 such as E2, E3 | He used several classifiers such as SVM, NB, DT. | classifier (SVM; NB; DT) |

Example of 5 Categories of Experimental Data
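The instance format in the last column of the table, a head entity followed by semicolon-separated tail entities in parentheses, can be split mechanically. The following is a minimal sketch, not the authors' code; the helper name `parse_instance` and the regular expression are assumptions introduced for illustration.

```python
import re
from typing import List, Tuple

# Assumed layout, taken from the table: "head (tail; tail; ...)".
# The space before "(" is optional and stray spaces before ")" are tolerated.
INSTANCE_RE = re.compile(r"^\s*(?P<head>[^()]+?)\s*\((?P<tails>[^()]*)\)\s*$")


def parse_instance(text: str) -> Tuple[str, List[str]]:
    """Split 'E1 (E2; E3)' into the head entity and its list of tail entities."""
    match = INSTANCE_RE.match(text)
    if match is None:
        raise ValueError(f"not a one-to-many instance: {text!r}")
    # Tail entities are separated by semicolons; drop empty fragments.
    tails = [t.strip() for t in match.group("tails").split(";") if t.strip()]
    return match.group("head"), tails


head, tails = parse_instance("The WPD (low-pass filter; high-pass filter)")
print(head, tails)
```

A parser like this only checks the surface format. It says nothing about whether the generated head and tails are correct, which is what SARI measures against the manual annotation.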
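
| Parameter | Value |
|---|---|
| Batch size | 4 |
| Maximum input text length (max_seq_length) | 128 |
| Learning rate (lr) | 3e-5 |
| Dropout | 0.1 |
| Number of hidden layer units | 768 |
| Word embedding dimension | 512 |

Model Parameters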
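For reuse in an experiment script, the parameter values listed above could be collected in one immutable configuration object. This is only a sketch: apart from `max_seq_length`, which appears in the table, the class and field names are hypothetical and do not come from the paper or from LaserTagger.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelConfig:
    """Values copied from the Model Parameters table; field names are illustrative."""

    batch_size: int = 4
    max_seq_length: int = 128      # maximum input text length
    learning_rate: float = 3e-5    # "lr" in the table
    dropout: float = 0.1
    hidden_size: int = 768         # number of hidden layer units
    embedding_dim: int = 512       # word embedding dimension


config = ModelConfig()
print(config)
```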

| Item | Configuration |
|---|---|
| Operating system | Ubuntu 16.04.12 |
| GPU | GeForce RTX 2080 Ti |
| Memory | 64 GB |
| Python | Python 3.6.10 |
| TensorFlow | TensorFlow-gpu 1.15.0 |

Environment Configuration
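
| Model | Category 1 SARI | Category 2 SARI | Category 3 SARI | Category 4 SARI | Category 5 SARI |
|---|---|---|---|---|---|
| BERT + Transformer | 38.39% | 38.50% | 20.13% | 23.16% | 15.49% |
| LaserTagger | 86.03% | 87.35% | 77.36% | 78.99% | 69.05% |
| LaserTagger + weighted loss | 86.51% | 89.09% | 78.21% | 80.66% | 69.79% |
| Our model (LaserTagger + weighted loss + position embedding) | 87.77% | 89.88% | 78.91% | 81.96% | 70.09% |

Model Performance on Small Samples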
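This page does not say how the loss is weighted. One common reading, assumed here purely for illustration, is a per-tag weighted cross-entropy over the edit tags that a tagging model predicts for each token, where rarer tags receive larger weights. The function name, the weighting scheme, and the normalization by the sum of weights are all assumptions, not the authors' formulation.

```python
import math
from typing import Sequence


def weighted_cross_entropy(
    probs: Sequence[Sequence[float]],
    targets: Sequence[int],
    class_weights: Sequence[float],
) -> float:
    """Weighted mean negative log-likelihood over a sequence of token tags.

    probs[i][k] is the predicted probability of tag k for token i,
    targets[i] is the gold tag index, and class_weights[k] is the
    (assumed) weight of tag k.
    """
    total = 0.0
    weight_sum = 0.0
    for token_probs, tag in zip(probs, targets):
        weight = class_weights[tag]
        # Clamp to avoid log(0) when a gold tag gets zero probability.
        total += -weight * math.log(max(token_probs[tag], 1e-12))
        weight_sum += weight
    return total / weight_sum


# Two tokens, two tags; the second tag is weighted three times as heavily.
loss = weighted_cross_entropy([[0.5, 0.5], [0.25, 0.75]], [0, 1], [1.0, 3.0])
print(loss)
```

With equal weights this reduces to the ordinary mean cross-entropy, so the weights only change how much each tag's errors contribute.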
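
| Model | Category 1 (SARI) | Category 2 (SARI) | Category 3 (SARI) | Category 4 (SARI) | Category 5 (SARI) |
|---|---|---|---|---|---|
| Our model, without expansion | 72.29% | 72.91% | 64.84% | 63.27% | 74.48% |
| Our model, with expansion | 84.04% | 84.24% | 73.22% | 80.19% | 71.67% |
| Δ | 11.75% | 11.33% | 8.38% | 16.92% | -2.81% |

Continual Learning Performance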
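The Δ row is the per-category difference between the SARI scores with and without model expansion. A short check of that arithmetic, using the values from the table, might look like the sketch below; the variable and function names are illustrative, and `Decimal` is used only to avoid floating-point rounding noise.

```python
from decimal import Decimal
from typing import List, Sequence

# SARI scores (in %) for categories 1 to 5, copied from the table above.
without_expansion = ["72.29", "72.91", "64.84", "63.27", "74.48"]
with_expansion = ["84.04", "84.24", "73.22", "80.19", "71.67"]


def sari_deltas(expanded: Sequence[str], baseline: Sequence[str]) -> List[Decimal]:
    """Per-category gain of the expanded model over the non-expanded one."""
    return [Decimal(e) - Decimal(b) for e, b in zip(expanded, baseline)]


deltas = sari_deltas(with_expansion, without_expansion)
for category, delta in enumerate(deltas, start=1):
    print(f"Category {category}: {delta:+}")
```

The largest gain is in Category 4, and Category 5, the last category in the table, is the only one where the non-expanded model scores higher.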

| Corpus/Method | Instance |
|---|---|
| One-to-many entity relationship example sentence | The sludge train included a gravity thickener, an aerobic digester and belt filter presses, with additional power consumptions of 8 and 11.2 kW respectively. |
| Manual annotation | The sludge train (gravity thickener; aerobic digester; belt filter presses) |
| Our model, with expansion | The sludge train (gravity thickener; aerobic digester; belt filter presses) |
| Our model, without expansion | belt filter presses 8; 11(2 kW respectively ) |

The Generation Example of One-to-Many Entity Relation Instance