Data Analysis and Knowledge Discovery  2021, Vol. 5 Issue (8): 45-53     https://doi.org/10.11925/infotech.2096-3467.2020.1302
Research Paper
Continual Learning for One-to-many Entity Relationship Generation with Small Samples
Jiang Yaren, Le Xiaoqiu
National Science Library, Chinese Academy of Sciences, Beijing 100190, China
Department of Library, Information and Archives Management, School of Economics and Management, University of Chinese Academy of Sciences, Beijing 100190, China
Abstract

[Objective] This paper aims to identify one-to-many entity relationship instances (such as inclusion and coordination relations) in sentences from a small number of samples, and to maintain recognition performance after new categories are added, thereby achieving continual learning. [Methods] Building on the LaserTagger model, we use text generation to identify the entities that stand in inclusion and coordination relations within a sentence. Position feature encoding and a weighted loss strengthen the model's feature learning when samples are scarce, and model compression and expansion enable continual learning across multiple categories. [Results] With a small number of training samples, the proposed method improves SARI over the baseline models by more than 1 percentage point on all five categories. When multiple categories are learned sequentially, model compression and expansion retain previously learned knowledge well, raising SARI by up to 16.92 percentage points over the same model without expansion. [Limitations] The experiments cover only five sentence-pattern categories of inclusion and coordination relations. The categories are few and the sentence structures relatively simple, and performance on more categories and more complex sentences has not yet been examined. [Conclusions] The proposed method can, to a certain extent, serve application scenarios with few samples and sequential learning of multiple categories, and shows certain advantages.

Key words: Entity Relation    Text Generation    Small Samples    Continual Learning
Received: 2020-12-28      Published: 2021-09-15
CLC number:  TP391
Corresponding author: Le Xiaoqiu     ORCID: 0000-0002-7114-5544     E-mail: lexq@mail.las.ac.cn
Cite this article:
Jiang Yaren, Le Xiaoqiu. Continual Learning for One-to-many Entity Relationship Generation with Small Samples[J]. Data Analysis and Knowledge Discovery, 2021, 5(8): 45-53.
Link to this article:
http://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/10.11925/infotech.2096-3467.2020.1302      or      http://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/Y2021/V5/I8/45
Fig.1  Model framework
Fig.2  Schematic of model compression and expansion
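The abstract names a weighted loss as one of the additions to LaserTagger but this page does not give its formula. The following is only a minimal sketch of one common form, a per-tag weighted cross-entropy over a tag sequence; the function name `weighted_cross_entropy`, the weight values, and the use of KEEP/DELETE as illustrative tag names are assumptions, not the authors' implementation.

```python
import math

def weighted_cross_entropy(probs, gold, weights):
    """Weighted mean negative log-likelihood over a tag sequence.

    probs:   list of per-token dicts mapping tag -> predicted probability
    gold:    list of gold tags, one per token
    weights: dict mapping tag -> weight (hypothetical values;
             tags not listed default to weight 1.0)
    """
    total, norm = 0.0, 0.0
    for token_probs, tag in zip(probs, gold):
        w = weights.get(tag, 1.0)
        # Clamp the probability to avoid log(0).
        total += -w * math.log(max(token_probs[tag], 1e-12))
        norm += w
    return total / norm if norm else 0.0

# Toy two-token example with made-up probabilities.
probs = [{"KEEP": 0.9, "DELETE": 0.1}, {"KEEP": 0.4, "DELETE": 0.6}]
gold = ["KEEP", "DELETE"]
loss = weighted_cross_entropy(probs, gold, {"KEEP": 2.0, "DELETE": 1.0})
plain = weighted_cross_entropy(probs, gold, {})  # all weights 1.0
```

Raising the weight of a tag makes errors on that tag count more in the average, which is one way a model can be pushed to attend to rare but important tags when training data is small.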
Category | Core sentence pattern | Example sentence | One-to-many entity relationship instance
1 | E1 include E2, E3 | Fuzzy-logic system includes of a fuzzifier, membership functions, fuzzy rule base, inference engine and defuzzifier. | Fuzzy-logic system (fuzzifier; membership functions; fuzzy rule base; inference engine; defuzzifier)
2 | E1 consist of E2, E3 | In general, soft computing methods consist of three essential paradigms: neural networks, fuzzy logic, and GAs | soft computing methods (neural networks; fuzzy logic; GAs)
3 | E1 be composed of E2, E3 | The ADM model is composed of an Eulerian model, a Lagrangian particle dispersion model, and a probabilistic Lagrangian puff model. | The ADM model(an Eulerian model; a Lagrangian particle dispersion model; a probabilistic Lagrangian puff model )
4 | E1 be comprised of E2, E3 | The WPD is comprised of low-pass filter and high-pass filter. | The WPD(low-pass filter; high-pass filter)
5 | E1 such as E2, E3 | He used several classifiers such as SVM, NB, DT. | classifier(SVM; NB; DT)
Table 1  Example data for the five categories
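The instances in Table 1 share the shape `E1 (E2; E3; ...)`. As a rough illustration of how such a generated string could be split back into a head entity and its members, here is a small sketch; `parse_instance` is a hypothetical helper, and the assumption that members are always separated by semicolons inside one pair of parentheses comes only from the examples shown.

```python
import re

def parse_instance(text):
    """Split 'E1 (E2; E3; ...)' into (head, [members]).

    Returns None when the text does not have the 'head (body)' shape.
    Whitespace around the head, the parentheses, and each member is ignored.
    """
    match = re.fullmatch(r"\s*(.+?)\s*\(\s*(.*?)\s*\)\s*", text)
    if match is None:
        return None
    head, body = match.groups()
    members = [part.strip() for part in body.split(";") if part.strip()]
    return head, members

# Instances for categories 4 and 5, copied from Table 1.
wpd = parse_instance("The WPD(low-pass filter; high-pass filter)")
clf = parse_instance("classifier(SVM; NB; DT)")
```

Here `wpd` is `("The WPD", ["low-pass filter", "high-pass filter"])` and `clf` is `("classifier", ["SVM", "NB", "DT"])`.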
Parameter | Value
Batch size | 4
Maximum input text length (max_seq_length) | 128
Learning rate (lr) | 3e-5
Dropout | 0.1
Hidden layer size | 768
Word embedding dimension | 512
Table 2  Model parameters
Item | Configuration
Operating system | Ubuntu 16.04.12
GPU | GeForce RTX 2080 Ti
Memory | 64 GB
Python | Python 3.6.10
TensorFlow | TensorFlow-gpu 1.15.0
Table 3  Experimental environment
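Model | Category 1 SARI | Category 2 SARI | Category 3 SARI | Category 4 SARI | Category 5 SARI
BERT+Transformer | 38.39% | 38.50% | 20.13% | 23.16% | 15.49%
LaserTagger | 86.03% | 87.35% | 77.36% | 78.99% | 69.05%
LaserTagger + weighted loss | 86.51% | 89.09% | 78.21% | 80.66% | 69.79%
Proposed model (LaserTagger + weighted loss + position encoding) | 87.77% | 89.88% | 78.91% | 81.96% | 70.09%
Table 4  Few-shot learning results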
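The gain of the proposed model over the plain LaserTagger baseline can be recomputed from the Table 4 rows. The snippet below is a simple arithmetic check; the variable names are illustrative and the values are copied from the table.

```python
# SARI values (%) copied from Table 4, categories 1 to 5.
laser_tagger = [86.03, 87.35, 77.36, 78.99, 69.05]
proposed = [87.77, 89.88, 78.91, 81.96, 70.09]

# Gain in percentage points, rounded to the table's two decimals.
gains = [round(p - b, 2) for p, b in zip(proposed, laser_tagger)]
all_above_one_point = min(gains) > 1.0
```

The gains are 1.74, 2.53, 1.55, 2.97 and 1.04 percentage points for categories 1 to 5, so the smallest margin over LaserTagger is on category 5.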
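Model | Category 1 (SARI) | Category 2 (SARI) | Category 3 (SARI) | Category 4 (SARI) | Category 5 (SARI)
Proposed model, without expansion | 72.29% | 72.91% | 64.84% | 63.27% | 74.48%
Proposed model, with expansion | 84.04% | 84.24% | 73.22% | 80.19% | 71.67%
Δ | 11.75% | 11.33% | 8.38% | 16.92% | -2.81%
Table 5  Continual learning results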
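The Δ row of Table 5 is the with-expansion score minus the without-expansion score for each category. A short sketch of that arithmetic, with illustrative variable names and values copied from the table:

```python
# SARI values (%) copied from Table 5, categories 1 to 5.
without_expansion = [72.29, 72.91, 64.84, 63.27, 74.48]
with_expansion = [84.04, 84.24, 73.22, 80.19, 71.67]

# Difference in percentage points, rounded to the table's two decimals.
delta = [round(w - wo, 2) for w, wo in zip(with_expansion, without_expansion)]

best_category = delta.index(max(delta)) + 1  # 1-based category number
regressed = [i + 1 for i, d in enumerate(delta) if d < 0]
```

This reproduces the reported Δ values: the largest gain, 16.92 points, is on category 4, while category 5 is the only one where the expanded model scores lower (by 2.81 points).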
Corpus / Method | Instance
Example sentence with a one-to-many entity relationship | The sludge train included a gravity thickener, an aerobic digester and belt filter presses, with additional power consumptions of 8 and 11.2 kW respectively.
Manual annotation | The sludge train (gravity thickener; aerobic digester; belt filter presses)
Proposed model, with expansion | The sludge train (gravity thickener; aerobic digester; belt filter presses)
Proposed model, without expansion | belt filter presses 8; 11(2 kW respectively )
Table 6  Example of generated one-to-many entity relationships