Data Analysis and Knowledge Discovery, 2022, Vol. 6, Issue (12): 90-98     https://doi.org/10.11925/infotech.2096-3467.2022.0214
Embedding Knowledge Graph with Negative Sampling and Joint Relational Contexts
Li Zhijie, Wang Rui, Li Changhua, Zhang Jie
School of Information and Control Engineering, Xi’an University of Architectural Science and Technology, Xi’an 710055, China

Abstract

[Objective] This paper proposes a knowledge graph embedding model with negative sampling over joint relational contexts, addressing the low quality of negative samples in current translation-based embedding models, which weakens their representation ability and overall performance. [Methods] First, we extract the neighbors of a target instance from the original knowledge graph and generate its context vector. Then, because adjacent relations indicate the nature or type of a given entity, we aggregate the relational context of that entity with a Concat aggregation function during negative sampling to determine the attributes of the entity to be replaced. Finally, combined with the triple embedding of the TransE model, we select replacement entities with the same attributes to generate negative triples, which raises the similarity between positive and negative triples. [Results] On the FB15K-237 and WN18RR datasets, entity linking improved by 18.3 and 29.2 percentage points over the benchmark model, respectively, and relation linking improved by 0.7 percentage points over the best benchmark result. [Limitations] Only the semantic information of the relational context is considered, so the relative positions of neighbors are hard to determine; their path information needs further exploration. [Conclusions] The proposed sampling strategy increases the similarity between the replacement entity and the replaced entity, improves the quality of negative triples, and thereby raises the model's accuracy.

Key words: Knowledge Graph; Negative Sampling Strategy; Entity Linking; Relation Linking
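The sampling strategy summarized in the abstract can be illustrated with a short Python sketch: an entity's one-hop relations serve as its relational context, a Concat-style aggregation turns that context into a fixed-length signature, and the entity used to corrupt a triple is drawn from candidates whose signatures are most similar to the replaced entity's. This is a minimal sketch only; the function names, the fixed context size K, the candidate pool size, and the cosine comparison are assumptions made for this example, not the authors' implementation, which aggregates learned relation embeddings over the m-hop context reported in Table 2.

```python
# Illustrative sketch of relation-context negative sampling (not the paper's code).
import random
from collections import defaultdict

import numpy as np

K = 4  # context relations kept per entity (illustrative choice, not from the paper)


def relation_context(triples):
    """1-hop relational context: the set of relations each entity occurs with."""
    ctx = defaultdict(set)
    for h, r, t in triples:
        ctx[h].add(r)
        ctx[t].add(r)
    return ctx


def concat_signature(entity, ctx, rel_emb, dim):
    """Concat-style aggregation: concatenate up to K context relation
    embeddings into one fixed-length vector, zero-padded if fewer than K."""
    rels = sorted(ctx.get(entity, set()))[:K]
    parts = [rel_emb[r] for r in rels]
    parts += [np.zeros(dim)] * (K - len(parts))
    return np.concatenate(parts)


def cosine(a, b):
    """Cosine similarity, guarded against zero vectors."""
    na, nb = np.linalg.norm(a), np.linalg.norm(b)
    return float(a @ b / (na * nb)) if na and nb else 0.0


def sample_negative(triple, entities, ctx, rel_emb, dim, corrupt_head=True):
    """Corrupt the head (or tail) with the candidate whose relational context
    is most similar to the replaced entity's, keeping the negative triple
    close to the positive one."""
    h, r, t = triple
    target = h if corrupt_head else t
    sig = concat_signature(target, ctx, rel_emb, dim)
    pool = [e for e in entities if e != target]
    candidates = random.sample(pool, k=min(10, len(pool)))
    best = max(candidates,
               key=lambda e: cosine(sig, concat_signature(e, ctx, rel_emb, dim)))
    return (best, r, t) if corrupt_head else (h, r, best)


# Toy usage with random relation embeddings.
triples = [(0, 0, 1), (1, 1, 2), (2, 0, 3), (3, 1, 0)]
dim = 8
rel_emb = np.random.randn(2, dim)   # 2 relations in the toy graph
ctx = relation_context(triples)
print(sample_negative(triples[0], {0, 1, 2, 3}, ctx, rel_emb, dim))
```

Because the replacement entity shares the relational profile of the original one, the resulting negative triple is harder to separate from the positive triple, which is the effect the paper attributes to its sampling strategy.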
Received: 2022-03-14      Published: 2023-02-03
CLC number: TP391
Funding: *National Natural Science Foundation of China (51878536); Natural Science Foundation of Shaanxi Province (2020JQ-687); Science and Technology Program of Housing and Urban-Rural Development of Shaanxi Province (2020-K09)
Corresponding author: Li Zhijie, ORCID: 0000-0003-4362-5652, E-mail: lizhijie@xauat.edu.cn
Cite this article:
Li Zhijie, Wang Rui, Li Changhua, Zhang Jie. Embedding Knowledge Graph with Negative Sampling and Joint Relational Contexts. Data Analysis and Knowledge Discovery, 2022, 6(12): 90-98.
Link to this article:
https://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/10.11925/infotech.2096-3467.2022.0214      or      https://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/Y2022/V6/I12/90
Fig.1  Adjacent relations indicate the attributes of a given entity
Fig.2  Relational subgraph
Fig.3  Architecture of the rcTransE model
Fig.4  Negative sampling process
Dataset      Relations   Entities   Training set   Test set   Validation set
FB15K-237    237         14 541     272 115        20 466     17 535
WN18RR       11          40 943     86 835         3 134      3 034
Table 1  Dataset information
Dataset      Context hops m   Learning rate λ   Embedding dimension n
FB15K-237    2                0.001             200
WN18RR       3                0.005             200
Table 2  Optimal parameter values
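Read as a training configuration, Table 2 maps directly onto a few hyperparameters; the sketch below records those values alongside the standard TransE scoring function and margin ranking loss that the model builds on. The L2 distance and the margin value of 1.0 are illustrative assumptions, not settings stated on this page.

```python
import numpy as np

# Optimal settings from Table 2, expressed as a configuration dict.
BEST_CONFIG = {
    "FB15K-237": {"context_hops": 2, "learning_rate": 0.001, "embedding_dim": 200},
    "WN18RR":    {"context_hops": 3, "learning_rate": 0.005, "embedding_dim": 200},
}


def transe_score(h_vec, r_vec, t_vec):
    """TransE plausibility: a smaller ||h + r - t|| means a more plausible triple."""
    return np.linalg.norm(h_vec + r_vec - t_vec, ord=2)


def margin_loss(pos_score, neg_score, margin=1.0):
    """Margin ranking loss over a positive triple and one sampled negative."""
    return max(0.0, margin + pos_score - neg_score)


# Example: score a random positive/negative pair at the FB15K-237 dimension.
dim = BEST_CONFIG["FB15K-237"]["embedding_dim"]
h, r, t, t_neg = (np.random.randn(dim) for _ in range(4))
print(margin_loss(transe_score(h, r, t), transe_score(h, r, t_neg)))
```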
Model       FB15K-237                  WN18RR
            MRR     Hits@1   Hits@3    MRR     Hits@1   Hits@3
TransE      0.966   0.946    0.984     0.784   0.669    0.870
TransD      0.845   0.851    0.927     0.811   0.863    0.923
TransH      0.824   0.813    0.841     0.630   0.602    0.753
TransR      0.837   0.826    0.844     0.685   0.663    0.726
DistMult    0.875   0.806    0.936     0.847   0.787    0.891
PTransE     0.795   0.817    0.835     0.632   0.693    0.743
rcTransE    0.978   0.961    0.988     0.849   0.894    0.993
Table 3  Entity prediction results
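For reference, the metrics in Table 3 are derived from the rank assigned to the correct entity for each test triple: MRR is the mean reciprocal rank, and Hits@k is the share of triples whose correct entity appears in the top k predictions. A generic sketch of the computation, not the authors' evaluation script:

```python
def mrr(ranks):
    """Mean reciprocal rank of the correct entities."""
    return sum(1.0 / r for r in ranks) / len(ranks)


def hits_at(ranks, k):
    """Fraction of test triples whose correct entity is ranked in the top k."""
    return sum(1 for r in ranks if r <= k) / len(ranks)


ranks = [1, 3, 2, 1, 7]                      # example ranks of the true entity
print(round(mrr(ranks), 3))                  # 0.595
print(hits_at(ranks, 1), hits_at(ranks, 3))  # 0.4 0.8
```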
Fig.5  Performance of different aggregation functions on the experimental datasets
Fig.6  Performance with different relational-context hop counts on the datasets
Model             FB15K-237 Hits@1   WN18RR Hits@1
TransE            84.3               –
TransR            87.6               33.3
RTtransE          28.9               93.6
PTransE(2-step)   67.3               94.0
PTransE(3-step)   34.6               87.5
rcTransE          88.3               66.1
Table 4  Relation prediction results
Fig.7  Effect of the negative sampling method on the model