Data Analysis and Knowledge Discovery  2021, Vol. 5 Issue (11): 29-44    DOI: 10.11925/infotech.2096-3467.2021.0491
Comparing Knowledge Graph Representation Models for Link Prediction
Yu Chuanming, Zhang Zhengang, Kong Lingge
School of Information and Safety Engineering, Zhongnan University of Economics and Law, Wuhan 430073, China
Abstract  

[Objective] This study systematically reviews the internal mechanisms and influencing factors of knowledge graph representation models, aiming to investigate their impact on specific tasks. [Methods] For the link prediction task, we compared the performance of translation-based and semantic matching-based knowledge graph representation models on the FB15K, WN18, FB15K-237 and WN18RR datasets. [Results] On the Hits@1 indicator, the TuckER model achieved the best values on the WN18, FB15K-237 and WN18RR datasets (0.9460, 0.2633 and 0.4430, respectively), while the ComplEx model yielded the highest value on the FB15K dataset (0.7314). [Limitations] We only compared the effects of knowledge graph representation models on the link prediction and knowledge base question answering tasks. More research is needed to examine their performance on information retrieval, recommendation systems and other tasks. [Conclusions] There are significant differences between the translation-based and the semantic matching-based knowledge graph representation models. The score function, negative sampling strategy, and optimization method of a knowledge graph representation model, as well as the proportion of training data, have significant impacts on link prediction results.

Key words: Knowledge Graph; Representation Learning; Deep Learning; Link Prediction
Received: 18 May 2021      Published: 23 December 2021
CLC Number: TP391
Fund: National Natural Science Foundation of China (71790612); National Natural Science Foundation of China (71974202)
Corresponding Author: Yu Chuanming, ORCID: 0000-0001-7099-0853, E-mail: yucm@zuel.edu.cn

Cite this article:

Yu Chuanming, Zhang Zhengang, Kong Lingge. Comparing Knowledge Graph Representation Models for Link Prediction. Data Analysis and Knowledge Discovery, 2021, 5(11): 29-44.

URL:

https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.2096-3467.2021.0491     OR     https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2021/V5/I11/29

The Link Prediction Framework Based on Knowledge Graph Representation Models
| Model | Score Function | Parameter Complexity |
|---|---|---|
| TransE | $\|h + r - t\|_p$ | $O(n_e k + n_r k)$ |
| TransH | $\|(h - w_r^{\top} h\, w_r) + r - (t - w_r^{\top} t\, w_r)\|_p$ | $O(n_e k + 2 n_r k)$ |
| TransR | $\|h M_r + r - t M_r\|_p$ | $O(n_e k + n_r k + n_r k^2)$ |
| TransD | $\|h M_{rh} + r - t M_{rt}\|_p$ | $O(2 n_e k + 2 n_r k)$ |
| DistMult | $\langle h, r, t \rangle$ | $O(n_e k + n_r k)$ |
| ComplEx | $\mathrm{Re}(\langle h, r, \bar{t} \rangle)$ | $O(2 n_e k + 2 n_r k)$ |
| ConvE | $f(\mathrm{vec}(f([\bar{h}; \bar{r}] * \omega)) W)\, t$ | $O(n_e k + n_r k)$ |
| HypER | $f(\mathrm{vec}(h * \mathrm{vec}^{-1}(w_r M)) W)\, t$ | $O(n_e k + n_r k)$ |
| TuckER | $\langle h, r, t, \mathcal{T} \rangle$ | $O(n_e k + n_r k + k^3)$ |
Score Function and Parameter Complexity of Different Models
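To make the score functions above concrete, here is a minimal sketch (not the authors' implementation) of three of them over NumPy embedding vectors; `h`, `r`, and `t` are assumed to be pre-trained embeddings, and higher scores indicate more plausible triples.

```python
import numpy as np

def transe_score(h, r, t, p=1):
    """TransE: -||h + r - t||_p; a triple is plausible when t is near h + r."""
    return -float(np.linalg.norm(h + r - t, ord=p))

def distmult_score(h, r, t):
    """DistMult: trilinear product <h, r, t> = sum_i h_i * r_i * t_i."""
    return float(np.sum(h * r * t))

def complex_score(h, r, t):
    """ComplEx: Re(<h, r, conj(t)>) over complex-valued embeddings."""
    return float(np.real(np.sum(h * r * np.conj(t))))

# A perfectly translated triple gets the maximal TransE score of 0.
h = np.array([1.0, 2.0])
r = np.array([0.5, -0.5])
print(transe_score(h, r, h + r))  # -> 0.0
```

Note that DistMult's score is symmetric in `h` and `t`, which is exactly the limitation ComplEx removes by moving to complex space.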
| Statistic | FB15K | FB15K-237 | WN18 | WN18RR |
|---|---|---|---|---|
| Entities | 14,951 | 14,541 | 40,943 | 40,943 |
| Relations | 1,345 | 237 | 18 | 11 |
| Training triples | 483,142 | 272,115 | 141,442 | 86,835 |
| Validation triples | 50,000 | 17,535 | 5,000 | 3,034 |
| Test triples | 59,071 | 20,466 | 5,000 | 3,134 |
Statistical Descriptions of the Four Datasets
| Model | Parameter | Value |
|---|---|---|
| TransE | Epochs | 500 |
| | Margin | 5.0 |
| | Negative sampling method | Probabilistic sampling |
| | Negative samples | 25 |
| | Embedding dimension | 200 |
| TransH | Epochs | 500 |
| | Margin | 4.0 |
| | Negative sampling method | Probabilistic sampling |
| | Negative samples | 25 |
| | Embedding dimension | 200 |
| TransR | Epochs | 500 |
| | Margin | 4.0 |
| | Negative sampling method | Probabilistic sampling |
| | Negative samples | 25 |
| | Embedding dimension | 200 |
| TransD | Epochs | 500 |
| | Margin | 4.0 |
| | Negative sampling method | Probabilistic sampling |
| | Negative samples | 25 |
| | Embedding dimension | 200 |
| DistMult | Epochs | 500 |
| | Input dropout | 0.2 |
| | Embedding dimension | 200 |
| | Label smoothing | 0.1 |
| ComplEx | Epochs | 500 |
| | Input dropout | 0.2 |
| | Embedding dimension | 200 |
| | Label smoothing | 0.1 |
| ConvE | Epochs | 500 |
| | Input dropout | 0.2 |
| | Hidden dropout | 0.2 |
| | Feature map dropout | 0.2 |
| | Embedding dimension | 200 |
| | Label smoothing | 0.1 |
| HypER | Epochs | 500 |
| | Input dropout | 0.2 |
| | Hidden dropout | 0.2 |
| | Feature map dropout | 0.2 |
| | Embedding dimension | 200 |
| | Label smoothing | 0.1 |
| TuckER | Epochs | 500 |
| | Input dropout | 0.3 |
| | Hidden dropout 1 | 0.4 |
| | Hidden dropout 2 | 0.5 |
| | Embedding dimension | 200 |
| | Label smoothing | 0.1 |
Parameter Settings of the Models
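The "probabilistic sampling" entries above refer to corrupting gold triples into negatives during training. As a hedged sketch (the exact scheme used in the paper may differ, e.g. the per-relation Bernoulli sampling introduced with TransH), one negative sample can be drawn by replacing the head or tail with a random other entity:

```python
import random

def corrupt_triple(triple, entities, replace_head_prob=0.5, rng=random):
    """Return a negative sample by replacing head or tail with a different
    random entity. replace_head_prob=0.5 is uniform corruption; a Bernoulli
    scheme would set it per relation from head/tail cardinality statistics."""
    h, r, t = triple
    if rng.random() < replace_head_prob:
        h = rng.choice([e for e in entities if e != h])
    else:
        t = rng.choice([e for e in entities if e != t])
    return (h, r, t)

entities = ["Paris", "France", "Berlin", "Germany"]
neg = corrupt_triple(("Paris", "capital_of", "France"), entities)
```

With 25 negatives per positive (as in the table), this function would simply be called 25 times per training triple.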
| Model | FB15K Hits@1 | FB15K Hits@3 | FB15K Hits@10 | FB15K MRR | WN18 Hits@1 | WN18 Hits@3 | WN18 Hits@10 | WN18 MRR |
|---|---|---|---|---|---|---|---|---|
| TransE | 0.3783 | 0.6041 | 0.7381 | 0.5125 | 0.2419 | 0.9324 | 0.9486 | 0.5868 |
| TransH | 0.3652 | 0.6349 | 0.7662 | 0.5198 | 0.0598 | 0.9161 | 0.9359 | 0.4858 |
| TransR | 0.4251 | 0.7033 | 0.8076 | 0.5796 | 0.3669 | 0.9119 | 0.9305 | 0.6379 |
| TransD | 0.4005 | 0.6510 | 0.7772 | 0.5446 | 0.2413 | 0.9225 | 0.9420 | 0.5804 |
| DistMult | 0.7066 | 0.8144 | 0.8777 | 0.7704 | 0.8488 | 0.9425 | 0.9532 | 0.8963 |
| ComplEx | 0.7314 | 0.8200 | 0.8758 | 0.7845 | 0.9412 | 0.9491 | 0.9560 | 0.9462 |
| TuckER | 0.6861 | 0.8190 | 0.8889 | 0.7625 | 0.9460 | 0.9528 | 0.9568 | 0.9501 |
| ConvE | 0.5617 | 0.7218 | 0.8272 | 0.6580 | 0.9409 | 0.9470 | 0.9550 | 0.9454 |
| HypER | 0.6590 | 0.7936 | 0.8683 | 0.7373 | 0.9454 | 0.9522 | 0.9576 | 0.9496 |
Results of Different Knowledge Graph Representation Methods on FB15K and WN18 Datasets
| Model | FB15K-237 Hits@1 | FB15K-237 Hits@3 | FB15K-237 Hits@10 | FB15K-237 MRR | WN18RR Hits@1 | WN18RR Hits@3 | WN18RR Hits@10 | WN18RR MRR |
|---|---|---|---|---|---|---|---|---|
| TransE | 0.1899 | 0.3255 | 0.4727 | 0.2859 | 0.0459 | 0.3537 | 0.5037 | 0.2207 |
| TransH | 0.1909 | 0.3329 | 0.4925 | 0.2915 | 0.0491 | 0.3681 | 0.5029 | 0.2255 |
| TransR | 0.2161 | 0.3542 | 0.5103 | 0.3145 | 0.0469 | 0.3976 | 0.5070 | 0.2348 |
| TransD | 0.1810 | 0.3252 | 0.4867 | 0.2841 | 0.0375 | 0.3669 | 0.5081 | 0.2191 |
| DistMult | 0.2166 | 0.3294 | 0.4741 | 0.3010 | 0.3961 | 0.4515 | 0.5155 | 0.4343 |
| ComplEx | 0.2135 | 0.3250 | 0.4705 | 0.2978 | 0.4113 | 0.4639 | 0.5341 | 0.4495 |
| TuckER | 0.2633 | 0.3896 | 0.5390 | 0.3553 | 0.4430 | 0.4820 | 0.5260 | 0.4700 |
| ConvE | 0.2256 | 0.3448 | 0.4921 | 0.3142 | 0.3933 | 0.4397 | 0.4990 | 0.4277 |
| HypER | 0.2498 | 0.3760 | 0.5197 | 0.3403 | 0.4287 | 0.4718 | 0.5207 | 0.4589 |
Results of Different Knowledge Graph Representation Methods on FB15K-237 and WN18RR Datasets
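The Hits@k and MRR figures in the tables above are derived from the rank of the correct entity among all scored candidates for each test triple. A minimal sketch of the two metrics:

```python
def mrr_and_hits(ranks, ks=(1, 3, 10)):
    """ranks: 1-based (filtered) ranks of the correct entity for each test
    triple. Returns mean reciprocal rank and the Hits@k fraction per k."""
    n = len(ranks)
    mrr = sum(1.0 / r for r in ranks) / n
    hits = {k: sum(r <= k for r in ranks) / n for k in ks}
    return mrr, hits

mrr, hits = mrr_and_hits([1, 2, 10])
# mrr = (1 + 1/2 + 1/10) / 3 ≈ 0.533; hits[10] = 1.0
```

The published numbers use the filtered setting, i.e. other known true triples are removed from the candidate list before ranking.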
Influence of Dimension on the Performance of Translation-Based Knowledge Graph Representation Models
Influence of Dimension on the Performance of Semantic Matching-Based Knowledge Graph Representation Models
Influence of Negative Sampling Method and Quantity on the Performance of TransE
Influence of Negative Sampling Method and Quantity on the Performance of TransH
Influence of Negative Sampling Method and Quantity on the Performance of TransR
Influence of Negative Sampling Method and Quantity on the Performance of TransD
Influence of the Dropout Ratio on the Performance of Knowledge Graph Representation Models
Influence of the Label Smoothing Ratio on the Performance of Knowledge Graph Representation Models
Influence of Dynamically Adding Entities on the Performance of Knowledge Graph Representation Models
Influence of the Proportion of Training Data on the Performance of Translation-Based Knowledge Graph Representation Models
Influence of the Proportion of Training Data on the Performance of Semantic Matching-Based Knowledge Graph Representation Models
| Dataset | Model | Accuracy/% |
|---|---|---|
| SimpleQuestions | No_emb | 41.20 |
| | TransE | 74.63 |
| | TransH | 73.33 |
| | TransR | 73.37 |
| | TransD | 73.21 |
| | DistMult | 25.65 |
| | ComplEx | 46.47 |
| | TuckER | 73.25 |
| | ConvE | 67.56 |
| | HypER | 70.41 |
Experimental Results of Knowledge Graph Representation Models in Knowledge-Based QA Tasks
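A knowledge-base QA pipeline of the kind evaluated above typically maps the question to a (head entity, relation) pair and then ranks candidate answers with the embedding score function. A hedged sketch using a DistMult-style scorer (the entity names and vectors here are illustrative, not from the paper):

```python
import numpy as np

def answer_by_embedding(head_vec, rel_vec, entity_matrix, entity_names):
    """Score every candidate tail entity t with <h, r, t> and return the
    name of the highest-scoring one."""
    scores = entity_matrix @ (head_vec * rel_vec)  # one <h, r, t> per row
    return entity_names[int(np.argmax(scores))]

# Toy 2-d example: the first entity aligns with the h*r direction.
names = ["France", "Berlin"]
E = np.array([[1.0, 0.0], [0.0, 1.0]])
print(answer_by_embedding(np.array([1.0, 0.0]), np.array([1.0, 1.0]), E, names))
# -> France
```

This scoring step is where the link prediction quality of each representation model directly translates into QA accuracy.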
[1] Bollacker K, Cook R, Tufts P. Freebase: A Shared Database of Structured General Human Knowledge[C]// Proceedings of the 22nd AAAI Conference on Artificial Intelligence. Menlo Park, CA: AAAI, 2007: 1962-1963.
[2] Bizer C, Lehmann J, Kobilarov G, et al. DBpedia - A Crystallization Point for the Web of Data[J]. Journal of Web Semantics, 2009, 7(3):154-165.
doi: 10.1016/j.websem.2009.07.002
[3] WMF. Wikidata[EB/OL]. [2019-11-11]. https://www.wikidata.org/wiki/Wikidata:Main_Page.
[4] Suchanek F M, Kasneci G, Weikum G. YAGO: A Large Ontology from Wikipedia and WordNet[J]. Journal of Web Semantics, 2008, 6(3):203-217.
doi: 10.1016/j.websem.2008.06.001
[5] Zhang Y Z, Liu K, He S Z, et al. Question Answering over Knowledge Base with Neural Attention Combining Global Knowledge Information[OL]. arXiv Preprint, arXiv: 1606.00979.
[6] Yang B S, Mitchell T. Leveraging Knowledge Bases in LSTMs for Improving Machine Reading[C]// Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics(Volume 1: Long Papers). 2017: 1436-1446.
[7] Almousa M, Benlamri R, Khoury R. A Novel Word Sense Disambiguation Approach Using WordNet Knowledge Graph [OL]. arXiv Preprint, arXiv: 2101.02875.
[8] Ruan Xiaoyun, Liao Jianbin, Li Xiang, et al. Interpretable Recommendation of Reinforcement Learning Based on Talent Knowledge Graph Reasoning[J]. Data Analysis and Knowledge Discovery, 2021, 5(6):36-50.
[9] Bellman R E. Dynamic Programming[M]. Dover Publications, Incorporated, 2003.
[10] Liu Zhiyuan, Sun Maosong, Lin Yankai, et al. Knowledge Representation Learning: A Review[J]. Journal of Computer Research and Development, 2016, 53(2):247-261.
[11] Yu Chuanming, Wang Manyi, Lin Hongjun, et al. A Comparative Study of Word Representation Models Based on Deep Learning[J]. Data Analysis and Knowledge Discovery, 2020, 4(8):28-40.
[12] Bordes A, Usunier N, Garcia-Duran A, et al. Translating Embeddings for Modeling Multi-relational Data[C]//Proceedings of the 26th International Conference on Neural Information Processing Systems. Cambridge, MA: MIT Press, 2013: 2787-2795.
[13] Ruffinelli D, Broscheit S, Gemulla R. You Can Teach an Old Dog New Tricks! On Training Knowledge Graph Embeddings[C]// Proceedings of the International Conference on Learning Representations. Addis Ababa: ICLR, 2020.
[14] Wang Z, Zhang J W, Feng J L, et al. Knowledge Graph Embedding by Translating on Hyperplanes[C]//Proceedings of the 28th AAAI Conference on Artificial Intelligence. Menlo Park, CA: AAAI Press, 2014: 1112-1119.
[15] Lin Y K, Liu Z Y, Zhu M S, et al. Learning Entity and Relation Embeddings for Knowledge Graph Completion[C]//Proceedings of the 29th AAAI Conference on Artificial Intelligence. Menlo Park, CA: AAAI Press, 2015:2181-2187.
[16] Ji G L, He S Z, Xu L H, et al. Knowledge Graph Embedding via Dynamic Mapping Matrix[C]//Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing. Stroudsburg, PA: ACL Press, 2015: 687-696.
[17] Yang B S, Yih W T, He X D, et al. Embedding Entities and Relations for Learning and Inference in Knowledge Bases[OL]. arXiv Preprint, arXiv: 1412.6575.
[18] Trouillon T, Welbl J, Riedel S, et al. Complex Embeddings for Simple Link Prediction[C]// Proceedings of the International Conference on Machine Learning, New York, USA: ICML, 2016:2071-2080.
[19] Dettmers T, Minervini P, Stenetorp P, et al. Convolutional 2D Knowledge Graph Embeddings[C]//Proceedings of the 32nd AAAI Conference on Artificial Intelligence. Menlo Park, CA: AAAI Press, 2018:1811-1818.
[20] Balažević I, Allen C, Hospedales T M, et al. Hypernetwork Knowledge Graph Embeddings[C]// Proceedings of the 2019 Conference in Artificial Neural Networks and Machine Learning: Workshop and Special Sessions. 2019:553-565.
[21] Balažević I, Allen C, Hospedales T M. TuckER: Tensor Factorization for Knowledge Graph Completion[OL]. arXiv Preprint, arXiv: 1901.09590.
[22] Mai G C, Janowicz K, Cai L, et al. SE‐KGE: A Location‐Aware Knowledge Graph Embedding Model for Geographic Question Answering and Spatial Semantic Lifting[J]. Transactions in GIS, 2020, 24(3):623-655.
doi: 10.1111/tgis.v24.3
[23] Kumar S, Jat S, Saxena K, et al. Zero-Shot Word Sense Disambiguation Using Sense Definition Embeddings[C]// Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. Florence, Italy:ACL, 2019: 5670-5681.
[24] Li C, Peng X T, Zhang S H, et al. Modeling Relation Paths for Knowledge Base Completion via Joint Adversarial Training[J]. Knowledge-Based Systems, 2020, 201-202:105865.
doi: 10.1016/j.knosys.2020.105865
[25] He L R, Liu B, Li G X, et al. Knowledge Base Completion by Variational Bayesian Neural Tensor Decomposition[J]. Cognitive Computation, 2018, 10(6):1075-1084.
doi: 10.1007/s12559-018-9565-x
[26] Wang H B, Jiang S C, Yu Z T. Modeling of Complex Internal Logic for Knowledge Base Completion[J]. Applied Intelligence, 2020, 50(10):3336-3349.
doi: 10.1007/s10489-020-01734-z
[27] Xu Zenglin, Sheng Yongpan, He Lirong, et al. Review on Knowledge Graph Techniques[J]. Journal of University of Electronic Science and Technology of China, 2016, 45(4):589-606.
[28] Ren F L, Li J C, Zhang H H, et al. Knowledge Graph Embedding with Atrous Convolution and Residual Learning[C]// Proceedings of the 28th International Conference on Computational Linguistics. Barcelona, Spain: ACM, 2020: 1532-1543.
[29] Choi S J, Song H J, Park S B. An Approach to Knowledge Base Completion by a Committee-Based Knowledge Graph Embedding[J]. Applied Sciences, 2020, 10(8):2651.
doi: 10.3390/app10082651
[30] Zhang M, Geng G H, Zeng S, et al. Knowledge Graph Completion for the Chinese Text of Cultural Relics Based on Bidirectional Encoder Representations from Transformers with Entity-Type Information[J]. Entropy, 2020, 22(10):1168.
doi: 10.3390/e22101168
[31] Wang Q, de Ji Y, Hao Y S, et al. GRL: Knowledge Graph Completion with GAN-Based Reinforcement Learning[J]. Knowledge-Based Systems, 2020, 209:106421.
doi: 10.1016/j.knosys.2020.106421
[32] Liu F F, Shen Y, Zhang T N, et al. Entity-Related Paths Modeling for Knowledge Base Completion[J]. Frontiers of Computer Science, 2020, 14(5):145311.
doi: 10.1007/s11704-019-8264-4
[33] Huang X, Zhang J Y, Li D C, et al. Knowledge Graph Embedding Based Question Answering[C]// Proceedings of the 12th International Conference on Web Search and Data Mining. Melbourne, VIC, Australia: ACM, 2019: 105-112.
[34] Bordes A, Usunier N, Chopra S, et al. Large-Scale Simple Question Answering with Memory Networks[OL]. arXiv Preprint, arXiv: 1506.02075.
[35] An Bo, Han Xianpei, Sun Le. Knowledge-Representation-Enhanced Question-Answering System[J]. Scientia Sinica (Informationis), 2018, 48(11):1521-1532.
[36] Saxena A, Tripathi A, Talukdar P. Improving Multi-hop Question Answering over Knowledge Graphs Using Knowledge Base Embeddings[C]// Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. Seattle: ACL, 2020: 4498-4507.
[37] Chen J D, Hu Y Z, Liu J P, et al. Deep Short Text Classification with Knowledge Powered Attention[C]// Proceedings of the AAAI Conference on Artificial Intelligence. Hawaii: AAAI, 2019, 33(1):6252-6259.
[38] Wang H, Zhang F, Xie X, et al. DKN: Deep Knowledge-Aware Network for News Recommendation[OL]. arXiv Preprint, arXiv: 1801.08284.
[39] Miller G A. WordNet: A Lexical Database for English[J]. Communications of the ACM, 1995, 38(11):39-41.
[40] Kumar S, Jat S, Saxena K, et al. Zero-Shot Word Sense Disambiguation Using Sense Definition Embeddings[C]// Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. Florence, Italy: ACL, 2019:5670-5681.
[41] Yu Chuanming, Wang Feng, An Lu. Research on the Domain Knowledge Alignment Model Based on Deep Learning: The Knowledge Graph Perspective[J]. Journal of the China Society for Scientific and Technical Information, 2019, 38(6):641-654.
[42] Mikolov T, Chen K, Corrado G, et al. Efficient Estimation of Word Representations in Vector Space[C]// Proceedings of the International Conference on Learning Representations. New York:ACM, 2013:1156-1165.
[43] Toutanova K, Chen D Q. Observed Versus Latent Features for Knowledge Base and Text Inference[C]// Proceedings of the 3rd Workshop on Continuous Vector Space Models and Their Compositionality. Beijing, China: ACL, 2015: 57-66.
[44] Akrami F, Saeef M S, Zhang Q H, et al. Realistic Re-evaluation of Knowledge Graph Completion Methods: An Experimental Study[C]// Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data. Oregon, USA: ACM, 2020:1995-2010.
[45] Yu Chuanming, Wang Feng, Zhang Zhen'gang, et al. Research on Knowledge Graph Question Answering Model Based on Representation Learning[J]. Scientific Information Research, 2021, 3(1):56-70.