Data Analysis and Knowledge Discovery  2020, Vol. 4 Issue (1): 63-75    DOI: 10.11925/infotech.2096-3467.2019.0505
Knowledge Representation Based on Deep Learning: Network Perspective
Chuanming Yu1, Haonan Li2, Manyi Wang2, Tingting Huang2, Lu An3
1School of Information and Security Engineering, Zhongnan University of Economics and Law, Wuhan 430073, China
2School of Statistics and Mathematics, Zhongnan University of Economics and Law, Wuhan 430073, China
3School of Information Management, Wuhan University, Wuhan 430072, China
Abstract  

[Objective] This paper explores better representation models for the semantic relationships among knowledge objects. [Methods] Building on existing network representation learning algorithms, we propose a combined knowledge network representation learning model (CKNRL) that integrates ensemble learning and deep learning techniques. [Results] We evaluated the model on a knowledge network link prediction task using a Chinese-English parallel news corpus. The CKNRL model reached an AUC of 0.929, higher than the existing algorithms DeepWalk (0.925), Node2Vec (0.926), and SDNE (0.899). [Limitations] The study was based on a word co-occurrence network; more research is needed to examine the CKNRL model for link prediction on other types of knowledge networks. [Conclusions] The proposed fusion model better represents the semantic relationships among knowledge objects.

Key words: Knowledge Representation, Deep Learning, Network Representation Learning, Link Prediction
Received: 14 May 2019      Published: 14 March 2020
ZTFLH:  TP391  
Corresponding Authors: Chuanming Yu     E-mail: yucm@zuel.edu.cn

Cite this article:

Chuanming Yu, Haonan Li, Manyi Wang, Tingting Huang, Lu An. Knowledge Representation Based on Deep Learning: Network Perspective. Data Analysis and Knowledge Discovery, 2020, 4(1): 63-75.

URL:

https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.2096-3467.2019.0505     OR     https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2020/V4/I1/63

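Symbol  Description
d_i     Representation of the i-th node in the network learned by DeepWalk
n_i     Representation of the i-th node in the network learned by Node2Vec
s_i     Representation of the i-th node in the network learned by SDNE
D_j     Classification result (probability) of DeepWalk for the j-th node pair
N_j     Classification result (probability) of Node2Vec for the j-th node pair
S_j     Classification result (probability) of SDNE for the j-th node pair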
Description of Related Symbols
Process of the CKNRL Model
Network information    Chinese network
Noun nodes             3,132
Verb nodes             1,956
Adjective nodes        342
Other knowledge nodes  50
Edges                  110,301
Statistics of the Single Language Knowledge Network
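Item                                Details
Representation learning algorithms  DeepWalk, Node2Vec, SDNE
Task                                Link prediction
Class imbalance ratio               Positive : negative = 1:3
Variables studied                   Network embedding dimension
                                    Sliding window size
                                    Feature construction method
                                    Model fusion method
                                    Machine learning algorithm
Training/test split                 8:2
Evaluation metrics                  Precision, Recall, F1, Accuracy, AUC
Machine learning algorithms         XGBoost, LightGBM, NB, LR, MLP, RF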
Description of Deep Representation Learning Experiments
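The 1:3 positive-to-negative ratio and the 8:2 training/test split in the table above can be illustrated with a minimal sketch. It assumes that negative examples are node pairs without an edge, sampled uniformly at random; the source does not state how negatives were drawn, and the function name `build_link_dataset` and its parameters are hypothetical.

```python
import random
from itertools import combinations

def build_link_dataset(nodes, edges, neg_per_pos=3, test_frac=0.2, seed=0):
    """Return (train, test) lists of ((u, v), label) samples.

    Assumption: negatives are unconnected node pairs drawn uniformly at random.
    """
    rng = random.Random(seed)
    edge_set = {frozenset(e) for e in edges}
    positives = sorted(tuple(sorted(e)) for e in edge_set)
    # Enumerate all unconnected pairs; fine for a toy graph, too slow for large ones.
    non_edges = [p for p in combinations(sorted(nodes), 2)
                 if frozenset(p) not in edge_set]
    n_neg = min(len(non_edges), neg_per_pos * len(positives))
    negatives = rng.sample(non_edges, n_neg)
    samples = [(p, 1) for p in positives] + [(p, 0) for p in negatives]
    rng.shuffle(samples)
    cut = int(round(len(samples) * (1 - test_frac)))
    return samples[:cut], samples[cut:]

# Toy example: a ring of 10 nodes has 10 edges, so 30 negatives are sampled.
nodes = list(range(10))
edges = [(i, (i + 1) % 10) for i in range(10)]
train, test = build_link_dataset(nodes, edges)
```

For a network with several thousand nodes, enumerating every unconnected pair is wasteful; rejection sampling of random pairs would likely be a better fit.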
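Algorithm  Parameter            Value
DeepWalk   Iterations           80
           Random walk length   40
           Embedding dimension  50, 100, 150, 200
Node2Vec   Iterations           100
           Random walk length   80
           Embedding dimension  50, 100, 150, 200
           p                    1
           q                    0.5
SDNE       Iterations           300
           Learning rate        0.01
           Batch size           64
           Embedding dimension  50, 100, 150, 200
           Alpha                100
           Gamma                1
           Beta                 10
XGBoost    Thread               5
           scale_pos_weight     3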
Parameters of Each Representation Learning Algorithm
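To make the Node2Vec settings p = 1 and q = 0.5 concrete, the sketch below computes the second-order random walk bias as defined in the Node2Vec paper, for an unweighted graph. The helper names and the toy graph are hypothetical; this is not the code used in the experiments.

```python
def node2vec_bias(prev, cand, adj, p=1.0, q=0.5):
    """Unnormalised weight for stepping to `cand` after arriving from `prev`."""
    if cand == prev:        # distance 0 from prev: return to the previous node
        return 1.0 / p
    if cand in adj[prev]:   # distance 1 from prev: stay in its neighbourhood
        return 1.0
    return 1.0 / q          # distance 2 from prev: move farther away

def transition_probs(prev, cur, adj, p=1.0, q=0.5):
    """Normalised probabilities over the neighbours of `cur`."""
    weights = {x: node2vec_bias(prev, x, adj, p, q) for x in adj[cur]}
    total = sum(weights.values())
    return {x: w / total for x, w in weights.items()}

# Toy undirected graph given as adjacency sets.
adj = {
    "a": {"b", "c"},
    "b": {"a", "c", "d"},
    "c": {"a", "b"},
    "d": {"b"},
}
probs = transition_probs(prev="a", cur="b", adj=adj)
```

With q below 1, the step to `d` (two hops from `a`) receives twice the weight of the other candidates, so the walk leans toward exploring outward.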
Dimension Precision Recall F1 Accuracy AUC
50 0.74 0.69 0.72 0.864 0.912
100 0.79 0.69 0.74 0.876 0.917
150 0.78 0.69 0.73 0.873 0.915
200 0.75 0.72 0.73 0.870 0.912
Experimental Results of Link Prediction Tasks with Different Embedding Dimensions
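As a reminder of how two of the reported metrics are defined, the following sketch computes F1 from precision and recall, and AUC as the fraction of positive/negative pairs ranked correctly. These are the standard definitions; the function names and the toy scores are illustrative and not taken from the paper.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

def auc(labels, scores):
    """Probability that a random positive is scored above a random negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for sp in pos:
        for sn in neg:
            if sp > sn:
                wins += 1.0
            elif sp == sn:
                wins += 0.5   # ties count as half a win
    return wins / (len(pos) * len(neg))

# Row with dimension 100: precision 0.79 and recall 0.69 give F1 of about 0.74.
f1_dim_100 = f1_score(0.79, 0.69)

# Toy scores: three of the four positive/negative pairs are ordered correctly.
toy_auc = auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
```

Because the table reports rounded precision and recall, recomputing F1 from them will not necessarily match every row to the second decimal.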
Window size Precision Recall F1 Accuracy AUC
3 0.80 0.53 0.63 0.848 0.869
5 0.79 0.69 0.74 0.876 0.917
7 0.75 0.73 0.74 0.871 0.921
9 0.74 0.77 0.75 0.875 0.928
Experimental Results of Link Prediction Tasks with Different Window Sizes
Feature construction method  Precision  Recall  F1    Accuracy  AUC
Concatenation                0.63       0.78    0.69  0.828     0.891
Element-wise product         0.69       0.78    0.74  0.859     0.919
Absolute difference          0.74       0.76    0.75  0.872     0.927
Average                      0.54       0.73    0.62  0.775     0.828
Squared difference           0.73       0.77    0.75  0.872     0.927
Impact of Different Feature Construction Methods on Link Prediction
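The five feature construction methods compared above (concatenation, element-wise product, absolute difference, average, squared difference) each turn the embeddings of two nodes into a feature vector for the node pair. The sketch below shows one plausible reading of each operator; the exact formulas are assumptions based on the method names, and the function names are hypothetical.

```python
def concat(u, v):
    """Concatenation: [u ; v], doubles the feature length."""
    return list(u) + list(v)

def hadamard(u, v):
    """Element-wise product (assumed reading of the 'product' method)."""
    return [a * b for a, b in zip(u, v)]

def abs_diff(u, v):
    """Absolute difference: |u - v| per dimension."""
    return [abs(a - b) for a, b in zip(u, v)]

def average(u, v):
    """Average: (u + v) / 2 per dimension."""
    return [(a + b) / 2 for a, b in zip(u, v)]

def squared_diff(u, v):
    """Squared difference: (u - v)^2 per dimension."""
    return [(a - b) ** 2 for a, b in zip(u, v)]

EDGE_FEATURES = {
    "concatenation": concat,
    "element-wise product": hadamard,
    "absolute difference": abs_diff,
    "average": average,
    "squared difference": squared_diff,
}

# Toy 2-dimensional embeddings for one node pair.
u, v = [1.0, 2.0], [3.0, 0.5]
features = {name: fn(u, v) for name, fn in EDGE_FEATURES.items()}
```

All operators except concatenation are symmetric in `u` and `v`, which suits an undirected co-occurrence network where the pair (u, v) and the pair (v, u) should yield the same features.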
α β γ Precision Recall F1 AUC
0.0 0.0 1.0 0.65 0.77 0.71 0.899
0.0 0.3 0.7 0.67 0.74 0.70 0.896
0.0 0.6 0.4 0.59 0.71 0.64 0.861
0.0 0.9 0.1 0.73 0.76 0.75 0.925
0.1 0.0 0.9 0.67 0.77 0.72 0.906
0.1 0.3 0.6 0.67 0.74 0.70 0.893
0.1 0.6 0.3 0.64 0.73 0.68 0.885
0.1 0.9 0.0 0.74 0.77 0.75 0.929
0.2 0.0 0.8 0.67 0.77 0.72 0.905
0.2 0.3 0.5 0.65 0.73 0.69 0.880
0.2 0.6 0.2 0.69 0.75 0.72 0.908
0.3 0.0 0.7 0.68 0.76 0.72 0.901
0.3 0.3 0.4 0.60 0.70 0.65 0.857
0.3 0.6 0.1 0.71 0.77 0.74 0.917
0.4 0.1 0.5 0.66 0.73 0.69 0.884
0.4 0.4 0.2 0.66 0.75 0.70 0.897
0.5 0.0 0.5 0.64 0.72 0.68 0.877
0.5 0.3 0.2 0.64 0.75 0.69 0.896
0.6 0.0 0.4 0.60 0.72 0.66 0.862
0.6 0.3 0.1 0.70 0.78 0.73 0.917
0.7 0.1 0.2 0.69 0.77 0.73 0.911
0.8 0.0 0.2 0.70 0.77 0.73 0.915
0.9 0.0 0.1 0.71 0.79 0.75 0.922
1.0 0.0 0.0 0.72 0.78 0.75 0.925
Partial Experimental Results of Link Prediction Tasks with Different Model Fusion Methods (Network Embedding Fusion)
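One way to read the weights α, β, γ in the table above is as coefficients of a weighted sum of the three node representations d_i (DeepWalk), n_i (Node2Vec) and s_i (SDNE). The sketch below follows that reading. Both the weighted-sum form and the assignment of α, β, γ to DeepWalk, Node2Vec and SDNE in that order are assumptions; the endpoint rows (α = 1 and γ = 1) match the DeepWalk and SDNE AUC values reported elsewhere on this page, which is consistent with that ordering but does not confirm the fusion formula.

```python
def fuse_embeddings(d_i, n_i, s_i, alpha, beta, gamma):
    """Weighted sum of three equal-length node embeddings (assumed fusion rule)."""
    if abs(alpha + beta + gamma - 1.0) > 1e-9:
        raise ValueError("weights are expected to sum to 1")
    if not (len(d_i) == len(n_i) == len(s_i)):
        raise ValueError("embeddings must share one dimensionality")
    return [alpha * d + beta * n + gamma * s
            for d, n, s in zip(d_i, n_i, s_i)]

# Toy 2-dimensional embeddings with the best-scoring weights in the table
# (alpha = 0.1, beta = 0.9, gamma = 0.0, AUC 0.929).
fused = fuse_embeddings([1.0, 0.0], [0.0, 1.0], [1.0, 1.0], 0.1, 0.9, 0.0)
```

The fused vector would then be passed through a feature construction step and a classifier, as in the single-algorithm experiments.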
λ μ η Precision Recall F1 AUC
0.0 0.0 1.0 0.50 0.90 0.64 0.898
0.0 0.5 0.5 0.54 0.93 0.68 0.925
0.0 1.0 0.0 0.55 0.93 0.69 0.928
0.1 0.0 0.9 0.51 0.91 0.65 0.905
0.1 0.5 0.4 0.55 0.93 0.69 0.929
0.2 0.0 0.8 0.52 0.91 0.66 0.911
0.2 0.5 0.3 0.56 0.94 0.70 0.932
0.3 0.0 0.7 0.53 0.91 0.67 0.916
0.3 0.5 0.2 0.57 0.94 0.71 0.934
0.3 0.6 0.1 0.57 0.94 0.71 0.935
0.4 0.0 0.6 0.53 0.91 0.67 0.920
0.4 0.5 0.1 0.58 0.94 0.72 0.935
0.5 0.0 0.5 0.54 0.92 0.68 0.922
0.5 0.5 0.0 0.58 0.94 0.72 0.934
0.6 0.0 0.4 0.55 0.92 0.69 0.924
0.7 0.0 0.3 0.56 0.92 0.70 0.926
0.7 0.1 0.2 0.57 0.93 0.70 0.928
0.8 0.0 0.2 0.57 0.92 0.70 0.926
0.9 0.0 0.1 0.57 0.92 0.70 0.925
1.0 0.0 0.0 0.57 0.92 0.70 0.924
Partial Experimental Results of Link Prediction Tasks with Different Model Fusion Methods (Classification Result Fusion)
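Classification result fusion can be sketched the same way, this time combining the per-pair probabilities D_j, N_j and S_j instead of the embeddings. The weighted-average form, the mapping of λ, μ, η to DeepWalk, Node2Vec and SDNE in that order, and the 0.5 decision threshold are all assumptions made for illustration.

```python
def fuse_probabilities(D_j, N_j, S_j, lam, mu, eta):
    """Weighted average of three link probabilities (assumed fusion rule)."""
    if abs(lam + mu + eta - 1.0) > 1e-9:
        raise ValueError("weights are expected to sum to 1")
    return lam * D_j + mu * N_j + eta * S_j

def predict_link(D_j, N_j, S_j, lam, mu, eta, threshold=0.5):
    """Predict a link when the fused probability reaches the threshold."""
    return fuse_probabilities(D_j, N_j, S_j, lam, mu, eta) >= threshold

# Toy probabilities, fused with one of the best-scoring weight settings in the
# table (lambda = 0.4, mu = 0.5, eta = 0.1, AUC 0.935).
p_j = fuse_probabilities(0.8, 0.6, 0.2, 0.4, 0.5, 0.1)
has_link = predict_link(0.8, 0.6, 0.2, 0.4, 0.5, 0.1)
```

Since AUC depends only on how the fused scores rank the node pairs, the threshold affects precision, recall and F1 but not the AUC column.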
Algorithm Precision Recall F1 AUC
NB 0.74 0.76 0.75 0.929
LR 0.75 0.75 0.75 0.928
XGBoost 0.69 0.82 0.75 0.927
LightGBM 0.75 0.75 0.75 0.925
MLP 0.75 0.73 0.74 0.914
RF 0.66 0.77 0.71 0.903
Bagging 0.66 0.73 0.69 0.893
BVC 0.67 0.75 0.71 0.903
Voting 0.78 0.74 0.76 0.924
Impact of Machine Learning Algorithms on Link Prediction (Network Embedding Fusion)
Algorithm Precision Recall F1 AUC
RF 0.70 0.77 0.73 0.917
LR 0.79 0.72 0.75 0.935
MLP 0.79 0.69 0.74 0.918
XGBoost 0.71 0.84 0.77 0.937
LightGBM 0.78 0.74 0.76 0.936
NB 0.78 0.76 0.77 0.920
Bagging 0.72 0.75 0.73 0.920
Voting 0.78 0.75 0.76 0.933
BVC 0.72 0.78 0.75 0.924
Impact of Machine Learning Algorithms on Link Prediction (Classification Result Fusion)
Method Precision Recall F1 Accuracy AUC
CKNRL 0.74 0.77 0.75 0.874 0.929
DeepWalk 0.72 0.78 0.75 0.868 0.925
Node2Vec 0.73 0.77 0.75 0.872 0.926
SDNE 0.65 0.77 0.71 0.840 0.899
Comparison of Experimental Results for Representation Learning Methods