Please wait a minute...
Data Analysis and Knowledge Discovery  2021, Vol. 5 Issue (11): 102-113    DOI: 10.11925/infotech.2096-3467.2021.0323
Current Issue | Archive | Adv Search |
Network Structure’s Impacts on Link Prediction Algorithm from Meta-Analysis Perspective
Wu Shengnan,Pu Hongjun,Tian Ruonan,Liang Wenqi,Yu Qi()
School of Management, Shanxi Medical University, Taiyuan 030001, China
Download: PDF (1215 KB)   HTML ( 9
Export: BibTeX | EndNote (RIS)      
Abstract  

[Objective] This paper tries to identify the main influencing parameters of the link prediction algorithms with the help of network structures and data from multiple studies. [Methods] We retrieved empirical research on link prediction from China and abroad, which include 5 papers, 22 networks, 26 algorithms and 278 studies. We used three-level meta-analysis and Bayesian network meta-analysis to explore the network structures and their impacts on algorithms’ performance. [Results] The algorithms included in our study generally had a good predictive effect MD=1.183 2 (95%CI: (1.000 5, 1.365 9)). The network density, average degree and clustering coefficient are the main factors affecting the prediction results (Pval<0.05). Katz, LHN-II, MFI, LRW, and SRW algorithms yielded better results with sparse networks and their SUCRA values were greater than 0.5. [Limitations] Our research does not include empirical analysis with large-scale data. [Conclusions] With the help of meta-analysis, our study explores the development directions for the link prediction algorithms.

Key wordsLink Prediction Algorithm      Prediction Accuracy      Network Structure      Three-Level Meta-Analysis      Bayesian Network Meta-Analysis     
Received: 01 April 2021      Published: 23 December 2021
ZTFLH:  G353  
Fund:National Natural Science Foundation of China(71804102);National Natural Science Foundation of China(71573162);Philosophy and Social Science Research Project of Colleges and Universities in Shanxi Province(2019W040)
Corresponding Authors: Yu Qi,ORCID:0000-0001-9154-0229     E-mail: yuqi@sxmu.edu.cn

Cite this article:

Wu Shengnan, Pu Hongjun, Tian Ruonan, Liang Wenqi, Yu Qi. Network Structure’s Impacts on Link Prediction Algorithm from Meta-Analysis Perspective. Data Analysis and Knowledge Discovery, 2021, 5(11): 102-113.

URL:

https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.2096-3467.2021.0323     OR     https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2021/V5/I11/102

序号 网络 节点数 边数 网络直径 网络密度 平均度 聚集系数 平均(最短)距离
1 Karate 34 78 5 0.139 0 4.588 0.588 2.408
2 Train Bombing 64 243 6 0.121 0 8.812 0.711 2.691
3 Net Science 379 914 17 0.013 0 4.823 0.798 6.042
4 US Airports 332 2 126 6 0.039 0 12.807 0.749 2.738
5 Karate(unweighted) 34 77 5 0.137 0 4.529 0.574 2.426
6 Dolphins 62 159 6 0.084 0 5.129 0.290 3.111
7 Football 115 613 4 0.094 0 10.661 0.403 2.508
8 C.elegan 297 2 148 5 0.049 0 14.465 0.308 2.455
9 FWFW 128 2 075 3 0.255 3 32.422 0.335 1.776
10 Polblogs 1 490 19 025 8 0.015 0 22.436 0.360 2.738
11 Jazz 198 2 742 6 0.140 6 27.700 0.618 2.235
12 中国文学 91 163 8 0.039 8 3.580 0.570 4.160
13 历史学 24 38 2 0.137 7 3.170 0.910 1.860
14 法学 56 69 7 0.044 0 2.460 0.430 3.850
15 教育学 2 556 3 647 44 0.001 1 2.850 0.570 16.180
16 图书馆情报文献学 9 471 19 104 22 0.000 4 4.030 0.680 7.340
17 经济学 14 900 23 746 26 0.000 2 3.180 0.590 9.330
18 管理学 10 878 18 360 27 0.000 3 3.380 0.690 9.990
19 Citeseer 3 312 4 732 28 0.001 0 2.857 0.257 9.036
20 DBLP 3 119 39 516 14 0.008 0 25.339 0.259 4.199
21 Cora 2 708 5 429 19 0.001 0 4.010 0.293 6.310
22 Wiki 2 405 17 981 9 0.006 0 14.953 0.480 3.650
Network Datasets Included in the Study
主效应 k 主效应及CI 异质性检验
MD 95%CI 标准误差 I2/% Q Pval
总效应 278 1.183 2 (1.000 5,1.365 9) 0.092 8 0.333/98.77% 12 812.024 7 <0.000 1
水平2 28 1.213 4 (1.033 1,1.393 7) 0.091 6 0.108/32.14% —— <0.000 1
水平3 278 1.231 5 (1.168 9,1.294 2) 0.031 8 0.225/66.64% —— <0.000 1
Main Effects and Heterogeneity
截距 95%CI t Pval
-5.697 0 (-7.461 0,3.933 0) -6.212 0 0.000 0
Egger’s Test Results
Funnel Plot after Duval&Tweedie’s Trim-and-Fill Procedure
调节变量 回归系数 标准误差 t Pval CI
截距 0.593 0 0.323 2 1.834 9 0.067 6 (-0.043 3,1.229 2)
节点 0.000 0 0.000 0 -0.120 2 0.904 4 (-0.000 1,0.000 1)
0.000 0 0.000 0 -1.275 5 0.203 2 (0.000 0,0.000 0)
网络直径 -0.010 0 0.049 4 -0.203 3 0.839 1 (-0.107 2,0.087 1)
网络密度 -7.271 8 1.860 6 -3.908 4 0.000 1 (-10.934 8,-3.608 7)
平均度 0.034 9 0.013 8 2.522 5 0.012 2 (0.007 7,0.062 2)
聚集系数 1.297 6 0.437 0 2.969 7 0.003 2 (0.437 3,2.157 9)
平均(最短)距离 0.057 6 0.149 4 0.385 7 0.700 0 (-0.236 5,0.351 7)
Effect of Sample Size and Structural Characteristics of Network Heterogeneity
调节变量 分组
标准
研究
数量
合并MD值(95%CI) 标准误差 t Q Pval 水平2方差I2/% 水平3方差I2/%
网络密度 >0.06 54 0.777 4 (0.461 9,1.093 0) 0.157 3 4.941 3 1 517.564 1 <0.000 1 0.139/38.96 0.209/58.15
≤0.06 224 1.393 7 (1.237 5,1.549 9) 0.079 3 17.580 7 10 236.787 6 <0.000 1 0.101/49.45 0.100/48.81
平均度 >10.02 98 1.136 1 (0.895 6,1.376 6) 0.121 2 9.374 8 2 736.970 8 <0.000 1 0.076/33.95 0.146/64.70
≤10.02 180 1.212 6 (0.949 7,1.475 5) 0.133 2 9.103 2 8 971.691 6 <0.000 1 0.126/30.59 0.280/68.19
聚集系数 >0.49 102 1.306 0 (0.966 7,1.645 2) 0.171 0 7.635 4 6 769.819 0 <0.000 1 0.127/26.21 0.355/73.10
≤0.49 176 1.082 5 (0.911 1,1.253 9) 0.086 9 12.463 7 3 783.386 9 <0.000 1 0.094/47.87 0.098/49.76
Hierarchical Three-Level Meta-Analysis Results
The Impact of Network Density for Prediction Accuracy of Each Algorithm
The Impact of Average Degree for Prediction Accuracy of Each Algorithm
The Impact of Clustering Coefficient for Prediction Accuracy of Each Algorithm
[1] Watts D J, Strogatz S H. Collective Dynamics of ‘Small-World’ Networks[J]. Nature, 1998, 393(6684):440-442.
doi: 10.1038/30918
[2] Chen S Y, Huang W, Cattani C, et al. Traffic Dynamics on Complex Networks: A Survey[J]. Mathematical Problems in Engineering, 2012: Article No.732698.
[3] 张彦超, 刘云, 张海峰, 等. 基于在线社交网络的信息传播模型[J]. 物理学报, 2011, 60(5):66-72.
[3] (Zhang Yanchao, Liu Yun, Zhang Haifeng, et al. The Research of Information Dissemination Model on Online Social Network[J]. Acta Physica Sinica, 2011, 60(5):66-72.)
[4] 吕琳媛. 复杂网络链路预测[J]. 电子科技大学学报, 2010, 39(5):651-661.
[4] (Lv Linyuan. Link Prediction on Complex Networks[J]. Journal of University of Electronic Science and Technology of China, 2010, 39(5):651-661.)
[5] Mamitsuka H. Mining from Protein-Protein Interactions[J]. Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, 2012, 2(5):400-410.
doi: 10.1002/widm.1065
[6] 张斌, 马费成. 科学知识网络中的链路预测研究述评[J]. 中国图书馆学报, 2015, 41(3):99-113.
[6] (Zhang Bin, Ma Feicheng. A Review on Link Prediction of Scientific Knowledge Network[J]. Journal of Library Science in China, 2015, 41(3):99-113.)
[7] Kossinets G. Effects of Missing Data in Social Networks[J]. Social Networks, 2006, 28(3):247-268.
doi: 10.1016/j.socnet.2005.07.002
[8] Ren X L, Lv L, Liu R R, et al. Avoiding Congestion in Recommender Systems[J]. New Journal of Physics, 2014, 16(6):063057.
doi: 10.1088/1367-2630/16/6/063057
[9] Ahmed H S, Faouzi B M, Caelen J. Detection and Classification of the Behavior of People in an Intelligent Building by Camera[J]. International Journal on Smart Sensing and Intelligent Systems, 2013, 6(4):1317-1342.
doi: 10.21307/ijssis-2017-592
[10] Kaya B, Poyraz M. Unsupervised Link Prediction in Evolving Abnormal Medical Parameter Networks[J]. International Journal of Machine Learning and Cybernetics, 2016, 7(1):145-155.
doi: 10.1007/s13042-015-0405-y
[11] 马代川, 罗代兵. 基于网络链路预测的药物分子重定位研究[J]. 化学研究与应用, 2020, 32(5):752-757.
[11] (Ma Daichuan, Luo Daibing. Link Prediction Algorithm for Drug Repositioning Based on Structural Feature[J]. Chemical Research and Application, 2020, 32(5):752-757.)
[12] 吕琳媛, 任晓龙, 周涛. 网络链路预测: 概念与前沿[J]. 中国计算机学会通讯, 2016, 12(4):12-19.
[12] (Lv Linyuan, Ren Xiaolong, Zhou Tao. Network Link Prediction: Concepts and Frontiers[J]. Communications of CCF, 2016, 12(4):12-19.)
[13] Zhou T, Lv L, Zhang Y C. Predicting Missing Links via Local Information[J]. The European Physical Journal B, 2009, 71(4):623-630.
doi: 10.1140/epjb/e2009-00335-8
[14] Ahn M W, Jung W S. Accuracy Test for Link Prediction in Terms of Similarity Index: The Case of WS and BA Models[J]. Physica A: Statistical Mechanics and Its Applications, 2015, 429:177-183.
doi: 10.1016/j.physa.2015.01.083
[15] Gupta A K, Sardana N. Impact of Topological Properties over Link Prediction Based on Node Neighbourhood: A Study[C]// Proceedings of the 7th International Conference on Contemporary Computing (IC3). IEEE, 2014: 194-198.
[16] Feng X, Zhao J C, Xu K. Link Prediction in Complex Networks: A Clustering Perspective[J]. The European Physical Journal B, 2012, 85:3.
doi: 10.1140/epjb/e2011-20207-x
[17] 贾珺, 胡晓峰, 贺筱媛. 网络结构特征与链路预测算法关系研究[J]. 复杂系统与复杂性科学, 2017, 14(1):28-37.
[17] (Jia Jun, Hu Xiaofeng, He Xiaoyuan. On the Relationship Between Network Structure Features and Link Prediction Algorithms[J]. Complex Systems and Complexity Science, 2017, 14(1):28-37.)
[18] Liu M M, Guo J F, Luo X. Link Prediction Based on the Similarity of Transmission Nodes of Multiple Paths in Weighted Social Networks[J]. Journal of Information Hiding and Multimedia Signal Processing, 2016, 7(4):771-780.
[19] 陈佳璐, 钱宇华, 张晓琴, 等. 依据节点贡献的链路预测方法[J]. 小型微型计算机系统, 2016, 37(1):92-95.
[19] (Chen Jialu, Qian Yuhua, Zhang Xiaoqin, et al. Link Prediction Method According to Node Contribution[J]. Journal of Chinese Computer Systems, 2016, 37(1):92-95.)
[20] 张斌, 李亚婷, 戴怡清. 学科合作网络的链路挖掘与应用分析[J]. 情报理论与实践, 2018, 41(9):108-113.
[20] (Zhang Bin, Li Yating, Dai Yiqing. Link Mining and Application Analysis of Disciplinary Collaboration Network[J]. Information Studies: Theory & Application, 2018, 41(9):108-113.)
[21] Yang Y L, Ye Z L, Zhao H X, et al. Link Prediction Based on Gravitational Field of Complex Network[C]// Proceedings of 2019 IEEE International Conference on Dependable, Autonomic and Secure Computing, International Conference on Pervasive Intelligence and Computing, International Conference on Cloud and Big Data Computing, International Conference on Cyber Science and Technology Congress. IEEE, 2019: 499-504.
[22] 杨燕琳, 冶忠林, 赵海兴, 等. 基于高阶近似的链路预测算法[J]. 计算机应用, 2019, 39(8):2366-2373.
[22] (Yang Yanlin, Ye Zhonglin, Zhao Haixing, et al. Link Prediction Algorithm Based on High-Order Proximity Approximation[J]. Journal of Computer Applications, 2019, 39(8):2366-2373.)
[23] 王慧, 乐孜纯, 龚轩, 等. 基于特征分类的链路预测方法综述[J]. 计算机科学, 2020, 47(8):302-312.
[23] (Wang Hui, Le Zichun, Gong Xuan, et al. Review of Link Prediction Methods Based on Feature Classification[J]. Computer Science, 2020, 47(8):302-312.)
[24] Newman M E. Clustering and Preferential Attachment in Growing Networks[J]. Physical Review E, Statistical, Nonlinear, and Soft Matter Physics, 2001, 64(2):025102.
doi: 10.1103/PhysRevE.64.025102
[25] Adamic L A, Adar E. Friends and Neighbors on the Web[J]. Social Networks, 2003, 25(3):211-230.
doi: 10.1016/S0378-8733(03)00009-1
[26] Barabási A L, Albert R. Emergence of Scaling in Random Networks[J]. Science, 1999, 286(5439):509-512.
doi: 10.1126/science.286.5439.509
[27] Katz L. A New Status Index Derived from Sociometric Analysis[J]. Psychometrika, 1953, 18(1):39-43.
doi: 10.1007/BF02289026
[28] Leicht E A, Holme P, Newman M E J. Vertex Similarity in Networks[J]. Physical Review E, 2006, 73(2):026120.
doi: 10.1103/PhysRevE.73.026120
[29] Lv L, Jin C H, Zhou T. Similarity Index Based on Local Paths for Link Prediction of Complex Networks[J]. Physical Review E, Statistical, Nonlinear, and Soft Matter Physics, 2009, 80(4):046122.
doi: 10.1103/PhysRevE.80.046122
[30] Liu W P, Lv L. Link Prediction Based on Local Random Walk[J]. EPL (EuroPhysics Letters), 2010, 89(5):58007.
doi: 10.1209/0295-5075/89/58007
[31] Symeonidis P, Tiakas E, Manolopoulos Y. Transitive Node Similarity for Link Prediction in Social Networks with Positive and Negative Links[C]// Proceedings of the 4th ACM Conference on Recommender Systems. 2010.
[32] Fredman M L, Tarjan R E. Fibonacci Heaps and Their Uses in Improved Network Optimization Algorithms[C]// Proceedings of the 25th Annual Symposium on Foundations of Computer Science. 1984. DOI: 10.1109/SFCS.1984.715934.
doi: 10.1109/SFCS.1984.715934
[33] Chebotarev P, Shamis E. The Matrix-Forest Theorem and Measuring Relations in Small Social Groups[J]. Automation & Remote Control, 2006, 58(9):1505-1514.
[34] 王润芳, 陈增强, 刘忠信. 融合朴素贝叶斯方法的复杂网络链路预测[J]. 智能系统学报, 2019, 14(1):99-107.
[34] (Wang Runfang, Chen Zengqiang, Liu Zhongxin. Link Prediction in Complex Networks with Syncretic Naive Bayes Methods[J]. CAAI Transactions on Intelligent Systems, 2019, 14(1):99-107.)
[35] Gupta A K, Sardana N. Significance of Clustering Coefficient over Jaccard Index[C]// Proceedings of the 8th International Conference on Contemporary Computing. 2015: 463-466.
[36] 张斌, 李亚婷, 戴怡清. 聚集系数对合著网络链路预测效果的影响研究[J]. 情报理论与实践, 2018, 41(1):100-104.
[36] (Zhang Bin, Li Yating, Dai Yiqing. Research on the Influence of Clustering Coefficient on the Link Prediction in Collaboration Networks[J]. Information Studies: Theory & Application, 2018, 41(1):100-104.)
[37] Yan E J, Guns R. Predicting and Recommending Collaborations: An Author-, Institution-, and Country-Level Analysis[J]. Journal of Informetrics, 2014, 8(2):295-309.
doi: 10.1016/j.joi.2014.01.008
[38] Assink M, Wibbelink C J M. Fitting Three-Level Meta-Analytic Models in R: A Step-by-Step Tutorial[J]. The Quantitative Methods for Psychology, 2016, 12(3):154-174.
doi: 10.20982/tqmp.12.3.p154
[39] Glass G V. Primary, Secondary, and Meta-Analysis of Research[J]. Educational Research, 1976, 5(10):3-8.
[40] 刘志辉, 李文绚. 元分析方法研究综述[J]. 情报工程, 2016, 2(6):31-38.
[40] (Liu Zhihui, Li Wenxuan. Literature Review on Meta-Analysis[J]. Technology Intelligence Engineering, 2016, 2(6):31-38.)
[41] 龙雄伟. 复杂网络视角下我国产业关联结构研究[J]. 商业经济研究, 2017(20):180-182.
[41] (Long Xiongwei. Study on the Industrial Correlation Structure in China from the Aspects of Complex Network[J]. Journal of Commercial Economics, 2017 (20):180-182.)
[42] 沈旭慧, 王珍. 非比较研究及前后对照研究结局变量Meta分析的效应量选择[J]. 中国组织工程研究与临床康复, 2011, 15(4):733-736.
[42] (Shen Xuhui, Wang Zhen. Effect Size Selection for Meta Analysis of Outcome Variables in Noncomparative Studies and Pre-post Contrasts Studies[J]. Journal of Clinical Rehabilitative Tissue Engineering Research, 2011, 15(4):733-736.)
[43] Borenstein M, Hedges L V, Higgins J P T, et al. Introduction to Meta-Analysis[M]. Chichester, UK: John Wiley & Sons, Ltd, 2009.
[44] van Valkenhoef G, Lu G B, de Brock B, et al. Automating Network Meta-Analysis[J]. Research Synjournal Methods, 2012, 3(4):285-299.
[45] 沃特·德·诺伊, 安德烈·姆尔瓦, 弗拉迪米尔·巴塔盖尔吉, 等. 蜘蛛: 社会网络分析技术[M]. 林枫译. 北京: 世界图书出版公司, 2012.
[45] (Wouter D N, Andrej M, Vladimir B, et al. Exploratory Social Network Analysis with Pajek[M]. Translated by Lin Feng. Beijing: World Publishing Corporation, 2012.)
[46] 李欣, 温阳, 黄鲁成, 等. 多层网络分析视域下的新兴技术研发合作网络演化特征研究[J]. 情报杂志, 2021, 40(1):62-70.
[46] (Li Xin, Wen Yang, Huang Lucheng, et al. Research on Evolution Characteristics of Emerging Technology R&D Cooperation Network from the Perspective of Multi-level Network Analysis[J]. Journal of Intelligence, 2021, 40(1):62-70.)
[47] 刘建香. 复杂网络及其在国内研究进展的综述[J]. 系统科学学报, 2009, 17(4):31-37.
[47] (Liu Jianxiang. Complex Network and Review of Domestic Research[J]. Chinese Journal of Systems Science, 2009, 17(4):31-37.)
[1] Wang Song, Yang Yang, Liu Xinmin. Discovering Potentialities of User Ideas from Open Innovation Communities with Graph Attention Network[J]. 数据分析与知识发现, 2021, 5(11): 89-101.
[2] Qingtian Zeng,Xiaohui Hu,Chao Li. Extracting Keywords with Topic Embedding and Network Structure Analysis[J]. 数据分析与知识发现, 2019, 3(7): 52-60.
[3] Huang Wei,Yu Hui,Li Yuefeng. Review of Online Anti-terrorism Research in China[J]. 现代图书情报技术, 2016, 32(11): 1-10.
[4] Lixin Xia,Ying Tan. Analysis and Visualization of the LOD Network Structure[J]. 现代图书情报技术, 2016, 32(1): 65-72.
[5] Chen Guo, Hu Changping. Research on the Structural Features of Keyword Network of Scientific Research Areas:An Empirical Study of LIS[J]. 现代图书情报技术, 2014, 30(7): 84-91.
  Copyright © 2016 Data Analysis and Knowledge Discovery   Tel/Fax:(010)82626611-6626,82624938   E-mail:jishu@mail.las.ac.cn