Data Analysis and Knowledge Discovery  2024, Vol. 8 Issue (2): 114-130    DOI: 10.11925/infotech.2096-3467.2022.1311
Influence of Network Structure Changes on Co-word Network Link Prediction
Chen Zhuo¹, Jiang Xixi¹, Zhang Xiaojuan²
¹ School of Computer and Information Science, Southwest University, Chongqing 400715, China
² School of Public Administration, Sichuan University, Chengdu 610065, China
Abstract

[Objective] This article studies how changes in co-word network structure affect link prediction based on similarity metrics. [Methods] First, we randomly retrieved ISLS, LAW, BSS, COM, and Ocean literature from the Web of Science Core Collection (2015 to 2020). Second, using different keyword frequency thresholds, we constructed co-word networks with different topological features, such as the number of nodes and edges, average clustering coefficient, density, network transitivity, and average degree. Finally, we chose 15 traditional similarity metrics for link prediction (e.g., AA, CN, RWR, and Katz) and ran link prediction experiments on these co-word networks. [Results] We compared how the prediction performance of the similarity metrics changes with network structure. (1) Across disciplines, in most cases, the higher the overall keyword frequency in the co-word network, the smaller the average clustering coefficient and the larger the density, network transitivity, average degree, average degree centrality, average betweenness centrality, and average closeness centrality, and the more likely link prediction is to perform poorly. Conversely, the larger the average clustering coefficient and the smaller the other topological measures, the better the link prediction performance. (2) Among the 15 selected similarity metrics, RWR performed best across co-word networks with different topological characteristics. Katz showed the most stable prediction performance across co-word networks. The prediction results of the metrics in the LAW discipline were the most affected by changes in keyword frequency. [Limitations] Owing to limited computing resources, we used only one classification method and one evaluation index. In addition, we did not explore some node similarity metrics, namely those based on likelihood analysis and on probabilistic models.
[Conclusions] This study provides a theoretical foundation for selecting similarity metrics for co-word networks in different disciplines.
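As a minimal sketch of the network-construction step described in the abstract, the Python snippet below builds a co-word network from per-paper keyword lists, keeps only keywords whose frequency exceeds a threshold, and computes several of the topological features named there. The toy `papers` list, the function names, and the choice to count frequency as the number of papers containing a keyword are illustrative assumptions, not the authors' implementation.

```python
from collections import Counter
from itertools import combinations


def build_coword_network(papers, min_freq):
    """Adjacency dict of keywords whose paper frequency exceeds min_freq.

    Assumption: two keywords are linked if they co-occur in at least one paper.
    """
    freq = Counter(kw for kws in papers for kw in set(kws))
    kept = {kw for kw, n in freq.items() if n > min_freq}
    adj = {kw: set() for kw in kept}
    for kws in papers:
        for a, b in combinations(sorted(set(kws) & kept), 2):
            adj[a].add(b)
            adj[b].add(a)
    return adj


def topology(adj):
    """Basic topological features of a simple undirected graph."""
    n = len(adj)
    m = sum(len(nb) for nb in adj.values()) // 2
    local, closed, triplets = [], 0, 0
    for nb in adj.values():
        k = len(nb)
        # Number of linked pairs among this node's neighbours
        links = sum(1 for a, b in combinations(nb, 2) if b in adj[a])
        possible = k * (k - 1) // 2
        local.append(links / possible if possible else 0.0)
        closed += links        # each triangle is counted once per corner
        triplets += possible   # connected triples centred on this node
    return {
        "nodes": n,
        "edges": m,
        "density": 2 * m / (n * (n - 1)) if n > 1 else 0.0,
        "average_degree": 2 * m / n if n else 0.0,
        "average_clustering": sum(local) / n if n else 0.0,
        "transitivity": closed / triplets if triplets else 0.0,
    }


# Hypothetical toy corpus: each inner list holds one paper's keywords
papers = [["a", "b", "c"], ["a", "b"], ["a", "c", "d"], ["b", "c"], ["d", "e"]]
features = topology(build_coword_network(papers, min_freq=1))
print(features)
```

Raising `min_freq` removes low-frequency keywords, which is how one corpus can yield a family of networks with different node counts, densities, and clustering coefficients.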

Keywords: Co-word Network; Link Prediction; Network Structure; Similarity Metric
Received: 09 December 2022      Published: 08 May 2023
ZTFLH: TP393; G250
Fund: National Social Science Fund of China (21BTQ072)
Corresponding Author: Zhang Xiaojuan, ORCID: 0000-0002-5889-5922, E-mail: zxj0614@scu.edu.cn

Cite this article:

Chen Zhuo, Jiang Xixi, Zhang Xiaojuan. Influence of Network Structure Changes on Co-word Network Link Prediction. Data Analysis and Knowledge Discovery, 2024, 8(2): 114-130.

URL:

https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.2096-3467.2022.1311     OR     https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2024/V8/I2/114

The Number of Publications in Each Discipline (2015-2020)
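| Dataset | ISLS | COM | LAW | Ocean | BSS |
|---|---|---|---|---|---|
| Total keywords | 1,960,971 | 1,882,008 | 2,418,845 | 3,008,470 | 660,138 |
| Keywords in training stage | 40,816 | 72,019 | 75,446 | 58,271 | 22,019 |
| Keywords in test stage | 35,042 | 34,495 | 158,669 | 43,193 | 15,946 |
| Keywords shared by training and test stages | 10,596 | 68,990 | 9,916 | 14,567 | 12,566 |

The Total Number of Keywords in Each Discipline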
The Number of Keywords Obtained by Different Subject Networks According to Different Keyword Frequencies
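| Discipline | Keyword frequency | Total samples | Positive samples | Negative samples |
|---|---|---|---|---|
| ISLS | >4 | 8,965,657 | 47,482 | 8,918,175 |
| ISLS | >6 | 4,373,525 | 38,531 | 4,334,994 |
| ISLS | >8 | 2,508,636 | 31,944 | 2,476,692 |
| ISLS | >10 | 1,597,617 | 27,338 | 1,570,279 |
| ISLS | >12 | 1,102,056 | 23,799 | 1,078,257 |
| ISLS | >14 | 807,506 | 20,926 | 786,580 |
| COM | >4 | 13,193,290 | 78,361 | 13,114,929 |
| COM | >6 | 7,480,671 | 57,537 | 7,423,134 |
| COM | >8 | 4,307,303 | 45,189 | 4,262,194 |
| COM | >10 | 2,817,955 | 39,467 | 2,778,488 |
| COM | >12 | 1,960,670 | 34,751 | 1,925,919 |
| COM | >14 | 1,430,528 | 30,298 | 1,400,230 |
| LAW | >4 | 8,803,410 | 34,882 | 8,768,528 |
| LAW | >6 | 4,258,082 | 27,969 | 4,230,113 |
| LAW | >8 | 2,477,283 | 23,242 | 2,454,041 |
| LAW | >10 | 1,544,960 | 19,445 | 1,525,515 |
| LAW | >12 | 1,048,298 | 16,481 | 1,031,817 |
| LAW | >14 | 720,238 | 14,138 | 706,100 |
| Ocean | >4 | 19,285,428 | 39,419 | 19,246,009 |
| Ocean | >6 | 9,301,975 | 30,683 | 9,271,292 |
| Ocean | >8 | 5,179,696 | 24,684 | 5,155,012 |
| Ocean | >10 | 3,244,393 | 20,637 | 3,223,756 |
| Ocean | >12 | 2,118,050 | 17,328 | 2,100,722 |
| Ocean | >14 | 1,481,369 | 14,954 | 1,466,415 |
| BSS | >4 | 4,483,656 | 29,825 | 4,453,831 |
| BSS | >6 | 2,354,809 | 24,679 | 2,330,130 |
| BSS | >8 | 952,727 | 19,482 | 933,245 |
| BSS | >10 | 952,727 | 18,269 | 934,458 |
| BSS | >12 | 649,326 | 15,804 | 633,522 |
| BSS | >14 | 399,602 | 12,036 | 399,482 |

Statistical Information of Samples in Each Discipline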
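| Keyword frequency | Nodes | Edges | Density | Average clustering coefficient | Network transitivity | Average degree | Average degree centrality | Average betweenness centrality | Average closeness centrality |
|---|---|---|---|---|---|---|---|---|---|
| >4 | 5,307 | 81,988 | 0.0058 | 0.3636 | 0.1137 | 30.8980 | 0.0058 | 0.0003 | 0.3546 |
| >6 | 3,674 | 67,426 | 0.0100 | 0.3216 | 0.1276 | 36.7044 | 0.0100 | 0.0004 | 0.3783 |
| >8 | 2,769 | 56,972 | 0.0149 | 0.3025 | 0.1411 | 41.1499 | 0.0149 | 0.0005 | 0.3969 |
| >10 | 2,217 | 49,518 | 0.0202 | 0.2926 | 0.1516 | 44.6712 | 0.0202 | 0.0006 | 0.4142 |
| >12 | 1,823 | 32,231 | 0.0194 | 0.2555 | 0.1437 | 35.3604 | 0.0194 | 0.0008 | 0.3832 |
| >14 | 1,550 | 28,376 | 0.0236 | 0.2634 | 0.1538 | 36.6142 | 0.0236 | 0.0008 | 0.3884 |

The Characteristics of Topological Structures of Co-word Networks Constructed in ISLS Discipline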
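Two columns of the ISLS table follow directly from the node and edge counts. The sketch below assumes the networks are simple undirected graphs, so that average degree is 2E/N and density is 2E/(N(N-1)); the function names are illustrative. It reproduces the row for keyword frequency >6 (3,674 nodes and 67,426 edges).

```python
def average_degree(nodes, edges):
    """Average degree of a simple undirected graph: 2E / N."""
    return 2 * edges / nodes


def density(nodes, edges):
    """Density of a simple undirected graph: 2E / (N * (N - 1))."""
    return 2 * edges / (nodes * (nodes - 1))


# ISLS network, keyword frequency >6
n, m = 3674, 67426
print(f"{average_degree(n, m):.4f}")  # 36.7044
print(f"{density(n, m):.4f}")         # 0.0100
```

Under the same assumption, the average normalized degree centrality is the average degree divided by N-1, which is exactly the density. That would explain why the density and average degree centrality columns of the ISLS table are identical.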
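| Keyword frequency | Nodes | Edges | Density | Average clustering coefficient | Network transitivity | Average degree | Average degree centrality | Average betweenness centrality | Average closeness centrality |
|---|---|---|---|---|---|---|---|---|---|
| >4 | 10,304 | 313,072 | 0.0059 | 0.3537 | 0.0971 | 60.7671 | 0.0059 | 0.0017 | 0.3687 |
| >6 | 3,886 | 75,599 | 0.0100 | 0.3038 | 0.1084 | 38.9084 | 0.0100 | 0.0003 | 0.3641 |
| >8 | 2,957 | 66,019 | 0.0151 | 0.2918 | 0.1197 | 44.6527 | 0.0151 | 0.0004 | 0.4012 |
| >10 | 2,398 | 58,445 | 0.0203 | 0.2858 | 0.1313 | 48.7448 | 0.0203 | 0.0005 | 0.4243 |
| >12 | 2,006 | 52,350 | 0.0260 | 0.2851 | 0.1418 | 52.1934 | 0.0260 | 0.0006 | 0.4416 |
| >14 | 1,719 | 47,812 | 0.0324 | 0.2912 | 0.1514 | 55.6277 | 0.0324 | 0.0007 | 0.4671 |

The Characteristics of Topological Structures of Co-word Networks Constructed in COM Discipline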
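| Keyword frequency | Nodes | Edges | Density | Average clustering coefficient | Network transitivity | Average degree | Average degree centrality | Average betweenness centrality | Average closeness centrality |
|---|---|---|---|---|---|---|---|---|---|
| >4 | 4,207 | 48,117 | 0.0054 | 0.3420 | 0.0953 | 22.8747 | 0.0054 | 0.0005 | 0.3383 |
| >6 | 2,931 | 38,763 | 0.0090 | 0.2929 | 0.1037 | 26.4504 | 0.0090 | 0.0006 | 0.3628 |
| >8 | 2,240 | 32,636 | 0.0130 | 0.2712 | 0.1114 | 29.1393 | 0.0130 | 0.0007 | 0.3826 |
| >10 | 1,773 | 27,690 | 0.0176 | 0.2564 | 0.1191 | 31.2352 | 0.0176 | 0.0009 | 0.3982 |
| >12 | 1,464 | 24,082 | 0.0225 | 0.2477 | 0.1269 | 32.8989 | 0.0225 | 0.0010 | 0.4116 |
| >14 | 1,217 | 20,915 | 0.0283 | 0.2444 | 0.1355 | 34.3714 | 0.0283 | 0.0011 | 0.4244 |

The Characteristics of Topological Structures of Co-word Networks Constructed in LAW Discipline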
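| Keyword frequency | Nodes | Edges | Density | Average clustering coefficient | Network transitivity | Average degree | Average degree centrality | Average betweenness centrality | Average closeness centrality |
|---|---|---|---|---|---|---|---|---|---|
| >4 | 6,223 | 80,547 | 0.0042 | 0.2933 | 0.0928 | 25.8869 | 0.0041 | 0.0003 | 0.3274 |
| >6 | 4,328 | 65,980 | 0.0070 | 0.2519 | 0.1017 | 30.4898 | 0.0070 | 0.0004 | 0.3527 |
| >8 | 3,235 | 54,533 | 0.0104 | 0.2316 | 0.1119 | 33.7144 | 0.0104 | 0.0005 | 0.3698 |
| >10 | 2,565 | 46,501 | 0.0141 | 0.2249 | 0.1217 | 36.2581 | 0.0141 | 0.0006 | 0.3840 |
| >12 | 2,077 | 39,952 | 0.0185 | 0.2231 | 0.1322 | 38.4709 | 0.0185 | 0.0007 | 0.3973 |
| >14 | 1,741 | 35,041 | 0.0231 | 0.2202 | 0.1419 | 40.2539 | 0.0231 | 0.0008 | 0.4086 |

The Characteristics of Topological Structures of Co-word Networks Constructed in Ocean Discipline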
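| Keyword frequency | Nodes | Edges | Density | Average clustering coefficient | Network transitivity | Average degree | Average degree centrality | Average betweenness centrality | Average closeness centrality |
|---|---|---|---|---|---|---|---|---|---|
| >4 | 3,013 | 56,934 | 0.0125 | 0.3764 | 0.1440 | 37.7922 | 0.0125 | 0.0005 | 0.3952 |
| >6 | 2,192 | 48,718 | 0.0203 | 0.3423 | 0.1629 | 44.4507 | 0.0203 | 0.0006 | 0.4216 |
| >8 | 1,710 | 42,546 | 0.0291 | 0.3315 | 0.1804 | 49.7614 | 0.0291 | 0.0008 | 0.4425 |
| >10 | 1,407 | 37,800 | 0.0382 | 0.3255 | 0.1953 | 53.7313 | 0.0382 | 0.0009 | 0.4592 |
| >12 | 1,168 | 33,370 | 0.0490 | 0.3289 | 0.2124 | 57.1404 | 0.0490 | 0.0010 | 0.4731 |
| >14 | 1,014 | 30,398 | 0.0592 | 0.3351 | 0.2265 | 59.9566 | 0.0592 | 0.0011 | 0.4836 |

The Characteristics of Topological Structures of Co-word Networks Constructed in BSS Discipline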
The Topological Structure Values Calculated for Each Network within Each Discipline
| Metric | Keyword frequency >4 | Keyword frequency >6 | Keyword frequency >8 | Keyword frequency >10 | Keyword frequency >12 | Keyword frequency >14 | Mean | Difference |
|---|---|---|---|---|---|---|---|---|
| AA | 0.8129 | 0.8153 | 0.8143 | 0.8085 | 0.7724 | 0.7670 | 0.7984 | 0.0483 |
| ACT | 0.7731 | 0.7621 | 0.7495 | 0.7371 | 0.7022 | 0.6921 | 0.7360 | 0.0810 |
| CN | 0.8109 | 0.8131 | 0.8121 | 0.8061 | 0.7694 | 0.7638 | 0.7959 | 0.0493 |
| Cos+ | 0.7816 | 0.7718 | 0.7656 | 0.7509 | 0.7228 | 0.7071 | 0.7500 | 0.0745 |
| HDI | 0.7681 | 0.7569 | 0.7486 | 0.7377 | 0.6996 | 0.6884 | 0.7332 | 0.0797 |
| HPI | 0.7916 | 0.7925 | 0.7926 | 0.7876 | 0.7470 | 0.7414 | 0.7755 | 0.0512 |
| Jaccard | 0.7730 | 0.7665 | 0.7625 | 0.7554 | 0.7128 | 0.7043 | 0.7458 | 0.0687 |
| Katz | 0.4875 | 0.4896 | 0.4799 | 0.4802 | 0.4922 | 0.4861 | 0.4859 | 0.0123 |
| LHN_I | 0.7364 | 0.7095 | 0.6903 | 0.6714 | 0.6447 | 0.6272 | 0.6799 | 0.1092 |
| LP | 0.7829 | 0.7646 | 0.7487 | 0.7337 | 0.7140 | 0.7010 | 0.7408 | 0.0819 |
| PA | 0.8198 | 0.8021 | 0.7861 | 0.7717 | 0.7721 | 0.7611 | 0.7855 | 0.0587 |
| RA | 0.8097 | 0.8109 | 0.8095 | 0.8038 | 0.7685 | 0.7630 | 0.7942 | 0.0479 |
| RWR | 0.8551 | 0.8415 | 0.8307 | 0.8200 | 0.8040 | 0.7957 | 0.8245 | 0.0594 |
| Salton | 0.7843 | 0.7840 | 0.7841 | 0.7793 | 0.7303 | 0.7234 | 0.7642 | 0.0609 |
| Sorensen | 0.7730 | 0.7665 | 0.7625 | 0.7554 | 0.7128 | 0.7043 | 0.7458 | 0.0687 |

Prediction Performance (AUC Value) of Various Metrics in Different Networks Constructed in ISLS
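The last two columns of the AUC table can be recomputed from the six per-threshold values. The sketch below assumes the mean is a plain arithmetic mean and the difference is the maximum minus the minimum, which agrees with the AA row of the ISLS table; the function name `summarize` is illustrative.

```python
def summarize(auc_values):
    """Return (mean, max - min) of AUC values across frequency thresholds."""
    mean = sum(auc_values) / len(auc_values)
    spread = max(auc_values) - min(auc_values)
    return mean, spread


# AA row of the ISLS table, keyword frequency >4 through >14
aa_isls = [0.8129, 0.8153, 0.8143, 0.8085, 0.7724, 0.7670]
mean, spread = summarize(aa_isls)
print(f"{mean:.4f} {spread:.4f}")  # 0.7984 0.0483
```

A small difference means a metric's performance changes little as the frequency threshold, and hence the network structure, changes.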
| Metric | Keyword frequency >4 | Keyword frequency >6 | Keyword frequency >8 | Keyword frequency >10 | Keyword frequency >12 | Keyword frequency >14 | Mean | Difference |
|---|---|---|---|---|---|---|---|---|
| AA | 0.7718 | 0.7786 | 0.7772 | 0.7718 | 0.7650 | 0.7749 | 0.7732 | 0.0136 |
| ACT | 0.7157 | 0.7224 | 0.7214 | 0.7159 | 0.7113 | 0.7110 | 0.7163 | 0.0114 |
| CN | 0.7667 | 0.7744 | 0.7722 | 0.7667 | 0.7600 | 0.7694 | 0.7682 | 0.0144 |
| Cos+ | 0.6934 | 0.7039 | 0.6949 | 0.6885 | 0.6841 | 0.6838 | 0.6914 | 0.0201 |
| HDI | 0.6614 | 0.6932 | 0.6735 | 0.6614 | 0.6500 | 0.6510 | 0.6651 | 0.0432 |
| HPI | 0.7552 | 0.7578 | 0.7566 | 0.7552 | 0.7495 | 0.7606 | 0.7558 | 0.0111 |
| Jaccard | 0.6880 | 0.7075 | 0.6952 | 0.6880 | 0.6805 | 0.6862 | 0.6909 | 0.0270 |
| Katz | 0.5131 | 0.5027 | 0.5097 | 0.5131 | 0.5147 | 0.5146 | 0.5113 | 0.0120 |
| LHN_I | 0.5649 | 0.6270 | 0.5867 | 0.5649 | 0.5450 | 0.5358 | 0.5707 | 0.0912 |
| LP | 0.6140 | 0.6245 | 0.6192 | 0.6140 | 0.6089 | 0.6127 | 0.6156 | 0.0156 |
| PA | 0.7672 | 0.7922 | 0.7787 | 0.7672 | 0.7574 | 0.7497 | 0.7687 | 0.0425 |
| RA | 0.7703 | 0.7764 | 0.7750 | 0.7703 | 0.7645 | 0.7750 | 0.7719 | 0.0119 |
| RWR | 0.7838 | 0.7942 | 0.7901 | 0.7838 | 0.7754 | 0.7883 | 0.7859 | 0.0188 |
| Salton | 0.7321 | 0.7383 | 0.7350 | 0.7321 | 0.7266 | 0.7360 | 0.7334 | 0.0117 |
| Sorensen | 0.6880 | 0.7075 | 0.6952 | 0.6880 | 0.6805 | 0.6862 | 0.6909 | 0.0270 |

Prediction Performance (AUC Value) of Various Metrics in Different Networks Constructed in COM
| Metric | Keyword frequency >4 | Keyword frequency >6 | Keyword frequency >8 | Keyword frequency >10 | Keyword frequency >12 | Keyword frequency >14 | Mean | Difference |
|---|---|---|---|---|---|---|---|---|
| AA | 0.7516 | 0.7584 | 0.7614 | 0.7603 | 0.7617 | 0.7587 | 0.7587 | 0.0101 |
| ACT | 0.5002 | 0.7554 | 0.7055 | 0.6992 | 0.7005 | 0.6969 | 0.6763 | 0.2552 |
| CN | 0.7491 | 0.7554 | 0.7580 | 0.7565 | 0.7578 | 0.7541 | 0.7552 | 0.0089 |
| Cos+ | 0.5002 | 0.7091 | 0.6997 | 0.6949 | 0.6940 | 0.6912 | 0.6649 | 0.2089 |
| HDI | 0.7192 | 0.7135 | 0.7058 | 0.6970 | 0.6923 | 0.6865 | 0.7024 | 0.0327 |
| HPI | 0.7337 | 0.7388 | 0.7408 | 0.7403 | 0.7413 | 0.7382 | 0.7389 | 0.0076 |
| Jaccard | 0.7214 | 0.7187 | 0.7145 | 0.7094 | 0.7079 | 0.7046 | 0.7128 | 0.0168 |
| Katz | 0.3958 | 0.6054 | 0.5984 | 0.5941 | 0.5940 | 0.5873 | 0.5625 | 0.2096 |
| LHN_I | 0.7027 | 0.6874 | 0.6701 | 0.6521 | 0.6384 | 0.6261 | 0.6628 | 0.0766 |
| LP | 0.7532 | 0.7394 | 0.7264 | 0.7143 | 0.7076 | 0.6980 | 0.7232 | 0.0552 |
| PA | 0.7814 | 0.7642 | 0.7510 | 0.7397 | 0.7365 | 0.7267 | 0.7499 | 0.0547 |
| RA | 0.7493 | 0.7548 | 0.7570 | 0.7554 | 0.7567 | 0.7540 | 0.7545 | 0.0077 |
| RWR | 0.8227 | 0.8108 | 0.8003 | 0.7906 | 0.7865 | 0.7785 | 0.7982 | 0.0442 |
| Salton | 0.7271 | 0.7253 | 0.7294 | 0.7275 | 0.7278 | 0.7253 | 0.7271 | 0.0041 |
| Sorensen | 0.7214 | 0.7187 | 0.7145 | 0.7093 | 0.7079 | 0.7046 | 0.7127 | 0.0168 |

Prediction Performance (AUC Value) of Various Metrics in Different Networks Constructed in LAW
| Metric | Keyword frequency >4 | Keyword frequency >6 | Keyword frequency >8 | Keyword frequency >10 | Keyword frequency >12 | Keyword frequency >14 | Mean | Difference |
|---|---|---|---|---|---|---|---|---|
| AA | 0.7625 | 0.7801 | 0.7875 | 0.7898 | 0.7904 | 0.7883 | 0.7831 | 0.0279 |
| ACT | 0.7178 | 0.7112 | 0.7082 | 0.7012 | 0.6984 | 0.6939 | 0.7051 | 0.0239 |
| CN | 0.7597 | 0.7759 | 0.7825 | 0.7841 | 0.7842 | 0.7819 | 0.7781 | 0.0245 |
| Cos+ | 0.7998 | 0.7993 | 0.7982 | 0.7943 | 0.7920 | 0.7861 | 0.7950 | 0.0137 |
| HDI | 0.7379 | 0.7437 | 0.7445 | 0.7427 | 0.7418 | 0.7391 | 0.7416 | 0.0066 |
| HPI | 0.7518 | 0.7683 | 0.7761 | 0.7792 | 0.7820 | 0.7816 | 0.7732 | 0.0302 |
| Jaccard | 0.7406 | 0.7498 | 0.7539 | 0.7549 | 0.7567 | 0.7558 | 0.7520 | 0.0161 |
| Katz | 0.5320 | 0.5317 | 0.5308 | 0.5291 | 0.5273 | 0.5263 | 0.5295 | 0.0057 |
| LHN_I | 0.7250 | 0.7224 | 0.7155 | 0.7078 | 0.7031 | 0.6983 | 0.7120 | 0.0267 |
| LP | 0.7497 | 0.7422 | 0.7351 | 0.7262 | 0.7183 | 0.7117 | 0.7305 | 0.0380 |
| PA | 0.7627 | 0.7516 | 0.7429 | 0.7332 | 0.7254 | 0.7180 | 0.7389 | 0.0447 |
| RA | 0.7616 | 0.7790 | 0.7867 | 0.7897 | 0.7917 | 0.7906 | 0.7832 | 0.0301 |
| RWR | 0.8420 | 0.8398 | 0.8343 | 0.8293 | 0.8253 | 0.8188 | 0.8316 | 0.0232 |
| Salton | 0.7467 | 0.7604 | 0.7674 | 0.7702 | 0.7728 | 0.7724 | 0.7650 | 0.0261 |
| Sorensen | 0.7406 | 0.7498 | 0.7539 | 0.7550 | 0.7567 | 0.7559 | 0.7520 | 0.0161 |

Prediction Performance (AUC Value) of Various Metrics in Different Networks Constructed in Ocean
| Metric | Keyword frequency >4 | Keyword frequency >6 | Keyword frequency >8 | Keyword frequency >10 | Keyword frequency >12 | Keyword frequency >14 | Mean | Difference |
|---|---|---|---|---|---|---|---|---|
| AA | 0.8139 | 0.8049 | 0.7933 | 0.7856 | 0.7767 | 0.7690 | 0.7906 | 0.0449 |
| ACT | 0.7460 | 0.7340 | 0.7221 | 0.7179 | 0.7112 | 0.7147 | 0.7243 | 0.0348 |
| CN | 0.8094 | 0.8001 | 0.7883 | 0.7809 | 0.7723 | 0.7648 | 0.7860 | 0.0446 |
| Cos+ | 0.7398 | 0.7242 | 0.7134 | 0.7015 | 0.6942 | 0.6902 | 0.7106 | 0.0496 |
| HDI | 0.7257 | 0.7040 | 0.6872 | 0.6798 | 0.6744 | 0.6672 | 0.6897 | 0.0585 |
| HPI | 0.7913 | 0.7868 | 0.7789 | 0.7729 | 0.7660 | 0.7590 | 0.7758 | 0.0323 |
| Jaccard | 0.7391 | 0.7246 | 0.7135 | 0.7100 | 0.7064 | 0.7004 | 0.7157 | 0.0387 |
| Katz | 0.4985 | 0.4959 | 0.4925 | 0.4916 | 0.4922 | 0.4931 | 0.4940 | 0.0069 |
| LHN_I | 0.6549 | 0.6134 | 0.5859 | 0.5730 | 0.5678 | 0.5613 | 0.5927 | 0.0936 |
| LP | 0.5725 | 0.5420 | 0.5210 | 0.5085 | 0.5004 | 0.4979 | 0.5237 | 0.0746 |
| PA | 0.8118 | 0.7850 | 0.7678 | 0.7560 | 0.7440 | 0.7355 | 0.7667 | 0.0763 |
| RA | 0.8118 | 0.8031 | 0.7928 | 0.7860 | 0.7779 | 0.7709 | 0.7904 | 0.0409 |
| RWR | 0.8399 | 0.8244 | 0.8106 | 0.7991 | 0.7880 | 0.7764 | 0.8064 | 0.0635 |
| Salton | 0.7700 | 0.7638 | 0.7556 | 0.7517 | 0.7466 | 0.7399 | 0.7546 | 0.0301 |
| Sorensen | 0.7391 | 0.7246 | 0.7135 | 0.7100 | 0.7064 | 0.7004 | 0.7157 | 0.0387 |

Prediction Performance (AUC Value) of Various Metrics in Different Networks Constructed in BSS
The Difference Between the Maximum and Minimum Values of the Prediction Effect of the Co-word Network Constructed by Different Metrics in Different Disciplines (AUC Values)
Average of Prediction Effects (AUC) in Various Networks Constructed in Different Disciplines
Prediction Effect (AUC Values) of Different Metrics in Various Networks Constructed in Different Disciplines
Changes in Prediction Performance (AUC Values) of the Metrics with the Change of Network Topology Parameters
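For readers unfamiliar with the local similarity metrics compared in the tables, the sketch below gives textbook definitions of five of them (CN, Jaccard, AA, RA, PA) on a toy graph, together with a pairwise-comparison form of AUC. The toy adjacency dict, the held-out link, and the function names are hypothetical. The study itself reports using a classification method, so this illustrates the metrics only and is not the authors' evaluation pipeline.

```python
import math
from itertools import product


def common_neighbors(adj, x, y):
    """CN: number of shared neighbours."""
    return len(adj[x] & adj[y])


def jaccard(adj, x, y):
    """Jaccard: shared neighbours over the union of neighbours."""
    union = adj[x] | adj[y]
    return len(adj[x] & adj[y]) / len(union) if union else 0.0


def adamic_adar(adj, x, y):
    """AA: shared neighbours weighted by 1 / log(degree)."""
    # For distinct x and y, a shared neighbour has degree >= 2, so log > 0
    return sum(1 / math.log(len(adj[z])) for z in adj[x] & adj[y])


def resource_allocation(adj, x, y):
    """RA: shared neighbours weighted by 1 / degree."""
    return sum(1 / len(adj[z]) for z in adj[x] & adj[y])


def preferential_attachment(adj, x, y):
    """PA: product of the two degrees."""
    return len(adj[x]) * len(adj[y])


def auc(pos_scores, neg_scores):
    """Share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    wins = sum((p > n) + 0.5 * (p == n)
               for p, n in product(pos_scores, neg_scores))
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical toy network: edges ab, ac, bc, ad, cd, de
adj = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"a", "c", "e"},
    "e": {"d"},
}
positives = [("b", "d")]                          # assumed future link
negatives = [("a", "e"), ("b", "e"), ("c", "e")]  # assumed non-links
pos = [common_neighbors(adj, x, y) for x, y in positives]
neg = [common_neighbors(adj, x, y) for x, y in negatives]
print(auc(pos, neg))
```

An AUC of 0.5 corresponds to random ranking, which gives a reference point for reading the Katz rows in the tables.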