Data Analysis and Knowledge Discovery  2024, Vol. 8 Issue (2): 114-130     https://doi.org/10.11925/infotech.2096-3467.2022.1311
Research Article
Influence of Network Structure Changes on Co-word Network Link Prediction*
Chen Zhuo1, Jiang Xixi1, Zhang Xiaojuan2
1School of Computer and Information Science, Southwest University, Chongqing 400715, China
2School of Public Administration, Sichuan University, Chengdu 610065, China
Abstract

[Objective] This study examines how changes in co-word network structure affect the performance of similarity metrics in link prediction. [Methods] First, we randomly retrieved literature from five disciplines (ISLS, LAW, BSS, COM, and Ocean) in the Web of Science Core Collection for 2015 to 2020. Second, we built co-word networks with different topological features (number of nodes and edges, average clustering coefficient, density, network transitivity, average degree, and others) by varying the keyword frequency. Finally, we ran link prediction experiments on each co-word network with 15 traditional similarity metrics (e.g., AA, CN, RWR, and Katz) and compared their prediction performance as the network structure changed. [Results] (1) Across disciplines, in most cases, the larger the keyword frequency of a co-word network, the smaller its average clustering coefficient and the larger its density, network transitivity, average degree, average degree centrality, average betweenness centrality, and average closeness centrality, and the more likely link prediction is to perform poorly. Conversely, the larger the average clustering coefficient and the smaller the other topological attributes, the more likely link prediction is to perform well. (2) Among the 15 selected similarity metrics, RWR performed best on co-word networks with different topological characteristics, and Katz was the most stable. By discipline, the prediction results of the metrics in LAW were the most affected by changes in network structure. [Limitations] Owing to limited computing capacity, we used only one classification method and one evaluation metric. In addition, we considered only node-similarity-based metrics and did not examine other categories, such as likelihood-analysis-based and probability-model-based metrics.
[Conclusions] Starting from the keyword frequency of co-word networks, this study explores how network structure changes affect link prediction and provides a theoretical basis for selecting similarity metrics for co-word networks of different disciplines and sizes.

Key words: Co-word Network; Link Prediction; Network Structure; Similarity Metric
Received: 2022-12-09      Published: 2023-05-08
CLC number (ZTFLH): TP393; G250
Funding: *National Social Science Fund of China (21BTQ072)
Corresponding author: Zhang Xiaojuan, ORCID: 0000-0002-5889-5922, E-mail: zxj0614@scu.edu.cn.
Cite this article:
Chen Zhuo, Jiang Xixi, Zhang Xiaojuan. Influence of Network Structure Changes on Co-word Network Link Prediction[J]. Data Analysis and Knowledge Discovery, 2024, 8(2): 114-130.
Link to this article:
https://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/10.11925/infotech.2096-3467.2022.1311      or      https://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/Y2024/V8/I2/114
Fig.1  Number of publications in each discipline (2015-2020)
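| Dataset | ISLS | COM | LAW | Ocean | BSS |
|---|---|---|---|---|---|
| Total keywords | 1,960,971 | 1,882,008 | 2,418,845 | 3,008,470 | 660,138 |
| Keywords in the training period | 40,816 | 72,019 | 75,446 | 58,271 | 22,019 |
| Keywords in the test period | 35,042 | 34,495 | 158,669 | 43,193 | 15,946 |
| Keywords shared by the training and test periods | 10,596 | 68,990 | 9,916 | 14,567 | 12,566 |
Table 1  Number of keywords in each discipline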
Fig.2  Number of keywords retained in each discipline's network at different keyword frequencies
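The construction of a co-word network at a given keyword frequency can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes that a keyword's frequency is the number of papers it appears in, that a keyword is kept when its frequency is strictly greater than the cutoff (matching the ">4", ">6", ... labels), and that two kept keywords are linked when they occur in the same paper. The function name `build_coword_network` and the toy records are hypothetical.

```python
from collections import Counter
from itertools import combinations


def build_coword_network(records, min_freq):
    """Return (nodes, edges) of an undirected co-word network.

    records  : list of keyword lists, one list per paper (hypothetical input)
    min_freq : keep keywords whose paper frequency is strictly greater than this
    """
    # Count the number of papers in which each keyword appears.
    freq = Counter(kw for rec in records for kw in set(rec))
    nodes = {kw for kw, n in freq.items() if n > min_freq}
    edges = set()
    for rec in records:
        kept = sorted(set(rec) & nodes)
        # Every pair of retained keywords in the same paper forms one edge.
        edges.update(combinations(kept, 2))
    return nodes, edges


# Toy data, invented for illustration only.
toy_records = [
    ["link prediction", "co-word network", "similarity"],
    ["link prediction", "co-word network"],
    ["link prediction", "random walk"],
    ["co-word network", "similarity"],
]

for cutoff in (0, 1, 2):
    nodes, edges = build_coword_network(toy_records, cutoff)
    print(f">{cutoff}: {len(nodes)} nodes, {len(edges)} edges")
```

Raising the cutoff removes low-frequency keywords and the edges attached to them, which is how a single corpus yields a family of networks with different node and edge counts.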
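| Discipline | Keyword frequency | Total samples | Positive samples | Negative samples |
|---|---|---|---|---|
| ISLS | >4 | 8,965,657 | 47,482 | 8,918,175 |
| ISLS | >6 | 4,373,525 | 38,531 | 4,334,994 |
| ISLS | >8 | 2,508,636 | 31,944 | 2,476,692 |
| ISLS | >10 | 1,597,617 | 27,338 | 1,570,279 |
| ISLS | >12 | 1,102,056 | 23,799 | 1,078,257 |
| ISLS | >14 | 807,506 | 20,926 | 786,580 |
| COM | >4 | 13,193,290 | 78,361 | 13,114,929 |
| COM | >6 | 7,480,671 | 57,537 | 7,423,134 |
| COM | >8 | 4,307,303 | 45,189 | 4,262,194 |
| COM | >10 | 2,817,955 | 39,467 | 2,778,488 |
| COM | >12 | 1,960,670 | 34,751 | 1,925,919 |
| COM | >14 | 1,430,528 | 30,298 | 1,400,230 |
| LAW | >4 | 8,803,410 | 34,882 | 8,768,528 |
| LAW | >6 | 4,258,082 | 27,969 | 4,230,113 |
| LAW | >8 | 2,477,283 | 23,242 | 2,454,041 |
| LAW | >10 | 1,544,960 | 19,445 | 1,525,515 |
| LAW | >12 | 1,048,298 | 16,481 | 1,031,817 |
| LAW | >14 | 720,238 | 14,138 | 706,100 |
| Ocean | >4 | 19,285,428 | 39,419 | 19,246,009 |
| Ocean | >6 | 9,301,975 | 30,683 | 9,271,292 |
| Ocean | >8 | 5,179,696 | 24,684 | 5,155,012 |
| Ocean | >10 | 3,244,393 | 20,637 | 3,223,756 |
| Ocean | >12 | 2,118,050 | 17,328 | 2,100,722 |
| Ocean | >14 | 1,481,369 | 14,954 | 1,466,415 |
| BSS | >4 | 4,483,656 | 29,825 | 4,453,831 |
| BSS | >6 | 2,354,809 | 24,679 | 2,330,130 |
| BSS | >8 | 952,727 | 19,482 | 933,245 |
| BSS | >10 | 952,727 | 18,269 | 934,458 |
| BSS | >12 | 649,326 | 15,804 | 633,522 |
| BSS | >14 | 399,602 | 12,036 | 399,482 |
Table 2  Sample statistics for each discipline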
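The counts in Table 2 are heavily skewed toward negative samples (keyword pairs without a link). As a rough illustration, the short sketch below computes the share of positive samples and the negative-to-positive ratio for two ISLS rows copied from the table; the helper name `positive_share` is illustrative and not part of the original study.

```python
def positive_share(positive, negative):
    """Fraction of samples that are positive (linked keyword pairs)."""
    return positive / (positive + negative)


# (positive, negative) sample counts for ISLS, copied from Table 2.
isls = {
    ">4": (47482, 8918175),
    ">14": (20926, 786580),
}

for label, (pos, neg) in isls.items():
    share = positive_share(pos, neg)
    print(f"ISLS {label}: positive share {share:.2%}, "
          f"about {neg / pos:.0f} negatives per positive")
```

For ISLS, positives make up roughly 0.5% of the samples at ">4" and roughly 2.6% at ">14", so the imbalance eases as the keyword frequency rises but remains severe.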
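| Keyword frequency | Nodes | Edges | Density | Average clustering coefficient | Network transitivity | Average degree | Average degree centrality | Average betweenness centrality | Average closeness centrality |
|---|---|---|---|---|---|---|---|---|---|
| >4 | 5,307 | 81,988 | 0.0058 | 0.3636 | 0.1137 | 30.8980 | 0.0058 | 0.0003 | 0.3546 |
| >6 | 3,674 | 67,426 | 0.0100 | 0.3216 | 0.1276 | 36.7044 | 0.0100 | 0.0004 | 0.3783 |
| >8 | 2,769 | 56,972 | 0.0149 | 0.3025 | 0.1411 | 41.1499 | 0.0149 | 0.0005 | 0.3969 |
| >10 | 2,217 | 49,518 | 0.0202 | 0.2926 | 0.1516 | 44.6712 | 0.0202 | 0.0006 | 0.4142 |
| >12 | 1,823 | 32,231 | 0.0194 | 0.2555 | 0.1437 | 35.3604 | 0.0194 | 0.0008 | 0.3832 |
| >14 | 1,550 | 28,376 | 0.0236 | 0.2634 | 0.1538 | 36.6142 | 0.0236 | 0.0008 | 0.3884 |
Table 3  Topological attributes of the co-word networks built for ISLS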
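For an undirected simple graph with N nodes and E edges, the average degree is 2E/N and the density is 2E/(N(N-1)). Mean degree centrality (each node's degree divided by N-1, averaged over nodes) is algebraically the same quantity as density, which is why those two columns coincide. The minimal sketch below recomputes two rows of Table 3 from their node and edge counts under that assumption; the function names are illustrative, and the paper's own tooling is not shown here.

```python
def average_degree(n_nodes, n_edges):
    """Average degree of an undirected simple graph: 2E / N."""
    return 2 * n_edges / n_nodes


def density(n_nodes, n_edges):
    """Density of an undirected simple graph: 2E / (N * (N - 1))."""
    return 2 * n_edges / (n_nodes * (n_nodes - 1))


# (keyword frequency, nodes, edges) copied from Table 3 (ISLS).
rows = [(">4", 5307, 81988), (">6", 3674, 67426)]

for label, n, e in rows:
    print(f"{label}: average degree {average_degree(n, e):.4f}, "
          f"density {density(n, e):.4f}")
```

The results agree with the tabulated average degree and density up to rounding in the last digit.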
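| Keyword frequency | Nodes | Edges | Density | Average clustering coefficient | Network transitivity | Average degree | Average degree centrality | Average betweenness centrality | Average closeness centrality |
|---|---|---|---|---|---|---|---|---|---|
| >4 | 10,304 | 313,072 | 0.0059 | 0.3537 | 0.0971 | 60.7671 | 0.0059 | 0.0017 | 0.3687 |
| >6 | 3,886 | 75,599 | 0.0100 | 0.3038 | 0.1084 | 38.9084 | 0.0100 | 0.0003 | 0.3641 |
| >8 | 2,957 | 66,019 | 0.0151 | 0.2918 | 0.1197 | 44.6527 | 0.0151 | 0.0004 | 0.4012 |
| >10 | 2,398 | 58,445 | 0.0203 | 0.2858 | 0.1313 | 48.7448 | 0.0203 | 0.0005 | 0.4243 |
| >12 | 2,006 | 52,350 | 0.0260 | 0.2851 | 0.1418 | 52.1934 | 0.0260 | 0.0006 | 0.4416 |
| >14 | 1,719 | 47,812 | 0.0324 | 0.2912 | 0.1514 | 55.6277 | 0.0324 | 0.0007 | 0.4671 |
Table 4  Topological attributes of the co-word networks built for COM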
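| Keyword frequency | Nodes | Edges | Density | Average clustering coefficient | Network transitivity | Average degree | Average degree centrality | Average betweenness centrality | Average closeness centrality |
|---|---|---|---|---|---|---|---|---|---|
| >4 | 4,207 | 48,117 | 0.0054 | 0.3420 | 0.0953 | 22.8747 | 0.0054 | 0.0005 | 0.3383 |
| >6 | 2,931 | 38,763 | 0.0090 | 0.2929 | 0.1037 | 26.4504 | 0.0090 | 0.0006 | 0.3628 |
| >8 | 2,240 | 32,636 | 0.0130 | 0.2712 | 0.1114 | 29.1393 | 0.0130 | 0.0007 | 0.3826 |
| >10 | 1,773 | 27,690 | 0.0176 | 0.2564 | 0.1191 | 31.2352 | 0.0176 | 0.0009 | 0.3982 |
| >12 | 1,464 | 24,082 | 0.0225 | 0.2477 | 0.1269 | 32.8989 | 0.0225 | 0.0010 | 0.4116 |
| >14 | 1,217 | 20,915 | 0.0283 | 0.2444 | 0.1355 | 34.3714 | 0.0283 | 0.0011 | 0.4244 |
Table 5  Topological attributes of the co-word networks built for LAW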
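| Keyword frequency | Nodes | Edges | Density | Average clustering coefficient | Network transitivity | Average degree | Average degree centrality | Average betweenness centrality | Average closeness centrality |
|---|---|---|---|---|---|---|---|---|---|
| >4 | 6,223 | 80,547 | 0.0042 | 0.2933 | 0.0928 | 25.8869 | 0.0042 | 0.0003 | 0.3274 |
| >6 | 4,328 | 65,980 | 0.0070 | 0.2519 | 0.1017 | 30.4898 | 0.0070 | 0.0004 | 0.3527 |
| >8 | 3,235 | 54,533 | 0.0104 | 0.2316 | 0.1119 | 33.7144 | 0.0104 | 0.0005 | 0.3698 |
| >10 | 2,565 | 46,501 | 0.0141 | 0.2249 | 0.1217 | 36.2581 | 0.0141 | 0.0006 | 0.3840 |
| >12 | 2,077 | 39,952 | 0.0185 | 0.2231 | 0.1322 | 38.4709 | 0.0185 | 0.0007 | 0.3973 |
| >14 | 1,741 | 35,041 | 0.0231 | 0.2202 | 0.1419 | 40.2539 | 0.0231 | 0.0008 | 0.4086 |
Table 6  Topological attributes of the co-word networks built for Ocean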
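| Keyword frequency | Nodes | Edges | Density | Average clustering coefficient | Network transitivity | Average degree | Average degree centrality | Average betweenness centrality | Average closeness centrality |
|---|---|---|---|---|---|---|---|---|---|
| >4 | 3,013 | 56,934 | 0.0125 | 0.3764 | 0.1440 | 37.7922 | 0.0125 | 0.0005 | 0.3952 |
| >6 | 2,192 | 48,718 | 0.0203 | 0.3423 | 0.1629 | 44.4507 | 0.0203 | 0.0006 | 0.4216 |
| >8 | 1,710 | 42,546 | 0.0291 | 0.3315 | 0.1804 | 49.7614 | 0.0291 | 0.0008 | 0.4425 |
| >10 | 1,407 | 37,800 | 0.0382 | 0.3255 | 0.1953 | 53.7313 | 0.0382 | 0.0009 | 0.4592 |
| >12 | 1,168 | 33,370 | 0.0490 | 0.3289 | 0.2124 | 57.1404 | 0.0490 | 0.0010 | 0.4731 |
| >14 | 1,014 | 30,398 | 0.0592 | 0.3351 | 0.2265 | 59.9566 | 0.0592 | 0.0011 | 0.4836 |
Table 7  Topological attributes of the co-word networks built for BSS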
Fig.3  Topological values of the networks built in each discipline
| Metric | Keyword frequency >4 | Keyword frequency >6 | Keyword frequency >8 | Keyword frequency >10 | Keyword frequency >12 | Keyword frequency >14 | Mean | Range (max - min) |
|---|---|---|---|---|---|---|---|---|
| AA | 0.8129 | 0.8153 | 0.8143 | 0.8085 | 0.7724 | 0.7670 | 0.7984 | 0.0483 |
| ACT | 0.7731 | 0.7621 | 0.7495 | 0.7371 | 0.7022 | 0.6921 | 0.7360 | 0.0810 |
| CN | 0.8109 | 0.8131 | 0.8121 | 0.8061 | 0.7694 | 0.7638 | 0.7959 | 0.0493 |
| Cos+ | 0.7816 | 0.7718 | 0.7656 | 0.7509 | 0.7228 | 0.7071 | 0.7500 | 0.0745 |
| HDI | 0.7681 | 0.7569 | 0.7486 | 0.7377 | 0.6996 | 0.6884 | 0.7332 | 0.0797 |
| HPI | 0.7916 | 0.7925 | 0.7926 | 0.7876 | 0.7470 | 0.7414 | 0.7755 | 0.0512 |
| Jaccard | 0.7730 | 0.7665 | 0.7625 | 0.7554 | 0.7128 | 0.7043 | 0.7458 | 0.0687 |
| Katz | 0.4875 | 0.4896 | 0.4799 | 0.4802 | 0.4922 | 0.4861 | 0.4859 | 0.0123 |
| LHN_I | 0.7364 | 0.7095 | 0.6903 | 0.6714 | 0.6447 | 0.6272 | 0.6799 | 0.1092 |
| LP | 0.7829 | 0.7646 | 0.7487 | 0.7337 | 0.7140 | 0.7010 | 0.7408 | 0.0819 |
| PA | 0.8198 | 0.8021 | 0.7861 | 0.7717 | 0.7721 | 0.7611 | 0.7855 | 0.0587 |
| RA | 0.8097 | 0.8109 | 0.8095 | 0.8038 | 0.7685 | 0.7630 | 0.7942 | 0.0479 |
| RWR | 0.8551 | 0.8415 | 0.8307 | 0.8200 | 0.8040 | 0.7957 | 0.8245 | 0.0594 |
| Salton | 0.7843 | 0.7840 | 0.7841 | 0.7793 | 0.7303 | 0.7234 | 0.7642 | 0.0609 |
| Sorensen | 0.7730 | 0.7665 | 0.7625 | 0.7554 | 0.7128 | 0.7043 | 0.7458 | 0.0687 |
Table 8  Prediction performance (AUC) of each metric on the networks built for ISLS
| Metric | Keyword frequency >4 | Keyword frequency >6 | Keyword frequency >8 | Keyword frequency >10 | Keyword frequency >12 | Keyword frequency >14 | Mean | Range (max - min) |
|---|---|---|---|---|---|---|---|---|
| AA | 0.7718 | 0.7786 | 0.7772 | 0.7718 | 0.7650 | 0.7749 | 0.7732 | 0.0136 |
| ACT | 0.7157 | 0.7224 | 0.7214 | 0.7159 | 0.7113 | 0.7110 | 0.7163 | 0.0114 |
| CN | 0.7667 | 0.7744 | 0.7722 | 0.7667 | 0.7600 | 0.7694 | 0.7682 | 0.0144 |
| Cos+ | 0.6934 | 0.7039 | 0.6949 | 0.6885 | 0.6841 | 0.6838 | 0.6914 | 0.0201 |
| HDI | 0.6614 | 0.6932 | 0.6735 | 0.6614 | 0.6500 | 0.6510 | 0.6651 | 0.0432 |
| HPI | 0.7552 | 0.7578 | 0.7566 | 0.7552 | 0.7495 | 0.7606 | 0.7558 | 0.0111 |
| Jaccard | 0.6880 | 0.7075 | 0.6952 | 0.6880 | 0.6805 | 0.6862 | 0.6909 | 0.0270 |
| Katz | 0.5131 | 0.5027 | 0.5097 | 0.5131 | 0.5147 | 0.5146 | 0.5113 | 0.0120 |
| LHN_I | 0.5649 | 0.6270 | 0.5867 | 0.5649 | 0.5450 | 0.5358 | 0.5707 | 0.0912 |
| LP | 0.6140 | 0.6245 | 0.6192 | 0.6140 | 0.6089 | 0.6127 | 0.6156 | 0.0156 |
| PA | 0.7672 | 0.7922 | 0.7787 | 0.7672 | 0.7574 | 0.7497 | 0.7687 | 0.0425 |
| RA | 0.7703 | 0.7764 | 0.7750 | 0.7703 | 0.7645 | 0.7750 | 0.7719 | 0.0119 |
| RWR | 0.7838 | 0.7942 | 0.7901 | 0.7838 | 0.7754 | 0.7883 | 0.7859 | 0.0188 |
| Salton | 0.7321 | 0.7383 | 0.7350 | 0.7321 | 0.7266 | 0.7360 | 0.7334 | 0.0117 |
| Sorensen | 0.6880 | 0.7075 | 0.6952 | 0.6880 | 0.6805 | 0.6862 | 0.6909 | 0.0270 |
Table 9  Prediction performance (AUC) of each metric on the networks built for COM
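The AUC values in these tables can be read as the probability that a randomly chosen positive sample (a pair that does become linked) receives a higher score than a randomly chosen negative sample, with ties counted as half. The sketch below implements that standard pairwise definition directly; how the study itself computed AUC is not stated on this page, so this is only an assumed, illustrative reading, and the scores are invented.

```python
def auc(pos_scores, neg_scores):
    """Pairwise AUC: P(positive score > negative score), ties count 0.5."""
    wins = ties = 0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1
            elif p == n:
                ties += 1
    return (wins + 0.5 * ties) / (len(pos_scores) * len(neg_scores))


# Invented scores for illustration only.
pos = [0.9, 0.6, 0.4]   # scores of pairs that are truly linked
neg = [0.5, 0.4, 0.1]   # scores of pairs that are not linked
print(round(auc(pos, neg), 4))
```

Under this definition a value of 0.5 corresponds to random ranking, which gives a reference point for reading the Katz rows that stay close to 0.5.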
| Metric | Keyword frequency >4 | Keyword frequency >6 | Keyword frequency >8 | Keyword frequency >10 | Keyword frequency >12 | Keyword frequency >14 | Mean | Range (max - min) |
|---|---|---|---|---|---|---|---|---|
| AA | 0.7516 | 0.7584 | 0.7614 | 0.7603 | 0.7617 | 0.7587 | 0.7587 | 0.0101 |
| ACT | 0.5002 | 0.7554 | 0.7055 | 0.6992 | 0.7005 | 0.6969 | 0.6763 | 0.2552 |
| CN | 0.7491 | 0.7554 | 0.7580 | 0.7565 | 0.7578 | 0.7541 | 0.7552 | 0.0089 |
| Cos+ | 0.5002 | 0.7091 | 0.6997 | 0.6949 | 0.6940 | 0.6912 | 0.6649 | 0.2089 |
| HDI | 0.7192 | 0.7135 | 0.7058 | 0.6970 | 0.6923 | 0.6865 | 0.7024 | 0.0327 |
| HPI | 0.7337 | 0.7388 | 0.7408 | 0.7403 | 0.7413 | 0.7382 | 0.7389 | 0.0076 |
| Jaccard | 0.7214 | 0.7187 | 0.7145 | 0.7094 | 0.7079 | 0.7046 | 0.7128 | 0.0168 |
| Katz | 0.3958 | 0.6054 | 0.5984 | 0.5941 | 0.5940 | 0.5873 | 0.5625 | 0.2096 |
| LHN_I | 0.7027 | 0.6874 | 0.6701 | 0.6521 | 0.6384 | 0.6261 | 0.6628 | 0.0766 |
| LP | 0.7532 | 0.7394 | 0.7264 | 0.7143 | 0.7076 | 0.6980 | 0.7232 | 0.0552 |
| PA | 0.7814 | 0.7642 | 0.7510 | 0.7397 | 0.7365 | 0.7267 | 0.7499 | 0.0547 |
| RA | 0.7493 | 0.7548 | 0.7570 | 0.7554 | 0.7567 | 0.7540 | 0.7545 | 0.0077 |
| RWR | 0.8227 | 0.8108 | 0.8003 | 0.7906 | 0.7865 | 0.7785 | 0.7982 | 0.0442 |
| Salton | 0.7271 | 0.7253 | 0.7294 | 0.7275 | 0.7278 | 0.7253 | 0.7271 | 0.0041 |
| Sorensen | 0.7214 | 0.7187 | 0.7145 | 0.7093 | 0.7079 | 0.7046 | 0.7127 | 0.0168 |
Table 10  Prediction performance (AUC) of each metric on the networks built for LAW
| Metric | Keyword frequency >4 | Keyword frequency >6 | Keyword frequency >8 | Keyword frequency >10 | Keyword frequency >12 | Keyword frequency >14 | Mean | Range (max - min) |
|---|---|---|---|---|---|---|---|---|
| AA | 0.7625 | 0.7801 | 0.7875 | 0.7898 | 0.7904 | 0.7883 | 0.7831 | 0.0279 |
| ACT | 0.7178 | 0.7112 | 0.7082 | 0.7012 | 0.6984 | 0.6939 | 0.7051 | 0.0239 |
| CN | 0.7597 | 0.7759 | 0.7825 | 0.7841 | 0.7842 | 0.7819 | 0.7781 | 0.0245 |
| Cos+ | 0.7998 | 0.7993 | 0.7982 | 0.7943 | 0.7920 | 0.7861 | 0.7950 | 0.0137 |
| HDI | 0.7379 | 0.7437 | 0.7445 | 0.7427 | 0.7418 | 0.7391 | 0.7416 | 0.0066 |
| HPI | 0.7518 | 0.7683 | 0.7761 | 0.7792 | 0.7820 | 0.7816 | 0.7732 | 0.0302 |
| Jaccard | 0.7406 | 0.7498 | 0.7539 | 0.7549 | 0.7567 | 0.7558 | 0.7520 | 0.0161 |
| Katz | 0.5320 | 0.5317 | 0.5308 | 0.5291 | 0.5273 | 0.5263 | 0.5295 | 0.0057 |
| LHN_I | 0.7250 | 0.7224 | 0.7155 | 0.7078 | 0.7031 | 0.6983 | 0.7120 | 0.0267 |
| LP | 0.7497 | 0.7422 | 0.7351 | 0.7262 | 0.7183 | 0.7117 | 0.7305 | 0.0380 |
| PA | 0.7627 | 0.7516 | 0.7429 | 0.7332 | 0.7254 | 0.7180 | 0.7389 | 0.0447 |
| RA | 0.7616 | 0.7790 | 0.7867 | 0.7897 | 0.7917 | 0.7906 | 0.7832 | 0.0301 |
| RWR | 0.8420 | 0.8398 | 0.8343 | 0.8293 | 0.8253 | 0.8188 | 0.8316 | 0.0232 |
| Salton | 0.7467 | 0.7604 | 0.7674 | 0.7702 | 0.7728 | 0.7724 | 0.7650 | 0.0261 |
| Sorensen | 0.7406 | 0.7498 | 0.7539 | 0.7550 | 0.7567 | 0.7559 | 0.7520 | 0.0161 |
Table 11  Prediction performance (AUC) of each metric on the networks built for Ocean
| Metric | Keyword frequency >4 | Keyword frequency >6 | Keyword frequency >8 | Keyword frequency >10 | Keyword frequency >12 | Keyword frequency >14 | Mean | Range (max - min) |
|---|---|---|---|---|---|---|---|---|
| AA | 0.8139 | 0.8049 | 0.7933 | 0.7856 | 0.7767 | 0.7690 | 0.7906 | 0.0449 |
| ACT | 0.7460 | 0.7340 | 0.7221 | 0.7179 | 0.7112 | 0.7147 | 0.7243 | 0.0348 |
| CN | 0.8094 | 0.8001 | 0.7883 | 0.7809 | 0.7723 | 0.7648 | 0.7860 | 0.0446 |
| Cos+ | 0.7398 | 0.7242 | 0.7134 | 0.7015 | 0.6942 | 0.6902 | 0.7106 | 0.0496 |
| HDI | 0.7257 | 0.7040 | 0.6872 | 0.6798 | 0.6744 | 0.6672 | 0.6897 | 0.0585 |
| HPI | 0.7913 | 0.7868 | 0.7789 | 0.7729 | 0.7660 | 0.7590 | 0.7758 | 0.0323 |
| Jaccard | 0.7391 | 0.7246 | 0.7135 | 0.7100 | 0.7064 | 0.7004 | 0.7157 | 0.0387 |
| Katz | 0.4985 | 0.4959 | 0.4925 | 0.4916 | 0.4922 | 0.4931 | 0.4940 | 0.0069 |
| LHN_I | 0.6549 | 0.6134 | 0.5859 | 0.5730 | 0.5678 | 0.5613 | 0.5927 | 0.0936 |
| LP | 0.5725 | 0.5420 | 0.5210 | 0.5085 | 0.5004 | 0.4979 | 0.5237 | 0.0746 |
| PA | 0.8118 | 0.7850 | 0.7678 | 0.7560 | 0.7440 | 0.7355 | 0.7667 | 0.0763 |
| RA | 0.8118 | 0.8031 | 0.7928 | 0.7860 | 0.7779 | 0.7709 | 0.7904 | 0.0409 |
| RWR | 0.8399 | 0.8244 | 0.8106 | 0.7991 | 0.7880 | 0.7764 | 0.8064 | 0.0635 |
| Salton | 0.7700 | 0.7638 | 0.7556 | 0.7517 | 0.7466 | 0.7399 | 0.7546 | 0.0301 |
| Sorensen | 0.7391 | 0.7246 | 0.7135 | 0.7100 | 0.7064 | 0.7004 | 0.7157 | 0.0387 |
Table 12  Prediction performance (AUC) of each metric on the networks built for BSS
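The last two columns of the AUC tables summarise each metric across the six keyword frequencies: the mean of the six AUC values, and the difference between the largest and smallest of them, which serves as a simple stability measure (a smaller difference means performance changes less as the network structure changes). A minimal sketch that recomputes both columns for two rows of Table 12, with an illustrative helper name:

```python
def summarize(aucs):
    """Return (mean, max - min) of a list of AUC values, rounded to 4 places."""
    mean = sum(aucs) / len(aucs)
    spread = max(aucs) - min(aucs)
    return round(mean, 4), round(spread, 4)


# AUC values for keyword frequency >4 ... >14, copied from Table 12 (BSS).
rwr_bss = [0.8399, 0.8244, 0.8106, 0.7991, 0.7880, 0.7764]
katz_bss = [0.4985, 0.4959, 0.4925, 0.4916, 0.4922, 0.4931]

print("RWR ", summarize(rwr_bss))
print("Katz", summarize(katz_bss))
```

In this table RWR has the higher mean while Katz has the smaller difference, which is the pattern the abstract describes as "best" versus "most stable".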
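Fig.4  Difference between the maximum and minimum prediction performance (AUC) of each metric on the co-word networks built in each discipline
Fig.5  Average prediction performance (AUC) on the networks built in each discipline
Fig.6  Prediction performance (AUC) of each metric on the networks built in each discipline
Fig.7  Effect of changes in network topological parameters on the prediction performance of each metric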