Please wait a minute...
Advanced Search
数据分析与知识发现  2018, Vol. 2 Issue (3): 60-69     https://doi.org/10.11925/infotech.2096-3467.2017.0964
  研究论文 本期目录 | 过刊浏览 | 高级检索 |
基于速度优化和社区偏向的标签传播算法*
张素琪1(), 高星2, 霍士杰2, 郭京津2, 顾军华2
1(天津商业大学信息工程学院 天津 300134)
2(河北工业大学计算机科学与软件学院 天津 300401)
A Label Propagation Algorithm Based on Speed Optimization and Community Preference
Zhang Suqi1(), Gao Xing2, Huo Shijie2, Guo Jingjin2, Gu Junhua2
1(School of Information Engineering, Tianjin University of Commerce, Tianjin 300134, China)
2(School of Computer Science and Software, Hebei University of Technology, Tianjin 300401, China)
全文: HTML ( 4
输出: BibTeX | EndNote (RIS)      
摘要 

目的】减少标签传播算法的无效更新、解决算法准确率低的问题。【方法】引入节点信息列表以指导更新过程, 避免不必要的更新, 从而加快执行速度; 采取基于节点对社区偏向程度的更新规则, 提高社区划分的准确率。【结果】实验结果表明, 相比标签传播算法和两种较好的改进算法, 本文提出的基于速度优化和社区偏向的标签传播算法在较大规模网络上的迭代次数减少了几十倍, 在真实网络数据集的模块度相对较高, 在LFR基准网络数据集的归一化互信息值和F-measure值分别有明显提高。【局限】更新顺序具有随机性, 需进一步研究。【结论】本文算法在提高执行速度的基础上, 提高了社区发现的准确率。

服务
把本文推荐给朋友
加入引用管理器
E-mail Alert
RSS
作者相关文章
张素琪
高星
霍士杰
郭京津
顾军华
关键词 标签传播算法节点信息列表节点对社区偏向程度    
Abstract

[Objective] This paper aims to reduce the unnecessary updates and improve the accuracy of Label Propagation Algorithm. [Methods] First, we used the node information list to direct the update process and increase the execution speed. Then, we proposed new updating rules based on the node preference to improve the accuracy of community detection. [Results] Compared with the classic label propagation algorithm and two improved algorithms, the proposed one significantly reduced the number of iterations on large-scale social networks, as well as improved the value of Normalized Mutual Information and F-measure of LFR benchmark network. [Limitations] The new algorithm’s updating sequence is random, which needs to be investigated in further studies. [Conclusions] The SOCP_LPA improves the accuracy of community detection and the processing speed.

Key wordsLabel Propagation Algorithm    Node Information List    Node Preference to Community
收稿日期: 2017-09-22      出版日期: 2020-08-31
ZTFLH:  TP391  
基金资助:*本文系河北省科技计划项目“智慧热网大数据分析方法及节能技术研究”(项目编号: 17210305D)、天津市科技计划项目“智慧热网节能技术及应用”(项目编号: 16ZXHLSF0023)和天津市自然科学基金项目“基于图模式的云案例检索技术研究”(项目编号: 15JCQNJC00600)的研究成果之一
引用本文:   
张素琪, 高星, 霍士杰, 郭京津, 顾军华. 基于速度优化和社区偏向的标签传播算法*[J]. 数据分析与知识发现, 2018, 2(3): 60-69.
Zhang Suqi,Gao Xing,Huo Shijie,Guo Jingjin,Gu Junhua. A Label Propagation Algorithm Based on Speed Optimization and Community Preference. Data Analysis and Knowledge Discovery, 2018, 2(3): 60-69.
链接本文:  
https://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/10.11925/infotech.2096-3467.2017.0964      或      https://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/Y2018/V2/I3/60
网络名称 节点数 边数 描述
email 1 133 5 451 美国大学生email社交网络[21]
CA-GrQc 4 158 13 422 广义相对论研究团体网络[22]
CA-Hepth 8 638 24 807 高能物理学理论研究团体网络[22]
PGP 10 680 24 316 Pretty-Good-Privacy算法研究网络[23]
  数据集介绍
网络名称 5次迭代后正确划分节点
所占百分比
算法收敛所需的
迭代次数
email 96.64% 17.89
CA-GrQc 99.37% 15.7
CA-Hepth 98.97% 59.75
PGP 99.37% 21.9
  网络的收敛信息
网络名称 节点数 边数 描述
karate 34 78 美国空手道俱乐部网络[24]
dolphins 62 159 海豚数据网络[18]
polbooks 105 441 亚马逊美国政治书销售网络[25]
football 115 613 美国秋季大学生足球队网络[26]
  数据集介绍
参数 意义
N 网络中的节点个数
k 网络的平均度数
maxk 网络的最大度数
minc 社区内的最少节点个数
maxc 社区内的最多节点个数
mu 节点与社区外部连接的概率
  LFR基准网络的主要参数
网络名称 N k maxk minc maxc mu
1000S 1 000 10 50 10 50 0.1~0.8
1000B 1 000 10 50 20 100 0.1~0.8
5000S 5 000 10 50 10 50 0.1~0.8
5000B 5 000 10 50 20 100 0.1~0.8
  LFR基准网络数据集描述
  真实网络数据集上的迭代次数对比
网络名称 SOCP_LPA LPA Cen_LP NIBLPA
karate 0.371 0.345 0.416 0.423
dolphins 0.523 0.483 0.523 0.521
polbooks 0.518 0.493 0.518 0.497
football 0.573 0.588 0.589 0.582
email 0.497 0.234 0.427 0.427
CA-GrQc 0.788 0.775 0.740 0.710
CA-Hepth 0.671 0.628 0.621 0.610
PGP 0.819 0.802 0.750 0.783
  平均模块度对比
网络名称 SOCP_LPA LPA Cen_LP NIBLPA
karate 0.371 0.416 0.416 0.423
dolphins 0.526 0.527 0.526 0.521
polbooks 0.523 0.526 0.523 0.497
football 0.603 0.605 0.601 0.582
email 0.545 0.443 0.536 0.427
CA-GrQc 0.796 0.783 0.743 0.710
CA-Hepth 0.685 0.647 0.633 0.610
PGP 0.827 0.816 0.756 0.783
  最高模块度对比
  LFR基准网络上不同算法的NMI对比
  LFR基准网络上不同算法的F-measure对比
[1] Radicchi F, Castellano C, Cecconi F, et al.Defining and Identifying Communities in Networks[J]. Proceedings of the National Academy of Sciences of the United States of America, 2004, 101(9): 2658-2663.
doi: 10.1073/pnas.0400054101
[2] Girvan M, Newman M E J. Community Structure in Social and Biological Networks[J]. Proceedings of the National Academy of Sciences of the United States of America, 2002, 99(12): 7821-7826.
doi: 10.1073/pnas.122653799
[3] Newman M E J. Fast Algorithm for Detecting Community Structure in Networks[J]. Physical Review E, 2004, 69(6): 66-73.
doi: 10.1103/PhysRevE.69.066133 pmid: 15244693
[4] Pons P, Latapy M.Computing Communities in Large Networks Using Random Walks[J]. Journal of Graph Algorithms and Applications, 2006, 10(2): 191-218.
doi: 10.1007/11569596_31
[5] Shang R, Bai J, Jiao L, et al.Community Detection Based on Modularity and an Improved Genetic Algorithm[J]. Physica A: Statistical Mechanics and Its Applications, 2013, 392(5): 1215-1231.
doi: 10.1016/j.physa.2012.11.003
[6] Brandes U, Delling D, Gaertler M, et al.On Modularity Clustering[J]. IEEE Transactions on Knowledge and Data Engineering, 2008, 20(2): 172-188.
doi: 10.1109/TKDE.2007.190689
[7] Bu Z, Zhang C, Xia Z, et al.A Fast Parallel Modularity Optimization Algorithm (FPMQA) for Community Detection in Online Social Network[J]. Knowledge-Based Systems, 2013, 50(3): 246-259.
doi: 10.1016/j.knosys.2013.06.014
[8] Zhang S, Zhang H.Normalized Modularity Optimization Method for Community Identification with Degree Adjustment[J]. Physical Review E, 2013, 88(5): 447-453.
doi: 10.1103/PhysRevE.88.052802 pmid: 24329313
[9] Shiga M, Takigawa I, Mamitsuka H.A Spectral Clustering Approach to Optimally Combining Numerical Vectors with a Modular Network[C]// Proceedings of the 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Jose, California, USA. New York, NY, USA: ACM, 2007: 647-656.
[10] Shen H, Cheng X.Spectral Methods for the Detection of Network Community Structure: A Comparative Analysis[J]. Journal of Statistical Mechanics: Theory and Experiment, 2010(10): 100-108
doi: 10.1088/1742-5468/2010/10/P10020
[11] Huang L, Li R, Chen H, et al.Detecting Network Communities Using Regularized Spectral Clustering Algorithm[J]. Artificial Intelligence Review, 2014,41(4): 579-594.
doi: 10.1007/s10462-012-9325-3
[12] Raghavan U N, Albert R, Kumara S.Near Linear Time Algorithm to Detect Community Structures in Large-scale Networks[J]. Physical Review E, 2007, 76(3): 96-106.
doi: 10.1103/PhysRevE.76.036106 pmid: 17930305
[13] Zhu X, Ghahramani Z.Learning from Labeled and Unlabeled Data with Label Propagation [R]. Technical Report CMU-CALD-02-107. Pittsburghers, USA: Carnegie Mellon University, 2002.
[14] Xie J, Szymanski B K.LabelRank: A Stabilized Label Propagation Algorithm for Community Detection in Networks[C]//Proceedings of 2013 IEEE 2nd Network Science Workshop (NSW). IEEE, 2013: 138-143.
[15] Xing Y, Meng F, Zhou Y, et al.A Node Influence Based Label Propagation Algorithm for Community Detection in Networks[J]. The Scientific World Journal, 2014(5): 210-223.
doi: 10.1155/2014/627581 pmid: 4066938
[16] Sun H, Liu J, Huang J, et al.CenLP: A Centrality-based Label Propagation Algorithm for Community Detection in Networks[J]. Physica A: Statistical Mechanics and Its Applications, 2015, 436(15): 767-780.
doi: 10.1016/j.physa.2015.05.080
[17] Chin J H, Ratnavelu K.Detecting Community Structure by Using a Constrained Label Propagation Algorithm[J]. PLoS One, 2016, 11(5): 214-223.
doi: 10.1371/journal.pone.0155320 pmid: 27176470
[18] Shang R, Zhang W, Jiao L, et al.Circularly Searching Core Nodes Based Label Propagation Algorithm for Community Detection[J]. International Journal of Pattern Recognition and Artificial Intelligence, 2016, 30(8): 165-187.
doi: 10.1142/S0218001416590242
[19] Li W, Huang C, Wang M, et al.Stepping Community Detection Algorithm Based on Label Propagation and Similarity[J]. Physica A: Statistical Mechanics and Its Applications, 2017, 472: 145-155.
doi: 10.1016/j.physa.2017.01.030
[20] Ma T, Xia Z.An Improved Label Propagation Algorithm Based on Node Importance and Random Walk for Community Detection[J]. Modern Physics Letters B, 2017, 31(14): 98-116.
doi: 10.1142/S0217984917501627
[21] Zhang X K, Song C, Jia J, et al.An Improved Label Propagation Algorithm Based on the Similarity Matrix Using Random Walk[J]. International Journal of Modern Physics B, 2016, 30(16): 93-108.
[22] Leskovec J, Kleinberg J, Faloutsos C.Graph Evolution: Densification and Shrinking Diameters[J]. ACM Transactions on Knowledge Discovery from Data (TKDD), 2007, 1(1): 2-9.
doi: 10.1145/1217299
[23] Zhang X K, Fei S, Song C, et al.Label Propagation Algorithm Based on Local Cycles for Community Detection[J]. International Journal of Modern Physics B, 2015, 29(5): 15-28.
doi: 10.1142/S0217979215500290
[24] Liu S, Zhu F, Liu H, et al.A Core Leader Based Label Propagation Algorithm for Community Detection[J]. China Communications, 2016, 13(12): 97-106.
[25] Shang R, Liu H, Jiao L.Multi-objective Clustering Technique Based on K-nodes Update Policy and Similarity Matrix for Mining Communities in Social Networks[J]. Physica A: Statistical Mechanics and Its Applications, 2017, 486: 1-24.
doi: 10.1016/j.physa.2017.05.026
[26] Zhou K, Martin A, Pan Q, et al.Evidential Label Propagation Algorithm for Graphs[C] // Proceedings of the 19th International Conference on Information Fusion, Heidelberg, Germany.IEEE, 2016: 1316-1323.
[27] Lancichinetti A, Fortunato S, Radicchi F.Benchmark Graphs for Testing Community Detection Algorithms[J]. Physical Review E, 2008, 78(4): 46-61.
doi: 10.1103/PhysRevE.78.046110 pmid: 18999496
[28] Danon L, Diaz-Guilera A, Duch J, et al.Comparing Community Structure Identification[J]. Journal of Statistical Mechanics: Theory and Experiment, 2005, 15(9): 98-108.
[29] Zhu M, Meng F, Zhou Y.Semisupervised Clustering for Networks Based on Fast Affinity Propagation[J]. Mathematical Problems in Engineering, 2013: 10-23.
doi: 10.1155/2013/385265
[1] 王鸿, 舒展, 高印权, 田文洪. 一种单分类器联合多任务网络的隐式句间关系分析方法*[J]. 数据分析与知识发现, 2021, 5(11): 80-88.
[2] 吴彦文, 蔡秋亭, 刘智, 邓云泽. 融合多源数据和场景相似度计算的数字资源推荐研究*[J]. 数据分析与知识发现, 2021, 5(11): 114-123.
[3] 李振宇, 李树青. 嵌入隐式相似群的深度协同过滤算法*[J]. 数据分析与知识发现, 2021, 5(11): 124-134.
[4] 董淼, 苏中琪, 周晓北, 兰雪, 崔志刚, 崔雷. 利用Text-CNN改进PubMedBERT在化学诱导性疾病实体关系分类效果的尝试[J]. 数据分析与知识发现, 2021, 5(11): 145-152.
[5] 余传明, 张贞港, 孔令格. 面向链接预测的知识图谱表示模型对比研究*[J]. 数据分析与知识发现, 2021, 5(11): 29-44.
[6] 丁浩, 艾文华, 胡广伟, 李树青, 索炜. 融合用户兴趣波动时序的个性化推荐模型*[J]. 数据分析与知识发现, 2021, 5(11): 45-58.
[7] 华斌, 吴诺, 贺欣. 基于知识融合的政务信息化项目多专家审批意见整合*[J]. 数据分析与知识发现, 2021, 5(10): 124-136.
[8] 王媛, 时恺泽, 牛振东. 一种用于实体关系三元组抽取的位置辅助分步标记方法*[J]. 数据分析与知识发现, 2021, 5(10): 71-80.
[9] 杨辰, 陈晓虹, 王楚涵, 刘婷婷. 基于用户细粒度属性偏好聚类的推荐策略*[J]. 数据分析与知识发现, 2021, 5(10): 94-102.
[10] 戴志宏, 郝晓玲. 上下位关系抽取方法及其在金融市场的应用*[J]. 数据分析与知识发现, 2021, 5(10): 60-70.
[11] 汪雪锋, 任惠超, 刘玉琴. 融合聚类信息的技术主题图可视化方法研究 [J]. 数据分析与知识发现, 0, (): 1-.
[12] 王一钒,李博,史话,苗威,姜斌. 古汉语实体关系联合抽取的标注方法*[J]. 数据分析与知识发现, 2021, 5(9): 63-74.
[13] 车宏鑫,王桐,王伟. 前列腺癌预测模型对比研究*[J]. 数据分析与知识发现, 2021, 5(9): 107-114.
[14] 周阳,李学俊,王冬磊,陈方,彭莉娟. 炸药配方设计知识图谱的构建与可视分析方法研究*[J]. 数据分析与知识发现, 2021, 5(9): 42-53.
[15] 马江微, 吕学强, 游新冬, 肖刚, 韩君妹. 融合BERT与关系位置特征的军事领域关系抽取方法*[J]. 数据分析与知识发现, 2021, 5(8): 1-12.
Viewed
Full text


Abstract

Cited

  Shared   
  Discussed   
版权所有 © 2015 《数据分析与知识发现》编辑部
地址:北京市海淀区中关村北四环西路33号 邮编:100190
电话/传真:(010)82626611-6626,82624938
E-mail:jishu@mail.las.ac.cn