Please wait a minute...
Data Analysis and Knowledge Discovery  2018, Vol. 2 Issue (10): 84-94    DOI: 10.11925/infotech.2096-3467.2018.0542
Current Issue | Archive | Adv Search |
Comparing on Community Detection Algorithms for Information Mining
Chen Yunwei1(), Zhang Ruihong1,2
1Chengdu Library and Information Center, Chinese Academy of Sciences, Chengdu 610041, China
2University of Chinese Academy of Sciences, Beijing 101408, China
Download: PDF (1869 KB)   HTML ( 1
Export: BibTeX | EndNote (RIS)      
Abstract  

[Objective] This paper compares community detection algorithms in the field of complex network analysis, aiming to support related information science studies. [Methods] First, we identified the similarities and differences of several community detection algorithms (i.e. theoretical frameworks and calculation methods). Then, we examined these algorithms with small data sets. Third, we expanded the sample size, and evaluated the performance of Louvain algorithm, Louvain algorithm with multilevel refinement, and the SLM algorithm with the collaboration and citation networks. [Results] On small dataset, the detection results of GN and FN algorithms were similar, and the results of SLM algorithm were better than those of the Louvain algorithm and Louvain algorithm with multilevel refinement. In the field of library and information science, setting the resolution at 0.5 could help us analyze the detection results. The results of SLM algorithm were different to those of the Louvain algorithm or Louvain algorithm with multilevel refinement. Results of the latter two were almost the same, which were different with the resolution of 1.0. [Limitations] The dataset needs to be expanded. [Conclusions] The Louvain algorithm, Louvain algorithm with multilevel refinement and SLM algorithm are better than traditional algorithms. Among them, the SLM algorithm is the best option for us to analyze the community of citation network.

Key wordsComplex Network      Community      Collaboration Network      Citation Network     
Received: 14 May 2018      Published: 12 November 2018
ZTFLH:  G350  

Cite this article:

Chen Yunwei,Zhang Ruihong. Comparing on Community Detection Algorithms for Information Mining. Data Analysis and Knowledge Discovery, 2018, 2(10): 84-94.

URL:

https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.2096-3467.2018.0542     OR     https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2018/V2/I10/84

算法 类型 核心思想 时间复杂度
(m条边, n个节点)
模块度
(针对空手道俱乐部数据)
GN算法 分裂法 将边介数最高的边从网络中移除 O(mn2) 0.4左右
FN算法 聚合法 将社团向着模块度增量最多的方向合并 O((m+n)n) 0.381
Louvain算法 聚合法 基于模块度的LMH算法 O(n) 0.4151
Louvain多级细分算法 聚合法 基于模块度的LMH算法 -- 0.4198
SLM算法 聚合法 基于模块度的LMH算法 -- --
网络 算法 分辨率
0.1 0.5 1.0
无权网络 Louvain 7 15 19
Louvain多级细分 7 15 19
SLM 7 16 21
加权网络
(基于合作论文数)
Louvain 8 13 23
Louvain多级细分 6 13 22
SLM 6 13 22
序号 机构 论文数 分辨率=0.5 分辨率=1
未加权 加权 未加权 加权
L LM SLM L LM SLM L LM SLM L LM SLM
1 Katholieke Univ Leuven 85 0 0 0 0 0 0 0 0 3 0 1 0
2 Hungarian Acad Sci 85 0 0 0 0 0 0 2 1 2 0 1 0
3 Natl Inst Sci Technol &
Dev Studies
61 0 0 0 9 0 10 7 7 9 13 14 14
4 Leiden Univ 59 9 9 8 2 1 1 2 1 2 8 7 7
5 Csic 57 1 1 2 2 1 1 4 6 4 2 3 3
6 Univ Granada 35 1 1 2 3 3 3 11 10 11 3 2 2
7 Univ Sussex 30 0 0 0 0 0 0 3 4 3 0 1 0
8 Khbo 26 0 0 0 0 0 0 0 0 0 1 0 1
9 Univ Amsterdam 25 3 3* 1* 1 2 2 1 2 1 4 9 4
10 Univ Instelling Antwerp 24 0 0 0 0 0 0 0 0 0 1 0 1
11 Wolverhampton Univ 21 3 3* 1* 1 2 2 1 2 1 4 5 4
12 Royal Sch Lib & Informat Sci 19 5 5 4 6 7 7 5 5 5 6 4 5
13 Res Assoc Sci Commun
& Informat Ev
19 0 0 0 0 0 0 2 1 2 0 1 0
14 Univ New S Wales 18 6 6 5 3 3 3 0 0 0 1 0 1
15 Limburgs Univ Ctr 18 0 0 0 0 0 0 8 8 7 3 2 2
16 Univ Hasselt 17 0 0 0 0 0 0 0 0 0 1 0 1
17 Univ Antwerp 17 0 0 0 0 0 0 0 0 0 1 0 1
18 Helsinki Univ Technol 16 0 0 0 0 0 0 3 4 3 0 1 0
19 Univ Fed Rio De Janeiro 16 10 10 9 10 10 11 12 12 12 11 10 10
20 Inra 15 2 2 3 3 3 3 0 0 0 1 0 1
21 Henan Normal Univ 15 0 0 0 0 0 0 10 3 10 3 2 2
22 Karnatak Univ 14 0 0 0 9 0 10 2 1 2 0 1 0
23 Lorand Eotvos Univ 14 0 0 0 0 0 0 7 7 9 13 14 14
24 Umea Univ 13 9 9 8 0 0 0 5 5 5 6 4 5
25 Fraunhofer Inst Syst &
Innovat Res
13 4 4* 11* 5 6 6 4 6 4 2 3 3
26 Bar Ilan Univ 13 1 1 2 2 1 1 2 1 2 0 1 0
27 Univ Tokyo 13 5 5 4 6 7 7 13 13 15 12 13 13
28 Indiana Univ 12 8 8 7 8 9 8 6 3 6 9 8 8
29 Cnrs 12 2 2 3 4 5 5 16 15 17 15 16 16
30 Inst Sci & Tech Informat China 12 0 0 0 0 0 0 0 0 0 0 1 0
31 City Univ London 12 2 2* 10* 11 4 4 14 11 14 5 12 12
32 Max Planck Inst Festkorperforsch 11 0 0 0 0 0 0 4 6 4 2 3 3
33 Drexel Univ 11 4 4* 1* 5 6 6 3 4 13 10 11 11
34 Univ Politecn Valencia 11 15 15 16 2 1 1 3 4 13 10 11 11
35 Georgia Inst Technol 11 4 4* 1* 5 6 6 10 3 10 3 2 2
36 Univ Western Ontario 11 8 8 7 8 9 8 15 14 16 14 15 15
37 Hebrew Univ Jerusalem 11 1 1 2 2 1 1 17 17 18 18 19 19
38 Observ Sci & Tech 11 2 2 3 3 3 3 2 1 2 0 1 0
39 Eth 11 13 13 14 12 12 12 19 19 20 2 3 3
40 Univ Carlos Iii Madrid 11 1 1 2 2 1 1 4 6 4 2 3 3
网络 算法 分辨率
0.1 0.5 1.0
无权引文网络 Louvain 2 7 13
Louvain多级细分 2 7 14
SLM 2 7 13
标签n 被引频次 分辨率=0.5 分辨率=1.0 标签n 被引频次 分辨率=0.5 分辨率=1.0
L LM SLM L LM SLM L LM SLM L LM SLM
131 42 1* 1* 1* 0 0 1 210 15 4 4 1 0 0 1
469 29 2 2 4 3 3 3 315 15 4 4 1 0 0 1
59 27 3 3 3 4 4 4 268 14 2 2 4 3 3 3
125 27 5 5 5 5 6 5 304 13 5 5 5 5 6 5
341 26 1 1 2 2 2 2 72 12 0 0 0 1 1 0
79 20 3 3 3 4 4 4 383 12 0 0 0 1 1 0
318 20 1 1 2 2 2 2 492 12 0 0 0 1 1 0
207 18 1* 1* 1* 0 0 1 29 11 0 0 0 1 1 0
303 18 2 2 4 3 3 3 130 11 5 5 5 5 6 5
150 17 2* 2* 2* 2 3 2 180 11 0 0 0 1 1 0
194 17 0 0 0 1 1 0 259 11 3 3 3 4 4 4
364 16 3 3 3 4 4 4 444 11 6 6 6 6 5 6
522 16 1 1 2 2 2 2 447 10 2 2 4 3 3 3
分辨率 Louvain Louvain多级细分 SLM
未加权 加权 未加权 加权 未加权 加权
0.1 123 125 123 125 124 124
0.2 127 130 127 130 131 131
0.3 131 133 132 131 133 135
0.4 135 134 136 135 135 135
0.5 138 139 138 140 138 139
0.6 137 140 140 141 139 141
0.7 141 145 141 142 141 144
0.8 140 143 141 143 143 144
0.9 146 144 144 147 146 146
1.0 144 147 148 148 148 151
[1] Fortunato S, Castellano C. Community Structure in Graphs [OL]. [2009-03-10]. .
[2] Kernighan B W, Lin S.An Efficient Heuristic Procedure for Partitioning Graphs[J]. Bell System Technical Journal, 1970, 49(2): 291-307.
doi: 10.1002/bltj.1970.49.issue-2
[3] Fildler M.Algebraic Connectivity of Graphs[J]. Czechoslovak Mathematical Journal, 1973, 23(98): 298-305.
[4] Phothen A, Simon H D,Liou K P.Partitioning Sparse Matrices with Eigenvectors of Graphs[J]. SIAM Journal on Matrix Analysis and Applications, 1990, 11(3): 430-452.
doi: 10.1137/0611030
[5] Boccaletti S, Latora V, Moreno Y, et al.Complex Networks: Structure and Dynamics[J]. Physics Reports, 2006, 424(4-5): 175-308.
doi: 10.1016/j.physrep.2005.10.009
[6] 时京晶. 三种经典复杂网络社区结构划分算法研究[J]. 电脑与信息技术, 2011, 19(4): 42-43, 79.
doi: 10.3969/j.issn.1005-1228.2011.04.014
[6] (Shi Jingjing.The Research of Three Typical Community Detection Algorithmsin Complex Networks[J]. Computer and Information Technology, 2011, 19(4): 42-43, 79.)
doi: 10.3969/j.issn.1005-1228.2011.04.014
[7] Girvan M, Newman M E J. Community Structure in Social and Biological Networks[J]. PNAS, 2002, 99(12): 7821-7826.
doi: 10.1073/pnas.122653799 pmid: 12060727
[8] Newman M E J. Fast Algorithm for Detecting Community Structure in Networks[J]. Physical Review E, 2004, 69(6): 066133.
doi: 10.1103/PhysRevE.69.066133
[9] Newman M E J, Girvan M. Finding and Evaluating Community Structure in Networks[J]. Physical Review E, 2004, 69(2): 026113.
doi: 10.1103/PhysRevE.69.026113
[10] Blondel V D, Guillaume J L, Lambiotte R, et al.Fast Unfolding of Communities in Large Networks[J]. Journal of Statistical Mechanics: Theory and Experiment, 2008(10): P10008.
[11] Rotta R, Noack A. Multilevel Local Search Algorithms for Modularity Clustering [J]. Journal of Experimental Algorithmics, 2011, 16(2): Article No. 2.3.
doi: 10.1145/1963190.1970376
[12] Waltman L, Jan Van Eck N J. A Smart Local Moving Algorithm for Large-scale Modularity-based Community Detection[J]. The European Physical Journal B, 2013, 86(11): 471.
doi: 10.1140/epjb/e2013-40829-0
[13] 吴卫江, 李沐南, 李国和. Louvain算法的并行化处理[J]. 计算机与数字工程, 2016, 44(8): 1402-1406.
[13] (Wu Weijiang, Li Munan, Li Guohe.Parallel Processing of the Louvain Algorithm[J]. Computer & Digital Engineering, 2016, 44(8): 1402-1406.)
[14] 吴祖峰, 王鹏飞, 秦志光, 等. 改进的Louvain社团划分算法[J]. 电子科技大学学报, 2013, 42(1): 105-108.
doi: 10.3969/j.issn.1001-0548.2012.06.022
[14] (Wu Zufeng, Wang Pengfei, Qin Zhiguang, et al.Improved Algorithm of Louvain Communities Dipartition[J]. Journal of University of Electronic Science and Technology of China, 2013, 42(1): 105-108.)
doi: 10.3969/j.issn.1001-0548.2012.06.022
[15] 夏玮, 杨鹤标. 改进的Louvain算法及其在推荐领域的研究[J]. 信息技术, 2017(11): 125-128.
doi: 10.13274/j.cnki.hdzj.2017.11.032
[15] (Xia Wei, Yang Hebiao.Optimization of Louvain Algorithm and Its Application in Personalized Recommendation[J]. Information Technology, 2017(11): 125-128.)
doi: 10.13274/j.cnki.hdzj.2017.11.032
[16] Zachary W W.An Information Flow Model for Conflict and Fission in Small Groups[J]. Journal of Anthropological Research, 1977, 33(4): 452-473.
[17] Chen P, Redner S.Community Structure of the Physical Review Citation Network[J]. Journal of Informetrics, 2010, 4(3): 278-290.
doi: 10.1016/j.joi.2010.01.001
[18] Newman M E J. Scientific Collaboration Networks. II. Shortest Paths, Weighted Networks, and Centrality[J]. Physical Review E, 2001, 64(1): 016132.
doi: 10.1103/PhysRevE.64.016132 pmid: 11461356
[19] Chen Y W, Börner K, Fang S.Evolving Collaboration Networks in Scientometrics in 1978-2010: A Micro-Macro Analysis[J]. Scientometrics, 2013, 95(3): 1051-1070.
doi: 10.1007/s11192-012-0895-2
[20] 陈云伟. 引文网络演化研究进展分析[J]. 情报科学, 2016, 34(8): 171-176.
[20] (Chen Yunwei.Development of Evolving Citation Network Analysis[J]. Information Science, 2016, 34(8): 171-176.)
[1] Shan Xiaohong,Wang Chunwen,Liu Xiaoyan,Han Shengxi,Yang Juan. Identifying Lead Users in Open Innovation Community from Knowledge-based Perspectives[J]. 数据分析与知识发现, 2021, 5(9): 85-96.
[2] Lu Yunmeng,Liu Tiezhong. Diffusion Model for Tacit Knowledge of Scientific Cooperation Network Based on Relevance: Case Study of Major Sci-Tech Projects[J]. 数据分析与知识发现, 2021, 5(9): 10-20.
[3] Chen Wenjie,Wen Yi,Yang Ning. Fuzzy Overlapping Community Detection Algorithm Based on Node Vector Representation[J]. 数据分析与知识发现, 2021, 5(5): 41-50.
[4] Li Ming, Li Ying, Zhou Qing, Wang Jun. Analyzing Knowledge Demand and Supply of Community Question Answering with TF-PIDF[J]. 数据分析与知识发现, 2021, 5(2): 106-115.
[5] Shen Wang, Li Shiyu, Liu Jiayu, Li He. Optimizing Quality Evaluation for Answers of Q&A Community[J]. 数据分析与知识发现, 2021, 5(2): 83-93.
[6] Wang Song, Yang Yang, Liu Xinmin. Discovering Potentialities of User Ideas from Open Innovation Communities with Graph Attention Network[J]. 数据分析与知识发现, 2021, 5(11): 89-101.
[7] Cai Yongming,Liu Lu,Wang Kewei. Identifying Key Users and Topics from Online Learning Community[J]. 数据分析与知识发现, 2020, 4(6): 69-79.
[8] Tao Xing,Zhang Xiangxian,Guo Shunli,Zhang Liman. Automatic Summarization of User-Generated Content in Academic Q&A Community Based on Word2Vec and MMR[J]. 数据分析与知识发现, 2020, 4(4): 109-118.
[9] Li Wenzheng,Gu Yijun,Yan Hongli. Predicting Community Numbers with Network Bayesian Information Criterion[J]. 数据分析与知识发现, 2020, 4(4): 72-82.
[10] Li He,Liu Jiayu,Shen Wang,Liu Rui,Jin Shuaiqi. Recommending Knowledge for Online Health Community Users Based on Fuzzy Cognitive Map[J]. 数据分析与知识发现, 2020, 4(12): 55-67.
[11] Ding Heng,Li Yingxuan. Improving Online Q&A Service with Deep Learning[J]. 数据分析与知识发现, 2020, 4(10): 37-46.
[12] Liqing Qiu,Wei Jia,Xin Fan. Influence Maximization Algorithm Based on Overlapping Community[J]. 数据分析与知识发现, 2019, 3(7): 94-102.
[13] Ming Yi,Tingting Zhang. Ranking Answer Quality of Popular Q&A Community[J]. 数据分析与知识发现, 2019, 3(6): 12-20.
[14] Xiaolan Wu,Chengzhi Zhang. Analysis of Knowledge Flow Based on Academic Social Networks:
A Case Study of ScienceNet.cn
[J]. 数据分析与知识发现, 2019, 3(4): 107-116.
[15] Quan Lu,Anqi Zhu,Jiyue Zhang,Jing Chen. Research on User Information Requirement in Chinese Network Health Community: Taking Tumor-forum Data of Qiuyi as an Example[J]. 数据分析与知识发现, 2019, 3(4): 22-32.
  Copyright © 2016 Data Analysis and Knowledge Discovery   Tel/Fax:(010)82626611-6626,82624938   E-mail:jishu@mail.las.ac.cn