Data Analysis and Knowledge Discovery  2018, Vol. 2 Issue (10): 84-94    DOI: 10.11925/infotech.2096-3467.2018.0542
Comparing on Community Detection Algorithms for Information Mining
Yunwei Chen1, Ruihong Zhang1,2
1Chengdu Library and Information Center, Chinese Academy of Sciences, Chengdu 610041, China
2University of Chinese Academy of Sciences, Beijing 101408, China
Abstract  

[Objective] This paper compares community detection algorithms used in complex network analysis, aiming to support related studies in information science. [Methods] First, we identified the similarities and differences among several community detection algorithms in terms of their theoretical frameworks and calculation methods. Second, we tested these algorithms on small datasets. Third, we expanded the sample size and evaluated the performance of the Louvain algorithm, the Louvain algorithm with multilevel refinement, and the SLM algorithm on collaboration and citation networks. [Results] On small datasets, the GN and FN algorithms produced similar results, and the SLM algorithm performed better than the Louvain algorithm and the Louvain algorithm with multilevel refinement. In the field of library and information science, setting the resolution to 0.5 can help in analyzing the detection results. The results of the SLM algorithm differed from those of the Louvain algorithm and the Louvain algorithm with multilevel refinement. The latter two produced almost the same results at a resolution of 0.5, but not at a resolution of 1.0. [Limitations] The dataset needs to be expanded. [Conclusions] The Louvain algorithm, the Louvain algorithm with multilevel refinement, and the SLM algorithm perform better than the traditional algorithms. Among them, the SLM algorithm is the best option for analyzing communities in citation networks.

Key words: Complex Network; Community; Collaboration Network; Citation Network
Received: 14 May 2018      Published: 12 November 2018

Cite this article:

Yunwei Chen,Ruihong Zhang. Comparing on Community Detection Algorithms for Information Mining. Data Analysis and Knowledge Discovery, 2018, 2(10): 84-94.

URL:

http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.2096-3467.2018.0542     OR     http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2018/V2/I10/84

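| Algorithm | Type | Core idea | Time complexity (m edges, n nodes) | Modularity (karate club data) |
|---|---|---|---|---|
| GN algorithm | Divisive | Remove the edge with the highest edge betweenness from the network | O(mn²) | approx. 0.4 |
| FN algorithm | Agglomerative | Merge communities in the direction of the largest modularity increase | O((m+n)n) | 0.381 |
| Louvain algorithm | Agglomerative | Modularity-based LMH algorithm | O(n) | 0.4151 |
| Louvain algorithm with multilevel refinement | Agglomerative | Modularity-based LMH algorithm | -- | 0.4198 |
| SLM algorithm | Agglomerative | Modularity-based LMH algorithm | -- | -- |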
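
The modularity column above, and the resolution settings used in the later tables, can be made concrete with a small calculation. The sketch below is not the authors' code. It assumes the common resolution-weighted form of modularity for an undirected, unweighted graph, Q = Σ_c [L_c/m − γ·(d_c/2m)²], where L_c is the number of edges inside community c, d_c is the total degree of its nodes, and γ is the resolution. The function name `modularity` and the six-node toy graph (two triangles joined by one edge) are illustrative assumptions.

```python
from collections import defaultdict


def modularity(edges, community, resolution=1.0):
    """Resolution-weighted modularity of a partition (undirected, unweighted graph).

    edges:      list of (u, v) pairs, each undirected edge listed once
    community:  dict mapping node -> community id
    resolution: gamma; scales the expected-edges (null model) term
    """
    m = len(edges)
    internal = defaultdict(int)    # edges with both ends in the same community
    degree_sum = defaultdict(int)  # total degree of the nodes in each community
    for u, v in edges:
        degree_sum[community[u]] += 1
        degree_sum[community[v]] += 1
        if community[u] == community[v]:
            internal[community[u]] += 1
    return sum(
        internal[c] / m - resolution * (degree_sum[c] / (2 * m)) ** 2
        for c in degree_sum
    )


# Hypothetical toy graph: triangle 0-1-2 and triangle 3-4-5 joined by edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
two_groups = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
one_group = {v: "a" for v in range(6)}

print(modularity(edges, two_groups))       # 5/14, about 0.357
print(modularity(edges, one_group))        # 0.0
print(modularity(edges, two_groups, 0.1))  # lower than the next line
print(modularity(edges, one_group, 0.1))
```

At resolution 1.0 the two-triangle split scores higher than a single community, while at resolution 0.1 the single community scores higher. Under this assumed definition, lowering the resolution favours fewer and larger communities.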
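| Network | Algorithm | Resolution 0.1 | Resolution 0.5 | Resolution 1.0 |
|---|---|---|---|---|
| Unweighted network | Louvain | 7 | 15 | 19 |
| Unweighted network | Louvain with multilevel refinement | 7 | 15 | 19 |
| Unweighted network | SLM | 7 | 16 | 21 |
| Weighted network (by number of co-authored papers) | Louvain | 8 | 13 | 23 |
| Weighted network (by number of co-authored papers) | Louvain with multilevel refinement | 6 | 13 | 22 |
| Weighted network (by number of co-authored papers) | SLM | 6 | 13 | 22 |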
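
The table above varies the resolution parameter across the three algorithms, all of which are built on a local moving heuristic. The following is a minimal sketch of that heuristic alone, assuming an undirected, unweighted graph without self-loops. It is not the Louvain or SLM implementation used in the paper: it omits the aggregation and refinement phases, and the names `build_adjacency` and `local_moving` and the toy graph are hypothetical. Each node is moved in turn to the neighbouring community that gives the largest modularity gain, where the gain of joining community c is proportional to k_in(c) − γ·k·tot(c)/(2m).

```python
def build_adjacency(edges):
    """Adjacency lists (sorted, for reproducible tie-breaking) from an edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, []).append(v)
        adj.setdefault(v, []).append(u)
    return {v: sorted(nbrs) for v, nbrs in adj.items()}


def local_moving(adj, resolution=1.0):
    """Repeat single-node moves until no move increases modularity."""
    nodes = sorted(adj)
    degree = {v: len(adj[v]) for v in nodes}
    two_m = sum(degree.values())
    community = {v: v for v in nodes}  # start with every node on its own
    total = dict(degree)               # total degree per community id
    moved = True
    while moved:
        moved = False
        for v in nodes:
            old = community[v]
            total[old] -= degree[v]    # take v out of its community
            links = {}                 # edges from v to each neighbouring community
            for u in adj[v]:
                links[community[u]] = links.get(community[u], 0) + 1
            # Gain of rejoining the old community (0 if v was alone in it).
            best = old
            best_gain = links.get(old, 0) - resolution * degree[v] * total[old] / two_m
            for c, k_in in links.items():
                gain = k_in - resolution * degree[v] * total[c] / two_m
                if gain > best_gain + 1e-12:  # move only on a strict improvement
                    best, best_gain = c, gain
            community[v] = best
            total[best] += degree[v]
            if best != old:
                moved = True
    return community


# Hypothetical toy graph: two triangles joined by the edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
adj = build_adjacency(edges)
for gamma in (1.0, 100.0):
    labels = local_moving(adj, resolution=gamma)
    print(gamma, len(set(labels.values())))
```

On this toy graph the sketch finds the two triangles at resolution 1.0 and leaves every node in its own community at the deliberately extreme resolution 100, which illustrates the general trend that higher resolution yields more communities.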
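Column headings give resolution (0.5 or 1), network type (U = unweighted, W = weighted), and algorithm (L = Louvain, LM = Louvain with multilevel refinement, SLM = SLM algorithm).

| No. | Institution | Papers | 0.5 U L | 0.5 U LM | 0.5 U SLM | 0.5 W L | 0.5 W LM | 0.5 W SLM | 1 U L | 1 U LM | 1 U SLM | 1 W L | 1 W LM | 1 W SLM |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Katholieke Univ Leuven | 85 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 3 | 0 | 1 | 0 |
| 2 | Hungarian Acad Sci | 85 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 1 | 2 | 0 | 1 | 0 |
| 3 | Natl Inst Sci Technol & Dev Studies | 61 | 0 | 0 | 0 | 9 | 0 | 10 | 7 | 7 | 9 | 13 | 14 | 14 |
| 4 | Leiden Univ | 59 | 9 | 9 | 8 | 2 | 1 | 1 | 2 | 1 | 2 | 8 | 7 | 7 |
| 5 | Csic | 57 | 1 | 1 | 2 | 2 | 1 | 1 | 4 | 6 | 4 | 2 | 3 | 3 |
| 6 | Univ Granada | 35 | 1 | 1 | 2 | 3 | 3 | 3 | 11 | 10 | 11 | 3 | 2 | 2 |
| 7 | Univ Sussex | 30 | 0 | 0 | 0 | 0 | 0 | 0 | 3 | 4 | 3 | 0 | 1 | 0 |
| 8 | Khbo | 26 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 |
| 9 | Univ Amsterdam | 25 | 3 | 3* | 1* | 1 | 2 | 2 | 1 | 2 | 1 | 4 | 9 | 4 |
| 10 | Univ Instelling Antwerp | 24 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 |
| 11 | Wolverhampton Univ | 21 | 3 | 3* | 1* | 1 | 2 | 2 | 1 | 2 | 1 | 4 | 5 | 4 |
| 12 | Royal Sch Lib & Informat Sci | 19 | 5 | 5 | 4 | 6 | 7 | 7 | 5 | 5 | 5 | 6 | 4 | 5 |
| 13 | Res Assoc Sci Commun & Informat Ev | 19 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 1 | 2 | 0 | 1 | 0 |
| 14 | Univ New S Wales | 18 | 6 | 6 | 5 | 3 | 3 | 3 | 0 | 0 | 0 | 1 | 0 | 1 |
| 15 | Limburgs Univ Ctr | 18 | 0 | 0 | 0 | 0 | 0 | 0 | 8 | 8 | 7 | 3 | 2 | 2 |
| 16 | Univ Hasselt | 17 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 |
| 17 | Univ Antwerp | 17 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 |
| 18 | Helsinki Univ Technol | 16 | 0 | 0 | 0 | 0 | 0 | 0 | 3 | 4 | 3 | 0 | 1 | 0 |
| 19 | Univ Fed Rio De Janeiro | 16 | 10 | 10 | 9 | 10 | 10 | 11 | 12 | 12 | 12 | 11 | 10 | 10 |
| 20 | Inra | 15 | 2 | 2 | 3 | 3 | 3 | 3 | 0 | 0 | 0 | 1 | 0 | 1 |
| 21 | Henan Normal Univ | 15 | 0 | 0 | 0 | 0 | 0 | 0 | 10 | 3 | 10 | 3 | 2 | 2 |
| 22 | Karnatak Univ | 14 | 0 | 0 | 0 | 9 | 0 | 10 | 2 | 1 | 2 | 0 | 1 | 0 |
| 23 | Lorand Eotvos Univ | 14 | 0 | 0 | 0 | 0 | 0 | 0 | 7 | 7 | 9 | 13 | 14 | 14 |
| 24 | Umea Univ | 13 | 9 | 9 | 8 | 0 | 0 | 0 | 5 | 5 | 5 | 6 | 4 | 5 |
| 25 | Fraunhofer Inst Syst & Innovat Res | 13 | 4 | 4* | 11* | 5 | 6 | 6 | 4 | 6 | 4 | 2 | 3 | 3 |
| 26 | Bar Ilan Univ | 13 | 1 | 1 | 2 | 2 | 1 | 1 | 2 | 1 | 2 | 0 | 1 | 0 |
| 27 | Univ Tokyo | 13 | 5 | 5 | 4 | 6 | 7 | 7 | 13 | 13 | 15 | 12 | 13 | 13 |
| 28 | Indiana Univ | 12 | 8 | 8 | 7 | 8 | 9 | 8 | 6 | 3 | 6 | 9 | 8 | 8 |
| 29 | Cnrs | 12 | 2 | 2 | 3 | 4 | 5 | 5 | 16 | 15 | 17 | 15 | 16 | 16 |
| 30 | Inst Sci & Tech Informat China | 12 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 |
| 31 | City Univ London | 12 | 2 | 2* | 10* | 11 | 4 | 4 | 14 | 11 | 14 | 5 | 12 | 12 |
| 32 | Max Planck Inst Festkorperforsch | 11 | 0 | 0 | 0 | 0 | 0 | 0 | 4 | 6 | 4 | 2 | 3 | 3 |
| 33 | Drexel Univ | 11 | 4 | 4* | 1* | 5 | 6 | 6 | 3 | 4 | 13 | 10 | 11 | 11 |
| 34 | Univ Politecn Valencia | 11 | 15 | 15 | 16 | 2 | 1 | 1 | 3 | 4 | 13 | 10 | 11 | 11 |
| 35 | Georgia Inst Technol | 11 | 4 | 4* | 1* | 5 | 6 | 6 | 10 | 3 | 10 | 3 | 2 | 2 |
| 36 | Univ Western Ontario | 11 | 8 | 8 | 7 | 8 | 9 | 8 | 15 | 14 | 16 | 14 | 15 | 15 |
| 37 | Hebrew Univ Jerusalem | 11 | 1 | 1 | 2 | 2 | 1 | 1 | 17 | 17 | 18 | 18 | 19 | 19 |
| 38 | Observ Sci & Tech | 11 | 2 | 2 | 3 | 3 | 3 | 3 | 2 | 1 | 2 | 0 | 1 | 0 |
| 39 | Eth | 11 | 13 | 13 | 14 | 12 | 12 | 12 | 19 | 19 | 20 | 2 | 3 | 3 |
| 40 | Univ Carlos Iii Madrid | 11 | 1 | 1 | 2 | 2 | 1 | 1 | 4 | 6 | 4 | 2 | 3 | 3 |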
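
Community ids in the table above are arbitrary labels, so two algorithms can assign different numbers to what is in fact the same grouping. A label-independent comparison is therefore useful when reading the columns side by side. The sketch below is one possible way to do this and is not taken from the paper. The helper names `same_partition` and `rand_index` are hypothetical, and the two lists are the 40 values of the Louvain and SLM columns for the unweighted network at resolution 0.5, with the asterisks dropped. They cover only the 40 listed institutions, not the whole network.

```python
from itertools import combinations


def same_partition(a, b):
    """True if label lists a and b describe the same grouping up to renaming of ids."""
    forward, backward = {}, {}
    for x, y in zip(a, b):
        # Each id in a must map to exactly one id in b, and vice versa.
        if forward.setdefault(x, y) != y or backward.setdefault(y, x) != x:
            return False
    return True


def rand_index(a, b):
    """Fraction of item pairs treated the same way (together or apart) by a and b."""
    pairs = list(combinations(range(len(a)), 2))
    agree = sum((a[i] == a[j]) == (b[i] == b[j]) for i, j in pairs)
    return agree / len(pairs)


# Rows 1-40, unweighted network, resolution 0.5 (asterisks removed).
louvain = [0, 0, 0, 9, 1, 1, 0, 0, 3, 0,
           3, 5, 0, 6, 0, 0, 0, 0, 10, 2,
           0, 0, 0, 9, 4, 1, 5, 8, 2, 0,
           2, 0, 4, 15, 4, 8, 1, 2, 13, 1]
slm = [0, 0, 0, 8, 2, 2, 0, 0, 1, 0,
       1, 4, 0, 5, 0, 0, 0, 0, 9, 3,
       0, 0, 0, 8, 11, 2, 4, 7, 3, 0,
       10, 0, 1, 16, 1, 7, 2, 3, 14, 2]

print(same_partition([0, 0, 1], [5, 5, 2]))  # True: only the ids differ
print(same_partition(louvain, slm))          # False: the groupings differ
print(rand_index(louvain, slm))
```

For these rows most differences between the two columns are pure relabelling (for example 9 versus 8, or 1 versus 2), but a few institutions that Louvain places together are separated by SLM, so the groupings are not identical.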
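| Network | Algorithm | Resolution 0.1 | Resolution 0.5 | Resolution 1.0 |
|---|---|---|---|---|
| Unweighted citation network | Louvain | 2 | 7 | 13 |
| Unweighted citation network | Louvain with multilevel refinement | 2 | 7 | 14 |
| Unweighted citation network | SLM | 2 | 7 | 13 |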
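Column headings give resolution (0.5 or 1.0) and algorithm (L = Louvain, LM = Louvain with multilevel refinement, SLM = SLM algorithm).

| Label n | Times cited | 0.5 L | 0.5 LM | 0.5 SLM | 1.0 L | 1.0 LM | 1.0 SLM |
|---|---|---|---|---|---|---|---|
| 131 | 42 | 1* | 1* | 1* | 0 | 0 | 1 |
| 469 | 29 | 2 | 2 | 4 | 3 | 3 | 3 |
| 59 | 27 | 3 | 3 | 3 | 4 | 4 | 4 |
| 125 | 27 | 5 | 5 | 5 | 5 | 6 | 5 |
| 341 | 26 | 1 | 1 | 2 | 2 | 2 | 2 |
| 79 | 20 | 3 | 3 | 3 | 4 | 4 | 4 |
| 318 | 20 | 1 | 1 | 2 | 2 | 2 | 2 |
| 207 | 18 | 1* | 1* | 1* | 0 | 0 | 1 |
| 303 | 18 | 2 | 2 | 4 | 3 | 3 | 3 |
| 150 | 17 | 2* | 2* | 2* | 2 | 3 | 2 |
| 194 | 17 | 0 | 0 | 0 | 1 | 1 | 0 |
| 364 | 16 | 3 | 3 | 3 | 4 | 4 | 4 |
| 522 | 16 | 1 | 1 | 2 | 2 | 2 | 2 |
| 210 | 15 | 4 | 4 | 1 | 0 | 0 | 1 |
| 315 | 15 | 4 | 4 | 1 | 0 | 0 | 1 |
| 268 | 14 | 2 | 2 | 4 | 3 | 3 | 3 |
| 304 | 13 | 5 | 5 | 5 | 5 | 6 | 5 |
| 72 | 12 | 0 | 0 | 0 | 1 | 1 | 0 |
| 383 | 12 | 0 | 0 | 0 | 1 | 1 | 0 |
| 492 | 12 | 0 | 0 | 0 | 1 | 1 | 0 |
| 29 | 11 | 0 | 0 | 0 | 1 | 1 | 0 |
| 130 | 11 | 5 | 5 | 5 | 5 | 6 | 5 |
| 180 | 11 | 0 | 0 | 0 | 1 | 1 | 0 |
| 259 | 11 | 3 | 3 | 3 | 4 | 4 | 4 |
| 444 | 11 | 6 | 6 | 6 | 6 | 5 | 6 |
| 447 | 10 | 2 | 2 | 4 | 3 | 3 | 3 |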
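| Resolution | Louvain, unweighted | Louvain, weighted | Louvain with multilevel refinement, unweighted | Louvain with multilevel refinement, weighted | SLM, unweighted | SLM, weighted |
|---|---|---|---|---|---|---|
| 0.1 | 123 | 125 | 123 | 125 | 124 | 124 |
| 0.2 | 127 | 130 | 127 | 130 | 131 | 131 |
| 0.3 | 131 | 133 | 132 | 131 | 133 | 135 |
| 0.4 | 135 | 134 | 136 | 135 | 135 | 135 |
| 0.5 | 138 | 139 | 138 | 140 | 138 | 139 |
| 0.6 | 137 | 140 | 140 | 141 | 139 | 141 |
| 0.7 | 141 | 145 | 141 | 142 | 141 | 144 |
| 0.8 | 140 | 143 | 141 | 143 | 143 | 144 |
| 0.9 | 146 | 144 | 144 | 147 | 146 | 146 |
| 1.0 | 144 | 147 | 148 | 148 | 148 | 151 |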