Please wait a minute...
Data Analysis and Knowledge Discovery  2018, Vol. 2 Issue (7): 81-88    DOI: 10.11925/infotech.2096-3467.2017.1333
Current Issue | Archive | Adv Search |
A Fuzzy C-Means Algorithm Based on Huffman Tree
Xiao Mansheng, Zhou Lijuan(), Wen Zhicheng
School of Computer Science, Hunan University of Technology, Zhuzhou 412007, China
Download: PDF (3038 KB)   HTML ( 3
Export: BibTeX | EndNote (RIS)      
Abstract  

[Objective] This paper tries to solve the issues facing traditional FCM algorithm, such as randomly choosing initial cluster center, sensitive to noise, and only capable of clustering the equally distributed samples. [Methods] We proposed a new FCM clustering algorithm based on Huffman tree with dissimilarity degree matrix of high density sample sets. The new algorithm could get initial clustering centers, and then generate the membership function of the non-normalized constraint samples. [Results] We examined the proposed algorithm with man-made samples, images, and UCI datasets. The clustering accuracy and the computation time of the new algorithm were better than algorithms based on the Gauss kernel or traditional FCM. [Limitations] The $\beta $ of the sample density adjustment factor was decided by experiment or experience without theoretical supports. [Conclusions] The proposed algorithm could be used for clustering data sets with high level of noise and distributed unequally.

Key wordsDensity of Samples      Dissimilarity Degree      Huffman Tree      Membership     
Received: 28 December 2017      Published: 15 August 2018
ZTFLH:  TP391 G35  

Cite this article:

Xiao Mansheng,Zhou Lijuan,Wen Zhicheng. A Fuzzy C-Means Algorithm Based on Huffman Tree. Data Analysis and Knowledge Discovery, 2018, 2(7): 81-88.

URL:

https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.2096-3467.2017.1333     OR     https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2018/V2/I7/81

样本子集 均值坐标 协方差矩阵 样本数
第一类 (4.5,6.2) 18.7 1.5 200
1.5 19.6
第二类 (10.9,4.6) 2.3 0.3 50
-0.3 1.8
第三类 (10.8,7.6) 2.3 -0.5 50
-0.15 2.5
算法 Max(%) Min(%) Ave(%) Run(ms)
GFCM 51.7 42.6 47.15 33.6
GKFCM 87.40 75.33 81.37 43.4
HFCM 94.5 94.5 94.5 30.3
算法 Max(%) Min(%) Ave(%) Run(ms)
GFCM 82.67 46.33 64.5 123.4
GKFCM 84.66 54.22 69.44 165.5
HFCM 93.67 93.67 93.67 132.4
算法 Max(%) Min(%) Ave(%) Run(ms)
GFCM 88.67 50.33 69.50 3.34
GKFCM 82.62 75.82 79.22 4.88
HFCM 88.67 88.67 88.67 2.76
[1] Chaturvedi K, Patel R, Swami D K.Fuzzy C-Means Based Inference Mechanism for Association Rule Mining: A Clinical Data Mining Approach[J]. International Journal of Advanced Computer Science and Applications, 2015, 6(6): 103-110.
[2] He L H, Wen Y, Wan M, et al.Multi-Channel Features Based Automated Segmentation of Diffusion Tensor Imaging Using an Improved FCM with Spatial Constraints[J]. Neurocomputing, 2014, 137(5): 107-114.
doi: 10.1016/j.neucom.2013.09.051
[3] Li M, Gao Z, Pei Z, et al.Fuzzy Markov Model Based on FCM for Electromagnetic Environment Parameters Prediction[J]. Journal of Information and Computational Science, 2015, 12(5): 1713-1722.
doi: 10.12733/jics20105534
[4] Kalam R, Thomas C, Rahiman M A.Gaussian Kernel Based Fuzzy C-Means Clustering Algorithm for Image Segmentation[J]. Computer Science & Information Technology, 2016, 6(4): 47-56.
doi: 10.5121/csit.2016.60405
[5] Bezdek J C.Fuzzy Mathematics in Pattern Classification PHP Thesis[D]. Applied Math Center, Ithaca, Cornell University, 1973: 67-84.
[6] Dunn J C.A Fuzzy Relative of the ISODATA Process and Its Use in Detecting Compact Well-Separated Clusters[J]. Journal of Cybernetics, 1973(1): 32-57
doi: 10.1080/01969727308546046
[7] Oh S, Ahn C W, Jeon M.A New Evolutionary Approach to Cluster Validation Index[J]. Journal of Computational and Theoretical Nanoscience, 2010, 7(5): 806-812.
doi: 10.1166/jctn.2010.1424
[8] Lin B, Wen W, Liu S, et al.Incremental Kernel Fuzzy C-Means with Optimizing Cluster Center Initialization and Delivery[J]. Kybernetes, 2016, 45(8): 1273-1291.
doi: 10.1108/K-08-2015-0209
[9] Zhang B, Qin S, Wang W, et al.Data Stream Clustering Based on Fuzzy C-Mean Algorithm and Entropy Theory[J]. Signal Processing, 2016: 111-116.
doi: 10.1016/j.sigpro.2015.10.014
[10] Kannan S R, Ramathilagam S, Chung P C.Effective Fuzzy C-Means Clustering Algorithms for Data Clustering Problems[J]. Expert Systems with Applications, 2012, 39(7): 6292-6300.
doi: 10.1016/j.eswa.2011.11.063
[11] Mohamad F, Nosratallah F, Mohammad T.Parameter Optimization of Improved Fuzzy C-Means Clustering Algorithm for Brain MR Image Segmentation[J]. Journal Engineering Applications of Artificial Intelligence, 2010, 23(2): 160-168.
doi: 10.1016/j.engappai.2009.10.002
[12] Usman Q.A Dissimilarity Measure Based Fuzzy C-Means (FCM) Clustering Algorithm[J]. Journal of Intelligent & Fuzzy Systems: Applications in Engineering and Technology, 2014, 26(1): 229-238.
doi: 10.3233/IFS-120730
[13] 孟海东, 马娜娜, 宋宇晨, 等. 基于密度函数加权的模糊C均值聚类算法研究[J]. 计算机工程与应用, 2012, 48(27): 123-127.
doi: 10.3778/j.issn.1002-8331.2012.27.026
[13] (Meng Haidong, Ma Nana, Song Yuchen, et al.Research on Fuzzy C-Means Clustering Algorithm Based on Density Function Weighted[J]. Computer Engineering and Applications, 2012, 48(27): 123-127.)
doi: 10.3778/j.issn.1002-8331.2012.27.026
[14] Zhu C J, Yang S Z, Zhao Q, et al.Robust Semi-Supervised Kernel-FCM Algorithm Incorporating Local Spatial Information for Remote Sensing Image Classification[J]. Journal of the Indian Society of Remote Sensing, 2014, 42(1): 35-49.
doi: 10.1007/s12524-013-0296-x
[15] Zhang Q, Lu J, Wei H, et al.Dynamic Hand Gesture Segmentation Method Based on Unequai-Probabilities Background Difference and Improved FCM Algorithm[J]. International Journal of Innovative Computing, Information and Control, 2015, 11(5): 1823-1834
[16] Siarry P, Oulhadj H.A Benaichouche Multiobjective Improved Spatial Fuzzy C-Means Clustering for Image Segmentation Combining Pareto-Optimal Clusters[J]. Journal of Heuristics, 2016, 22(4): 1-22.
doi: 10.1007/s10732-015-9300-7
[17] Bache K, Lichman M. UCI Machine Learning Repository [EB/OL]. [2017-10-10]. .
[1] Qian Xiaodong,Li Min. Identifying E-commerce User Types Based on Complex Network Overlapping Community[J]. 数据分析与知识发现, 2018, 2(6): 79-91.
[2] Liu Kaidi,Pang Yanjun,Sui Huiqiang. A New Algorithm of Fuzzy Synthetic Evaluation on IS Service Quality[J]. 现代图书情报技术, 2008, 24(5): 89-93.
[3] Zhang Ting. The Design and Reality of Fuzzy Query Tool Based on Exact Data[J]. 现代图书情报技术, 2004, 20(12): 36-39.
  Copyright © 2016 Data Analysis and Knowledge Discovery   Tel/Fax:(010)82626611-6626,82624938   E-mail:jishu@mail.las.ac.cn