%A Gong Kaile, Cheng Ying, Sun Jianjun
%T Clustering Blog Posts with Co-occurrence Analysis
%0 Journal Article
%D 2016
%J Data Analysis and Knowledge Discovery
%R 10.11925/infotech.1003-3513.2016.10.06
%P 50-58
%V 32
%N 10
%U {https://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/abstract/article_4280.shtml}
%8 2016-10-25
%X

[Objective] This study investigates the co-occurrence of blog comment contributors and explores its role in clustering blog posts. [Methods] We developed a two-step clustering method. First, we constructed a co-occurrence matrix of the comment contributors across blog posts, transformed it into a correlation matrix, and performed the first-step clustering with the Affinity Propagation (AP) algorithm. Second, we calculated the position weights of terms based on the AP cluster centers and performed the second-step clustering of blog post content with the K-means algorithm. [Results] The average precision and recall of the proposed method were 0.66 and 0.57, respectively, both significantly higher than those of traditional methods. [Limitations] Contributor co-occurrence improved clustering quality, but it is of limited value for blog posts with few comments. [Conclusions] The proposed method improves the quality of blog post clustering by combining term information with contributor co-occurrence. The two-step approach is also a better way to select the initial cluster centers for the K-means algorithm.
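
The two-step idea described in the abstract can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation, and it rests on several assumptions: the co-occurrence matrix is taken to be post-by-post (counting commenters shared by two posts), the diagonal is set to each post's own commenter count, the correlation is Pearson correlation between matrix rows, and the second step is reduced to a plain K-means seeded with given centers. The AP step and the term position weighting are omitted; in practice a library routine such as scikit-learn's `AffinityPropagation` with `affinity="precomputed"` could be applied to the correlation matrix to obtain the centers. All function names and the toy data are hypothetical.

```python
from itertools import combinations
from math import sqrt


def cooccurrence_matrix(commenters_by_post):
    """Build a post-by-post matrix of shared commenters.

    Assumption: the diagonal holds each post's own commenter count.
    """
    posts = sorted(commenters_by_post)
    n = len(posts)
    matrix = [[0] * n for _ in range(n)]
    for i in range(n):
        matrix[i][i] = len(commenters_by_post[posts[i]])
    for i, j in combinations(range(n), 2):
        shared = len(commenters_by_post[posts[i]] & commenters_by_post[posts[j]])
        matrix[i][j] = matrix[j][i] = shared
    return posts, matrix


def pearson(x, y):
    """Pearson correlation of two equal-length vectors (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y) if norm_x and norm_y else 0.0


def correlation_matrix(matrix):
    """Correlate every pair of rows of the co-occurrence matrix."""
    n = len(matrix)
    return [[pearson(matrix[i], matrix[j]) for j in range(n)] for i in range(n)]


def squared_distance(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))


def seeded_kmeans(vectors, seeds, iterations=20):
    """Plain K-means whose initial centers are supplied by the caller
    (for example, the centers found by a first clustering step)."""
    centers = [list(s) for s in seeds]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assign each vector to its nearest center.
        labels = [
            min(range(len(centers)), key=lambda k: squared_distance(v, centers[k]))
            for v in vectors
        ]
        # Move each center to the mean of its members (empty clusters stay put).
        for k in range(len(centers)):
            members = [v for v, lab in zip(vectors, labels) if lab == k]
            if members:
                centers[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Toy data (hypothetical): which users commented on which post.
commenters_by_post = {
    "p1": {"ann", "bob", "cat"},
    "p2": {"ann", "bob"},
    "p3": {"dan", "eve"},
    "p4": {"dan", "eve", "fay"},
}
posts, cooc = cooccurrence_matrix(commenters_by_post)
corr = correlation_matrix(cooc)

# Toy term vectors for the same posts, seeded with p1 and p3 as initial centers.
term_vectors = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
labels = seeded_kmeans(term_vectors, seeds=[term_vectors[0], term_vectors[2]])
```

In this toy setting, posts that share commenters (p1 with p2, p3 with p4) have positively correlated rows, while posts with disjoint audiences correlate negatively. Supplying the initial centers, rather than choosing them at random, is what makes the K-means result deterministic here.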