Data Analysis and Knowledge Discovery  2018, Vol. 2 Issue (12): 77-88    DOI: 10.11925/infotech.2096-3467.2018.0358
Clustering Social Tags with Improved DBSCAN Algorithm
Huixiang Xiong, Jiaxin Ye, Wuxuan Jiang
School of Information Management, Central China Normal University, Wuhan 430079, China
Abstract  

[Objective] This paper improves the DBSCAN algorithm and verifies its feasibility and effectiveness for clustering social tags. [Methods] First, we analyzed how frequently each social tag was applied to individual resources and how often it appeared in total. Then, we used the relationship between tags and resources to improve the DBSCAN clustering algorithm. Finally, we applied the new algorithm to cluster tags and users. [Results] We ran our experiment with data from Douban Movies. The modified DBSCAN algorithm improved the inter-object and inter-cluster correlations of social tags. [Limitations] The sample datasets need more in-depth mining. [Conclusions] The improved DBSCAN algorithm could effectively cluster social tags.
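
For orientation, the following is a minimal sketch of standard (unmodified) DBSCAN applied to tags. It is not the improved algorithm proposed in the paper, whose details the abstract does not give. It assumes each tag is represented by a sparse vector of how often it was applied to each resource, and that tags are compared by cosine distance. The tag names, movie identifiers, counts, and the `eps` and `min_pts` values are all hypothetical.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two sparse count vectors (dicts)."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def dbscan(vectors, eps, min_pts):
    """Standard DBSCAN. Returns {name: cluster_id}; -1 marks noise."""
    names = list(vectors)

    def neighbors(p):
        # Every point within eps of p, including p itself.
        return [q for q in names
                if cosine_distance(vectors[p], vectors[q]) <= eps]

    labels = {}
    cluster_id = -1
    for p in names:
        if p in labels:
            continue
        seeds = neighbors(p)
        if len(seeds) < min_pts:
            labels[p] = -1  # provisional noise; may become a border point
            continue
        cluster_id += 1
        labels[p] = cluster_id
        queue = [q for q in seeds if q != p]
        while queue:
            q = queue.pop()
            if labels.get(q) == -1:
                labels[q] = cluster_id  # noise reachable from a core point
            if q in labels:
                continue
            labels[q] = cluster_id
            q_neighbors = neighbors(q)
            if len(q_neighbors) >= min_pts:  # q is a core point: expand
                queue.extend(q_neighbors)
    return labels


# Hypothetical tag -> {resource: frequency} data, for illustration only.
tag_vectors = {
    "sci-fi": {"m1": 5, "m2": 3},
    "space": {"m1": 4, "m2": 2},
    "aliens": {"m1": 3, "m2": 3},
    "romance": {"m3": 6, "m4": 2},
    "love": {"m3": 5, "m4": 3},
    "drama": {"m3": 4, "m4": 4},
    "documentary": {"m5": 1},
}

labels = dbscan(tag_vectors, eps=0.1, min_pts=2)
for tag, cluster in labels.items():
    print(tag, cluster)
```

In this toy data, "drama" is slightly farther than `eps` from "romance" but joins the same cluster through "love", which illustrates the density-reachability that distinguishes DBSCAN from a plain distance threshold. A tag that shares no resources with any other tag is left as noise.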

Key words: DBSCAN, Tag Clustering, User Clustering, Tag Expansion
Received: 30 March 2018      Published: 16 January 2019

Cite this article:

Huixiang Xiong, Jiaxin Ye, Wuxuan Jiang. Clustering Social Tags with Improved DBSCAN Algorithm. Data Analysis and Knowledge Discovery, 2018, 2(12): 77-88.

URL:

http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.2096-3467.2018.0358     OR     http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2018/V2/I12/77
