New Technology of Library and Information Service  2015, Vol. 31 Issue (10): 30-39    DOI: 10.11925/infotech.1003-3513.2015.10.05
Combined with Annotated Content and User Attributes for Tag Clustering
Gu Xiaoxue1, Zhang Chengzhi1,2
1 School of Economics & Management, Nanjing University of Science and Technology, Nanjing 210094, China;
2 Jiangsu Key Laboratory of Data Engineering and Knowledge Service (Nanjing University), Nanjing 210093, China
[Objective] To explore how tags' annotated content, tags' user attributes, and the combination of the two affect tag clustering. [Methods] Using blog data, we extract tag features, build a vector space model, and calculate the similarities between tags, with a linear method and a Sigmoid method used to weight the two types of features; the AP (affinity propagation) algorithm is then used to cluster the tags. [Results] Experimental evaluation shows that, under the subject classification, combining annotated content with user attributes improves the clustering results for both weighting methods, with the Sigmoid method performing best; under the systematic classification, the combination does not perform as well as in the former case and is even worse than the content feature alone. [Limitations] The data set used in the experiment is small, and the classification used to evaluate the clustering results is imperfect. Moreover, the AP clustering algorithm is not well suited to large-scale data. [Conclusions] Combining the two features can improve tag clustering results in some cases, and tag clustering should focus more on the tags' content.
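
The pipeline described under [Methods] can be illustrated with a minimal Python sketch. The abstract does not give the weighting formulas, so everything below is an assumption made for illustration: tags are represented as sparse vectors, similarity is cosine similarity, the linear method is taken to be a weighted sum with a hypothetical parameter alpha, and the Sigmoid method is taken to be a logistic function applied to that weighted sum with a hypothetical steepness parameter. The toy tags and weights are invented and are not the paper's data.

```python
import math


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(k, 0.0) for k, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def linear_combine(s_content, s_user, alpha=0.5):
    """Assumed linear weighting: alpha on content, (1 - alpha) on users."""
    return alpha * s_content + (1.0 - alpha) * s_user


def sigmoid_combine(s_content, s_user, alpha=0.5, steepness=10.0):
    """Assumed Sigmoid weighting: logistic squashing of the linear score.

    The centring at 0.5 and the steepness value are illustrative choices,
    not taken from the paper.
    """
    z = linear_combine(s_content, s_user, alpha)
    return 1.0 / (1.0 + math.exp(-steepness * (z - 0.5)))


# Toy data (invented): content terms and annotating users per tag.
content_vectors = {
    "python": {"code": 2.0, "script": 1.0},
    "java": {"code": 2.0, "jvm": 1.0},
    "travel": {"trip": 3.0},
}
user_vectors = {
    "python": {"u1": 1.0, "u2": 1.0},
    "java": {"u1": 1.0},
    "travel": {"u3": 1.0},
}

tags = sorted(content_vectors)

# Pairwise combined similarity matrix using the assumed linear weighting.
similarity = [
    [
        linear_combine(
            cosine(content_vectors[a], content_vectors[b]),
            cosine(user_vectors[a], user_vectors[b]),
        )
        for b in tags
    ]
    for a in tags
]
```

A matrix like `similarity` could then be passed to an affinity propagation implementation that accepts precomputed similarities, for example scikit-learn's `AffinityPropagation` with `affinity="precomputed"`; whether the authors used that library is not stated in the abstract.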

Received: 29 April 2015      Published: 06 April 2016
Gu Xiaoxue, Zhang Chengzhi. Combined with Annotated Content and User Attributes for Tag Clustering. New Technology of Library and Information Service, 2015, 31(10): 30-39.

