[Objective] This study explores the relationship between user tags and the topics of microblog posts, with the aim of improving subject identification and automatic tag recommendation services. [Methods] We first used crawlers to retrieve user profiles and posts in the field of “natural language processing” from Sina Weibo. Second, we extracted words from the posts and semantically extended the user tags. Finally, we matched the tags against the posts using the edit distance algorithm. [Results] User tags and posts in the natural language processing field were correlated. [Limitations] We studied only one academic field and one platform, Sina Weibo; more research is needed to generalize the results. [Conclusions] Tag recommendation systems can use microblog posts as an important source for more personalized services, which in turn will improve microblog content analysis.
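The matching step named in the Methods can be illustrated with a minimal sketch. The abstract does not say how edit distances were normalized or thresholded, so the `similarity` normalization, the `match_tags` helper, and the `threshold` value below are assumptions made for illustration, not the authors' implementation.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: the minimum number of single-character
    insertions, deletions, and substitutions that turn a into b."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute or keep
            ))
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Assumed normalization: 1 - distance / length of the longer string."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(a, b) / longest


def match_tags(tags, post_words, threshold=0.6):
    """Hypothetical helper: for each tag, keep the most similar post word
    if its similarity reaches the (assumed) threshold."""
    matches = {}
    for tag in tags:
        best_word, best_score = None, 0.0
        for word in post_words:
            score = similarity(tag, word)
            if score > best_score:
                best_word, best_score = word, score
        if best_word is not None and best_score >= threshold:
            matches[tag] = (best_word, best_score)
    return matches


if __name__ == "__main__":
    tags = ["自然语言处理", "parsing"]
    words = ["自然语言", "parser", "weather"]
    print(match_tags(tags, words, threshold=0.5))
```

Because the distance is computed per character, the same functions apply to Chinese tags and to Latin-script tags without modification.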
朱玲, 薛春香, 章成志, 傅柱. 微博用户标签与博文内容相关度研究[J]. 现代图书情报技术, 2016, 32(3): 18-24.
Zhu Ling, Xue Chunxiang, Zhang Chengzhi, Fu Zhu. User Tags and Microblog Posts: Case Study of Sina Weibo. New Technology of Library and Information Service, 2016, 32(3): 18-24.