New Technology of Library and Information Service  2015, Vol. 31 Issue (10): 2-12    DOI: 10.11925/infotech.1003-3513.2015.10.02
Special Topic
Automatic Quality Evaluation of Social Tags
Zhang Chengzhi 1,2, Li Lei 1
1 School of Economics & Management, Nanjing University of Science and Technology, Nanjing 210094, China;
2 Jiangsu Key Laboratory of Data Engineering and Knowledge Service (Nanjing University), Nanjing 210093, China
Abstract

[Objective] To automatically evaluate the large volume of user-assigned tags so that high-quality tags can be selected or recommended automatically, improving the effectiveness of social tagging applications. [Methods] Existing studies on tag quality evaluation treat the content attributes and the social attributes of tags separately and do not combine multiple attributes in a comprehensive evaluation. This paper therefore takes blog post tags as its research object, integrates the content and social attributes of social tags, and uses statistical machine learning models to evaluate tag quality automatically. [Results] Experimental results show that, when content and social attribute features are combined, the tag quality evaluation model based on the support vector machine (SVM) clearly outperforms the multiple regression and Naive Bayes models. [Limitations] Only tag data from ScienceNet blog posts are used; the site's social functions are not yet fully developed, and some social attributes do not effectively improve the automatic classification of social tag quality. [Conclusions] This work lays a foundation for further improving the organization and application quality of social tags.

Received: 2015-07-21
CLC number: G350
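Funding:

This work is one of the research outcomes of the National Social Science Fund of China project "Research on User-Based Knowledge Organization Patterns in Online Social Networks" (No. 14BTQ033), the Ministry of Education Humanities and Social Sciences Planning Fund project "Research on the Generation and Clustering of Multilingual High-Quality Social Tags" (No. 13YJA870020), and the National Social Science Fund of China major project "Research on a Rapid-Response Intelligence System for Emergency Decision-Making" (No. 13&ZD174).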

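Corresponding author: Zhang Chengzhi, ORCID: 0000-0001-8121-4796, E-mail: zhangcz@njust.edu.cn
Author contributions: Zhang Chengzhi proposed the research idea, discussed the research plan, collected the data, and drafted and revised the final version of the paper; Li Lei designed the research plan, designed and carried out the experiments, and analyzed the data.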
Cite this article:
Zhang Chengzhi, Li Lei. Automatic Quality Evaluation of Social Tags [J]. New Technology of Library and Information Service, 2015, 31(10): 2-12. DOI: 10.11925/infotech.1003-3513.2015.10.02.
Link to this article:
http://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/10.11925/infotech.1003-3513.2015.10.02
