New Technology of Library and Information Service  2009, Vol. 3 Issue (3): 38-45     https://doi.org/10.11925/infotech.1003-3513.2009.03.07
Special Topic
Semantic Relation Extraction from Socially-generated Tags: A Methodology for Metadata Generation
Miao Chen  Xiaozhong Liu  Jian Qin
(Syracuse University, USA)
Abstract

The growing predominance of social semantics in the form of tagging presents the metadata community with both opportunities and challenges in leveraging this new form of information content representation and retrieval. One key challenge is the absence of contextual information associated with these tags. This paper presents an experiment that uses Flickr tags as an example of drawing on social semantics sources to enrich subject metadata. The procedure included four steps: 1) collecting a sample of Flickr tags; 2) calculating co-occurrences between tags through mutual information; 3) tracing contextual information of tag pairs via Google search results; and 4) applying natural language processing and machine learning techniques to extract semantic relations between tags. The experiment produced a collection of context sentences from the Google search results, which was then processed by natural language processing and machine learning algorithms. This new approach achieved a reasonably good rate of accuracy in assigning semantic relations to tag pairs. The paper also explores the implications of this approach for using social semantics to enrich subject metadata.

Key words: Relation extraction; Tags; Search engine; Social semantics; Metadata
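
Step 2 of the procedure in the abstract scores tag co-occurrence with mutual information. The abstract gives no formula, so the following is only a minimal sketch under stated assumptions: it uses pointwise mutual information estimated per photo, the function name `tag_pair_pmi`, the `min_count` cutoff, and the toy tag sets are all hypothetical, and it should not be read as the authors' implementation.

```python
import math
from collections import Counter
from itertools import combinations


def tag_pair_pmi(photos, min_count=2):
    """Return {(tag_a, tag_b): PMI} for tag pairs that co-occur on photos.

    photos: iterable of tag collections, one per photo (toy stand-in for a
        Flickr sample; not the data used in the paper).
    min_count: hypothetical cutoff; pairs seen on fewer photos are skipped.
    """
    tag_sets = [set(tags) for tags in photos]
    n = len(tag_sets)
    tag_counts = Counter()
    pair_counts = Counter()
    for tags in tag_sets:
        tag_counts.update(tags)
        # Sort so each unordered pair is always counted under the same key.
        pair_counts.update(combinations(sorted(tags), 2))

    scores = {}
    for (a, b), joint in pair_counts.items():
        if joint < min_count:
            continue
        # PMI = log2( P(a, b) / (P(a) * P(b)) ), probabilities estimated per photo.
        scores[(a, b)] = math.log2(joint * n / (tag_counts[a] * tag_counts[b]))
    return scores


# Toy example: four photos with made-up tags.
sample = [
    ["beach", "sea", "sunset"],
    ["beach", "sea"],
    ["city", "night"],
    ["city", "night", "sunset"],
]
pmi = tag_pair_pmi(sample)
for pair, score in sorted(pmi.items(), key=lambda kv: -kv[1]):
    print(pair, round(score, 3))
```

Pairs with high scores would be the candidates passed on to step 3, where context sentences for each pair are gathered from search results. A count cutoff is included because pointwise mutual information tends to overrate pairs that were seen only once.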
Received: 2009-02-09      Published: 2009-03-25
Chinese Library Classification (ZTFLH): G250
Corresponding author: Miao Chen     E-mail: mchen14@syr.edu
About the authors: Miao Chen, Xiaozhong Liu, Jian Qin
Cite this article:
Miao Chen, Xiaozhong Liu, Jian Qin. Semantic Relation Extraction from Socially-generated Tags: A Methodology for Metadata Generation. New Technology of Library and Information Service, 2009, 3(3): 38-45.
Link to this article:
http://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/10.11925/infotech.1003-3513.2009.03.07      or      http://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/Y2009/V3/I3/38

