New Technology of Library and Information Service  2009, Vol. 3 Issue (3): 38-45    DOI: 10.11925/infotech.1003-3513.2009.03.07
Semantic Relation Extraction from Socially-generated Tags: A Methodology for Metadata Generation
Miao Chen  Xiaozhong Liu  Jian Qin
(Syracuse University, USA)
Abstract  

The growing predominance of social semantics in the form of tagging presents the metadata community with both opportunities and challenges in leveraging this new form of information content representation and in using it for retrieval. One key challenge is the absence of contextual information associated with the tags. This paper presents an experiment that uses Flickr tags as an example of drawing on social semantics sources to enrich subject metadata. The procedure comprised four steps: 1) collecting a sample of Flickr tags; 2) calculating co-occurrences between tags through mutual information; 3) tracing contextual information for tag pairs via Google search results; and 4) applying natural language processing and machine learning techniques to extract semantic relations between tags. The experiment produced a collection of context sentences from the Google search results, which was then processed by natural language processing and machine learning algorithms. This approach achieved a reasonably good rate of accuracy in assigning semantic relations to tag pairs. The paper also explores the implications of this approach for using social semantics to enrich subject metadata.
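
As a rough illustration of step 2, tag co-occurrence can be scored with pointwise mutual information (PMI). The abstract does not give the exact formula or data layout the authors used, so the sketch below makes assumptions: each photo is a list of tags, probabilities are estimated as fractions of photos, and the function name `tag_pair_pmi`, the `min_count` parameter, and the toy data are all hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def tag_pair_pmi(photos, min_count=1):
    """Return {(tag_a, tag_b): PMI} for pairs co-occurring on >= min_count photos.

    Assumption (not from the paper): PMI = log2(p(a, b) / (p(a) * p(b))),
    with each probability estimated as a fraction of the photo collection.
    """
    n = len(photos)
    tag_counts = Counter()
    pair_counts = Counter()
    for tags in photos:
        unique = sorted(set(tags))  # ignore duplicate tags on one photo
        tag_counts.update(unique)
        # combinations() on a sorted list yields each pair once, alphabetically
        pair_counts.update(combinations(unique, 2))

    scores = {}
    for (a, b), joint in pair_counts.items():
        if joint < min_count:
            continue
        # (joint / n) / ((count_a / n) * (count_b / n)) simplifies to this ratio
        scores[(a, b)] = math.log2(joint * n / (tag_counts[a] * tag_counts[b]))
    return scores


# Hypothetical toy data, not drawn from the paper's Flickr sample.
photos = [
    ["beach", "sunset", "sea"],
    ["beach", "sea"],
    ["city", "night"],
    ["sunset", "city"],
]
scores = tag_pair_pmi(photos)
for pair, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(pair, round(score, 3))
```

Pairs with high scores co-occur more often than their individual frequencies would predict, which makes them plausible candidates for the context lookup in step 3. On a real tag sample, a higher `min_count` would likely be needed, since PMI tends to overrate pairs of rare tags.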

Key words: Relation extraction; Tags; Search engine; Social semantics; Metadata
Received: 09 February 2009      Published: 25 March 2009
ZTFLH: G250
Corresponding Authors: Miao Chen     E-mail: mchen14@syr.edu
About author: Miao Chen, Xiaozhong Liu, Jian Qin

Cite this article:

Miao Chen, Xiaozhong Liu, Jian Qin. Semantic Relation Extraction from Socially-generated Tags: A Methodology for Metadata Generation. New Technology of Library and Information Service, 2009, 3(3): 38-45.

URL:

http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.1003-3513.2009.03.07     OR     http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2009/V3/I3/38

