Semantic Relation Extraction from Socially-generated Tags: A Methodology for Metadata Generation
Miao Chen, Xiaozhong Liu, Jian Qin
(Syracuse University, USA)
Abstract: The growing predominance of social semantics in the form of tagging presents the metadata community with both opportunities and challenges in leveraging this new form of information content representation for retrieval. One key challenge is the absence of contextual information associated with these tags. This paper presents an experiment that uses Flickr tags as an example of drawing on social semantics sources to enrich subject metadata. The procedure included four steps: 1) collecting a sample of Flickr tags, 2) calculating co-occurrences between tags through mutual information, 3) tracing contextual information for tag pairs via Google search results, and 4) applying natural language processing and machine learning techniques to extract semantic relations between tags. The experiment produced a collection of context sentences from the Google search results, which was then processed by natural language processing and machine learning algorithms. This approach achieved a reasonably good rate of accuracy in assigning semantic relations to tag pairs. The paper also explores the implications of this approach for using social semantics to enrich subject metadata.
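As an illustration of step 2, the sketch below scores tag pairs by pointwise mutual information (PMI) over a toy set of tagged photos. This is a minimal sketch under stated assumptions, not the authors' implementation: the abstract says only that co-occurrence was calculated "through mutual information", so the PMI formulation, the function name `tag_pair_pmi`, and the sample data are all hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def tag_pair_pmi(photos):
    """Return {(tag_a, tag_b): PMI} for every tag pair that co-occurs on a photo.

    Assumption: probabilities are estimated as fractions of photos, and the
    score is pointwise mutual information, log2(P(a, b) / (P(a) * P(b))).
    """
    n = len(photos)
    tag_counts = Counter()
    pair_counts = Counter()
    for tags in photos:
        unique = sorted(set(tags))  # count each tag once per photo
        tag_counts.update(unique)
        pair_counts.update(combinations(unique, 2))

    scores = {}
    for (a, b), n_ab in pair_counts.items():
        p_ab = n_ab / n
        p_a = tag_counts[a] / n
        p_b = tag_counts[b] / n
        scores[(a, b)] = math.log2(p_ab / (p_a * p_b))
    return scores


# Hypothetical sample data: each inner list is the tag set of one photo.
photos = [
    ["beach", "sea", "sunset"],
    ["beach", "sea"],
    ["city", "night"],
    ["city", "sunset"],
]
scores = tag_pair_pmi(photos)
for pair, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(pair, round(score, 3))
```

Pairs with a high score co-occur more often than their individual frequencies would predict, which makes them plausible candidates for the context-tracing step that follows; any cutoff for "high" would have to be chosen empirically.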
Received: 09 February 2009
Published: 25 March 2009
Corresponding Author:
Miao Chen
E-mail: mchen14@syr.edu
About authors: Miao Chen, Xiaozhong Liu, Jian Qin