New Technology of Library and Information Service  2011, Vol. 27 Issue (1): 31-38    DOI: 10.11925/infotech.1003-3513.2011.01.05
Study on Text Classification Model Based on SUMO and WordNet Ontology Integration
Hu Zewen, Wang Xiaoyue, Bai Rujiang
Institute of Scientific & Technical Information, Shandong University of Technology, Zibo 255049, China
Abstract  

To address the problems of traditional text classification methods and of current semantic classification methods, this paper proposes a new text classification model based on the integration of the SUMO and WordNet ontologies. The model uses the mapping relations between WordNet synsets and SUMO ontology concepts to map terms in the document-word vector space to their corresponding ontology concepts, forming a document-concept vector space in which texts are classified automatically. Experimental results show that the proposed method greatly reduces the dimensionality of the vector space and improves text classification performance.
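The term-to-concept mapping described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The two lookup tables are small, hand-written toy mappings invented for this example and are not taken from the actual WordNet or SUMO mapping files. The function name `to_concept_vector` is likewise hypothetical. The sketch assumes that each term has a single sense and that unmapped terms are kept unchanged; the abstract specifies neither.

```python
from collections import Counter

# Toy, hand-written mappings for illustration only (assumption): a real system
# would load WordNet synsets and the published WordNet-to-SUMO mapping files.
TERM_TO_SYNSET = {
    "car": "car.n.01",
    "automobile": "car.n.01",
    "truck": "truck.n.01",
    "doctor": "doctor.n.01",
    "physician": "doctor.n.01",
}
SYNSET_TO_CONCEPT = {
    "car.n.01": "Automobile",
    "truck.n.01": "Truck",
    "doctor.n.01": "MedicalDoctor",
}


def to_concept_vector(tokens):
    """Map a bag of words to a bag of concepts.

    Terms with no known synset or concept are kept as-is (an assumption
    made for this sketch, not a detail stated in the abstract).
    """
    counts = Counter()
    for token in tokens:
        term = token.lower()
        synset = TERM_TO_SYNSET.get(term)
        concept = SYNSET_TO_CONCEPT.get(synset, term)
        counts[concept] += 1
    return counts


docs = [
    ["car", "automobile", "doctor"],
    ["truck", "physician", "car"],
]

# Vocabulary of the document-word space versus the document-concept space.
word_vocab = {token for doc in docs for token in doc}
concept_vectors = [to_concept_vector(doc) for doc in docs]
concept_vocab = {concept for vec in concept_vectors for concept in vec}

print(len(word_vocab), "word dimensions ->", len(concept_vocab), "concept dimensions")
```

In this toy corpus, synonyms such as "car" and "automobile" collapse onto one concept, so the concept vocabulary is smaller than the word vocabulary. This is the dimensionality-reduction effect the abstract reports. The resulting concept counts could then be passed to any standard classifier.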

Key words: SUMO Ontology; WordNet; Ontology integration; Text classification model; Word vector space; Concept vector space
Received: 02 November 2010      Published: 12 February 2011
CLC number: G250; TP391

Cite this article:

Hu Zewen, Wang Xiaoyue, Bai Rujiang. Study on Text Classification Model Based on SUMO and WordNet Ontology Integration. New Technology of Library and Information Service, 2011, 27(1): 31-38.

URL:

http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.1003-3513.2011.01.05     OR     http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2011/V27/I1/31


Related articles:

[1] Qu Yunpeng, Wang Wenling. Using Semantic Model to Build Lexical Chains[J]. New Technology of Library and Information Service, 2016, 32(9): 34-41.
[2] Mi Yang, Cao Jindan. A Case Study of Semantic Annotation with Multi-Ontology by Upper-level Ontology Unitive Control[J]. New Technology of Library and Information Service, 2012, (9): 36-41.
[3] Bai Rujiang, Yu Xiaofan, Wang Xiaoyue. The Comparative Analysis of Major Domestic and Foreign Ontology Library[J]. New Technology of Library and Information Service, 2011, 27(1): 3-13.
[4] Yu Xiaofan, Wang Xiaoyue, Bai Rujiang. Review on the Methods and Tools for Ontology Integration[J]. New Technology of Library and Information Service, 2011, 27(1): 14-21.
[5] Wang Xiaoyue, Hu Zewen, Bai Rujiang. Study on the Mapping Mechanism Between WordNet and SUMO Ontology[J]. New Technology of Library and Information Service, 2011, 27(1): 22-30.
[6] Zhai Dongsheng, Liu Chen, Ouyang Yihui. The Design and Implementation of Patent Information Acquiring and Analysis System[J]. New Technology of Library and Information Service, 2009, 25(5): 55-60.
[7] Lu Shengjun, Li Fayong, Qian Jianjun, Zhen Zhen. WCONS+: An Ontology Integration Approach Based on WCONS[J]. New Technology of Library and Information Service, 2009, 3(2): 18-22.
[8] Rao Yanghui, Ye Liang, Cheng Jie. Research on the Application of WordNet in Text Clustering[J]. New Technology of Library and Information Service, 2009, (10): 67-70.
[9] Jia Junzhi, Dong Gang. The Study on Integration of CFN and VerbNet, WordNet[J]. New Technology of Library and Information Service, 2008, 24(6): 6-10.
[10] Zhang Huiping, Lv Xueqiang, Shi Shuicai, Li Yuqin. Constructing Semantic Distribution Dictionary Based on WordNet[J]. New Technology of Library and Information Service, 2007, 2(3): 55-59.
  Copyright © 2016 Data Analysis and Knowledge Discovery   Tel/Fax:(010)82626611-6626,82624938   E-mail:jishu@mail.las.ac.cn