New Technology of Library and Information Service, 2011, Vol. 27, Issue 1: 31-38. DOI: 10.11925/infotech.1003-3513.2011.01.05
Study on Text Classification Model Based on SUMO and WordNet Ontology Integration
Hu Zewen, Wang Xiaoyue, Bai Rujiang
Institute of Scientific & Technical Information, Shandong University of Technology, Zibo 255049, China

To address the problems of traditional text classification methods and of current semantic classification methods, a new text classification model based on the integration of the SUMO and WordNet ontologies is proposed. The model uses the mapping relations between WordNet synsets and SUMO ontology concepts to map terms in the document-word vector space to the corresponding ontology concepts, forming a document-concept vector space in which texts are classified automatically. Experimental results show that the proposed method greatly reduces the dimensionality of the vector space and improves text classification performance.
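
The term-to-concept mapping described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the `TERM_TO_CONCEPT` dictionary is a hypothetical toy stand-in for the WordNet synset to SUMO concept mappings, the concept labels are illustrative, and dropping unmapped terms is a simplification that the article may not share.

```python
from collections import Counter

# Hypothetical toy mapping from terms to SUMO-style concept labels.
# In the described model this role is played by the mapping relations
# between WordNet synsets and SUMO concepts; these entries are illustrative.
TERM_TO_CONCEPT = {
    "car": "Vehicle",
    "truck": "Vehicle",
    "automobile": "Vehicle",
    "dog": "Canine",
    "puppy": "Canine",
}


def to_concept_vector(tokens, term_to_concept=TERM_TO_CONCEPT):
    """Aggregate term counts into concept counts.

    Terms without a mapping are dropped (an assumption of this sketch).
    """
    counts = Counter()
    for tok in tokens:
        concept = term_to_concept.get(tok.lower())
        if concept is not None:
            counts[concept] += 1
    return dict(counts)


doc = ["Car", "truck", "dog", "automobile", "puppy", "road"]

# Document-word vector: one dimension per distinct term.
word_vector = dict(Counter(t.lower() for t in doc))

# Document-concept vector: one dimension per distinct concept.
concept_vector = to_concept_vector(doc)

print(len(word_vector), "word dimensions ->", len(concept_vector), "concept dimensions")
```

Because several terms collapse onto one concept, the concept vector has fewer dimensions than the word vector, which is the dimensionality reduction the abstract refers to.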

Key words: SUMO Ontology; WordNet; Ontology integration; Text classification model; Word vector space; Concept vector space
Received: 2010-11-02

CLC number: G250; TP391



Hu Zewen, Wang Xiaoyue, Bai Rujiang. Study on Text Classification Model Based on SUMO and WordNet Ontology Integration[J]. New Technology of Library and Information Service, 2011, 27(1): 31-38. DOI: 10.11925/infotech.1003-3513.2011.01.05.



