New Technology of Library and Information Service  2007, Vol. 2 Issue (2): 56-59    DOI: 10.11925/infotech.1003-3513.2007.02.12
An Improved Hierarchical Document Classification Method
Tan Jinbo
(Department of Educational Technology, Shandong Normal University, Jinan 250014, China)
Abstract  

When the number of document categories is large, hierarchical text classification is an effective approach. However, classification methods that use the top-down approach suffer from blocking. To address this problem, this paper proposes an improved hierarchical classification method, the restricted voting method. Experiments using Rocchio classifiers on elementary education subject resources show that it reduces blocking and improves classification performance.
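
The blocking problem named in the abstract can be illustrated with a minimal sketch. It assumes a simplified Rocchio classifier (the centroid of positive examples only, compared by cosine similarity against a fixed threshold) and a hypothetical two-level hierarchy with toy term weights; the category names, threshold, and data are illustrative assumptions, not taken from the paper, and the restricted voting method itself is not reproduced here.

```python
import math

def cosine(a, b):
    """Cosine similarity between two sparse term-weight dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def centroid(docs):
    """Average term-weight vector of a list of documents."""
    total = {}
    for d in docs:
        for t, w in d.items():
            total[t] = total.get(t, 0.0) + w
    return {t: w / len(docs) for t, w in total.items()}

# Hypothetical hierarchy: root -> math -> geometry
CHILDREN = {"root": ["math"], "math": ["geometry"], "geometry": []}

# Toy training data (assumed); the geometry document also counts as a math document.
TRAIN = {
    "math": [
        {"equation": 1, "number": 1},
        {"number": 1, "fraction": 1},
        {"triangle": 1, "angle": 1},
    ],
    "geometry": [{"triangle": 1, "angle": 1}],
}
PROTOTYPES = {c: centroid(d) for c, d in TRAIN.items()}
THRESHOLD = 0.4  # assumed acceptance threshold

def accepts(category, doc):
    """Simplified Rocchio decision: similarity to the category prototype."""
    return cosine(doc, PROTOTYPES[category]) >= THRESHOLD

def top_down(doc, node="root"):
    """Descend only through nodes whose classifier accepts the document."""
    leaves = []
    for child in CHILDREN[node]:
        if not accepts(child, doc):
            continue  # the whole subtree is skipped: this is where blocking arises
        if CHILDREN[child]:
            leaves.extend(top_down(doc, child))
        else:
            leaves.append(child)
    return leaves

blocked_doc = {"triangle": 1, "circle": 1}
passed_doc = {"triangle": 1, "angle": 1, "number": 1}
```

Here `blocked_doc` would be accepted by the `geometry` classifier on its own, but the `math` classifier above it rejects the document, so the top-down pass never consults `geometry`. Blocking-reduction strategies aim to recover documents of this kind.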

Key words: Document classification; Hierarchical classification; Restricted voting method
Received: 17 November 2006      Published: 25 February 2007
Classification number: G354.4
Corresponding Authors: Tan Jinbo     E-mail: yttjb@163.com
About author: Tan Jinbo

Cite this article:

Tan Jinbo. An Improved Hierarchical Document Classification Method. New Technology of Library and Information Service, 2007, 2(2): 56-59.

URL:

http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.1003-3513.2007.02.12     OR     http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2007/V2/I2/56

