Development of Text Automatic Categorization Measurement Research
Tan Jinbo, Li Yi, Yang Xiaojiang
(Department of Educational Technology, Nanjing Normal University, Nanjing 210097, China)
Abstract Text categorization is the foundation and core of text mining, and it has been a research focus in data mining and Internet mining in recent years. This article surveys domestic and international research on text categorization from both qualitative and quantitative perspectives. It analyzes the main factors that affect text categorization and, by summarizing evaluations of text categorization systems and algorithms, seeks to identify their common problems. The goal of the article is to provide a theoretical and empirical basis for optimizing and improving automatic text categorization.
Received: 03 December 2004
Published: 25 May 2005
Corresponding Authors:
Tan Jinbo
E-mail: yttjb@163.com
About authors: Tan Jinbo, Li Yi, Yang Xiaojiang