New Technology of Library and Information Service  2004, Vol. 20 Issue (7): 27-29    DOI: 10.11925/infotech.1003-3513.2004.07.06
Study on Automatic Text Categorization with Support Vector Machine
Shi Jiebin
(Zhejiang University Library, Hangzhou 310029, China)
Abstract  

Support Vector Machine (SVM), a relatively new machine learning method, is applied to automatic text categorization. Compared with the k-nearest neighbor (KNN) algorithm, SVM achieves higher accuracy, and it is less sensitive than KNN to the choice of feature selection method. SVM is therefore a promising and competitive method for automatic text categorization. The choice of feature selection method also affects categorization accuracy.
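The comparison described in the abstract can be illustrated with a minimal sketch. It is not the article's implementation: the toy corpus, the function names, and the parameter values are all hypothetical. Document-frequency thresholding stands in for whichever feature selection methods the study used, and a Pegasos-style subgradient trainer stands in for the SVM solver.

```python
import math
from collections import Counter

# Hypothetical toy corpus for illustration only; it is not the article's data.
TRAIN = [
    ("stock market shares rise", "finance"),
    ("bank shares fall on market", "finance"),
    ("investors buy bank stock", "finance"),
    ("team wins football match", "sport"),
    ("football player scores in match", "sport"),
    ("coach praises team player", "sport"),
]

def tokenize(text):
    return text.lower().split()

def select_features(docs, k):
    """Document-frequency thresholding: keep the k terms found in the most documents."""
    df = Counter()
    for text, _ in docs:
        df.update(set(tokenize(text)))
    ranked = sorted(df.items(), key=lambda item: (-item[1], item[0]))
    return [term for term, _ in ranked[:k]]

def vectorize(text, vocab):
    """Term-frequency vector over the selected vocabulary, scaled to unit length."""
    counts = Counter(tokenize(text))
    vec = [float(counts[t]) for t in vocab]
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def knn_predict(x, vecs, labels, k=3):
    """Majority vote among the k training vectors most similar to x (cosine similarity)."""
    ranked = sorted(((dot(x, v), y) for v, y in zip(vecs, labels)), reverse=True)
    votes = Counter(y for _, y in ranked[:k])
    return votes.most_common(1)[0][0]

def train_linear_svm(vecs, ys, lam=0.01, epochs=200):
    """Pegasos-style subgradient descent on the hinge loss; ys hold +1 or -1."""
    w = [0.0] * len(vecs[0])
    t = 0
    for _ in range(epochs):
        for x, y in zip(vecs, ys):
            t += 1
            eta = 1.0 / (lam * t)          # decaying step size
            margin = y * dot(w, x)         # margin under the current weights
            w = [(1.0 - eta * lam) * wi for wi in w]
            if margin < 1.0:               # example violates the margin
                w = [wi + eta * y * xi for wi, xi in zip(w, x)]
    return w

def run(num_features, test_text, positive="finance", negative="sport"):
    """Return (KNN label, SVM label) for test_text using num_features selected terms."""
    vocab = select_features(TRAIN, num_features)
    vecs = [vectorize(text, vocab) for text, _ in TRAIN]
    labels = [label for _, label in TRAIN]
    ys = [1 if label == positive else -1 for label in labels]
    x = vectorize(test_text, vocab)
    w = train_linear_svm(vecs, ys)
    svm_label = positive if dot(w, x) > 0 else negative
    return knn_predict(x, vecs, labels), svm_label

if __name__ == "__main__":
    for text in ("market shares fall", "football team loses match"):
        print(text, "->", run(8, text))
```

Varying `num_features` in `run` is one way to probe how sensitive each classifier is to feature selection. A six-document corpus cannot reproduce the accuracy differences the article reports.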

Key words: Automatic text categorization; Support vector machine; K-nearest neighbor algorithm; Feature selection
Received: 23 February 2004      Published: 25 July 2004
CLC Number: G254.361
Corresponding author: Shi Jiebin (E-mail: jbshi@lib.zju.edu.cn)
About the author: Shi Jiebin

Cite this article:

Shi Jiebin. Study on Automatic Text Categorization with Support Vector Machine. New Technology of Library and Information Service, 2004, 20(7): 27-29.

URL:

https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.1003-3513.2004.07.06     OR     https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2004/V20/I7/27


[1] Liang Jiaming, Zhao Jie, Zheng Peng, Huang Liushen, Ye Minqi, Dong Zhenning. Framework for Computing Trust in Online Short-Rent Platform Using Feature Selection of Images and Texts[J]. 数据分析与知识发现, 2021, 5(2): 129-140.
[2] Feng Hao, Li Shuqing. Multi-layer Cascade Classifier for Credit Scoring with Multiple-Support Vector Machines[J]. 数据分析与知识发现, 2021, 5(10): 28-36.
[3] Ding Shengchun, Yu Fengyang, Li Zhen. Identifying Potential Trending Topics of Online Public Opinion[J]. 数据分析与知识发现, 2020, 4(2/3): 29-38.
[4] Heran Qin, Liu Liu, Bin Li, Dongbo Wang. Automatic Classification of Ancient Classics with Entity Features[J]. 数据分析与知识发现, 2019, 3(9): 68-76.
[5] Ruojia Wang, Lu Zhang, Jimin Wang. Automatic Triage of Online Doctor Services Based on Machine Learning[J]. 数据分析与知识发现, 2019, 3(9): 88-97.
[6] Qingtian Zeng, Mingdi Dai, Chao Li, Hua Duan, Zhongying Zhao. Discovering Important Locations with User Representation and Trace Data[J]. 数据分析与知识发现, 2019, 3(6): 75-82.
[7] Cheng Zhou, Hongqin Wei. Evaluating and Classifying Patent Values Based on Self-Organizing Maps and Support Vector Machine[J]. 数据分析与知识发现, 2019, 3(5): 117-124.
[8] Jiaming Liang, Jie Zhao, Zhou Jianlong, Zhenning Dong. Detecting Collusive Fraudulent Online Transaction with Implicit User Behaviors[J]. 数据分析与知识发现, 2019, 3(5): 125-138.
[9] Tingxin Wen, Yangzi Li, Jingshuang Sun. News Hotspots Discovery Method Based on Multi Factor Feature Selection and AFOA/K-means[J]. 数据分析与知识发现, 2019, 3(4): 97-106.
[10] Zhanglu Tan, Zhaogang Wang, Han Hu. Study on a Method of Feature Classification Selection Based on χ² Statistics[J]. 数据分析与知识发现, 2019, 3(2): 72-78.
[11] Zhixiong Zhang, Huan Liu, Liangping Ding, Pengmin Wu, Gaihong Yu. Identifying Moves of Research Abstracts with Deep Learning Methods[J]. 数据分析与知识发现, 2019, 3(12): 1-9.
[12] Liangping Ding, Zhixiong Zhang, Huan Liu. Factors Affecting Rhetorical Move Recognition with SVM Model[J]. 数据分析与知识发现, 2019, 3(11): 16-23.
[13] Li Xiangdong, Gao Fan, Li Youhai. Categorizing Documents Automatically within Common Semantic Space[J]. 数据分析与知识发现, 2018, 2(9): 66-73.
[14] Wen Tingxin, Li Yangzi, Sun Jingshuang. Extracting Text Features with Improved Fruit Fly Optimization Algorithm[J]. 数据分析与知识发现, 2018, 2(5): 59-69.
[15] Huang Xiaoxi, Li Hanyu, Wang Rongbo, Wang Xiaohua, Chen Zhiqun. Recognizing Metaphor with Convolution Neural Network and SVM[J]. 数据分析与知识发现, 2018, 2(10): 77-83.