New Technology of Library and Information Service  2005, Vol. 21 Issue (10): 23-27    DOI: 10.11925/infotech.1003-3513.2005.10.06
Study on Machine Learning Based Automatic Text Categorization Model
Chen Lifu   Zhou Ning   Li Dan
(Information Management School, Wuhan University, Wuhan 430072, China)

This article develops a theoretical model of machine-learning-based automatic text categorization, an approach widely used in text categorization tasks. First, the definition and architecture model of text categorization are given. Then the SVM classifier is analyzed in detail as a typical example. Finally, the authors report performance results from a Chinese text categorization experiment.
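The pipeline the abstract outlines (represent documents as feature vectors, train an SVM, classify unseen text) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration rather than the authors' method: the toy English corpus stands in for their Chinese collection, tokens are split on whitespace, features are raw term counts, and the linear SVM is trained by plain sub-gradient descent on the L2-regularized hinge loss. All function and variable names are hypothetical.

```python
from collections import Counter


def tokenize(text):
    # Whitespace tokenizer (assumption); Chinese text would need a word segmenter.
    return text.lower().split()


def build_vocab(docs):
    # Map each distinct training token to a feature index.
    vocab = {}
    for doc in docs:
        for tok in tokenize(doc):
            vocab.setdefault(tok, len(vocab))
    return vocab


def vectorize(text, vocab):
    # Sparse term-count vector; tokens outside the vocabulary are ignored.
    counts = Counter(tok for tok in tokenize(text) if tok in vocab)
    return {vocab[tok]: float(n) for tok, n in counts.items()}


def score(x, w, b):
    # Linear decision function w.x + b for a sparse vector x.
    return sum(w[i] * v for i, v in x.items()) + b


def train_linear_svm(X, y, dim, epochs=50, lr=0.1, lam=0.01):
    # Sub-gradient descent on the L2-regularized hinge loss (labels are +1/-1).
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        for x, label in zip(X, y):
            margin = label * score(x, w, b)
            # Regularization step: shrink every weight slightly.
            w = [wi * (1.0 - lr * lam) for wi in w]
            # Hinge-loss step: only examples inside the margin move the model.
            if margin < 1.0:
                for i, v in x.items():
                    w[i] += lr * label * v
                b += lr * label
    return w, b


# Toy two-category corpus (illustrative only, not the paper's data).
train_docs = [
    "stock market shares rise",
    "bank profit and market growth",
    "team wins football match",
    "player scores in final match",
]
train_labels = ["finance", "finance", "sports", "sports"]
SIGN = {"finance": 1, "sports": -1}

vocab = build_vocab(train_docs)
X = [vectorize(d, vocab) for d in train_docs]
y = [SIGN[label] for label in train_labels]
w, b = train_linear_svm(X, y, dim=len(vocab))


def classify(text):
    # A non-negative score maps to the positive category.
    return "finance" if score(vectorize(text, vocab), w, b) >= 0 else "sports"


print(classify("market shares fall"))
print(classify("football player injured"))
```

A realistic system would typically add feature selection and term weighting and handle more than two categories (for example with one-vs-rest classifiers); the sketch leaves those steps out to keep the decision function and the hinge-loss update visible.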

Key words: Text categorization; Machine learning; Support vector machine
Received: 20 June 2005      Published: 25 October 2005


Corresponding author: Chen Lifu
About authors: Chen Lifu, Zhou Ning, Li Dan

Cite this article:

Chen Lifu, Zhou Ning, Li Dan. Study on Machine Learning Based Automatic Text Categorization Model. New Technology of Library and Information Service, 2005, 21(10): 23-27.


