New Technology of Library and Information Service  2013, Vol. 29 Issue (7/8): 101-106    DOI: 10.11925/infotech.1003-3513.2013.07-08.15
Research on Chinese Patent Automatic Classification Method Based on Statistical Distribution
Hu Bing¹, Zhang Jianli²
1. School of Economics & Management, Xidian University, Xi'an 710071, China;
2. Electronic Technology Information Research Institute, Ministry of Industry and Information Technology of the People's Republic of China, Beijing 100043, China
Abstract  Traditional automatic text classification algorithms based on the Vector Space Model ignore both how terms are distributed among classes and where terms occur within a document, which leads to poor performance in patent classification. This paper proposes an automatic classification method for Chinese patents based on statistical distribution. First, it introduces a distribution information weighting factor that raises the weight of terms that occur frequently but in few classes. Then, drawing on the structural features of patent text, it introduces a position information weighting factor that highlights the legal and technical characteristics of a patent and the differences in content among its elements. Finally, a comparison experiment shows that the proposed method effectively improves classification performance.
Key words: Statistical distribution; Patent automatic classification; Weighting factor
Received: 27 March 2013      Published: 02 September 2013
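
The abstract names two weighting factors but gives no formulas, so the following is only a minimal sketch of how such factors might be combined, not the authors' method. The section weights in `POSITION_WEIGHTS`, the logarithmic form of `distribution_factor`, the function names, and the toy documents are all assumptions made for illustration. Documents are assumed to be already segmented into terms.

```python
import math
from collections import defaultdict

# Hypothetical section weights; the abstract does not state the actual values.
POSITION_WEIGHTS = {"title": 3.0, "abstract": 2.0, "claims": 2.0, "description": 1.0}


def class_frequency(docs):
    """Map each term to the set of class labels in which it appears."""
    term_classes = defaultdict(set)
    for doc in docs:
        for tokens in doc["sections"].values():
            for term in tokens:
                term_classes[term].add(doc["label"])
    return term_classes


def distribution_factor(term, term_classes, n_classes):
    """Assumed form: grows as a term is concentrated in fewer classes."""
    cf = len(term_classes.get(term, ()))
    if cf == 0:
        return 0.0
    return math.log(1.0 + n_classes / cf)


def weighted_vector(doc, term_classes, n_classes):
    """Position-weighted term frequency scaled by the distribution factor."""
    position_tf = defaultdict(float)
    for section, tokens in doc["sections"].items():
        section_weight = POSITION_WEIGHTS.get(section, 1.0)
        for term in tokens:
            position_tf[term] += section_weight
    return {
        term: tf * distribution_factor(term, term_classes, n_classes)
        for term, tf in position_tf.items()
    }


# Toy, pre-segmented documents with made-up labels and terms.
docs = [
    {"label": "H04", "sections": {"title": ["antenna"], "claims": ["antenna", "device"]}},
    {"label": "G06", "sections": {"title": ["classifier"], "claims": ["classifier", "device"]}},
]
term_classes = class_frequency(docs)
vector = weighted_vector(docs[0], term_classes, n_classes=2)
```

In this toy input, "antenna" appears in the title and claims of a single class, while "device" appears only in the claims of both classes, so "antenna" receives the larger weight. The resulting vectors could then be passed to any Vector Space Model classifier.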



Hu Bing, Zhang Jianli. Research on Chinese Patent Automatic Classification Method Based on Statistical Distribution. New Technology of Library and Information Service, 2013, 29(7/8): 101-106.

