New Technology of Library and Information Service  2013, Vol. 29 Issue (3): 38-44    DOI: 10.11925/infotech.1003-3513.2013.03.07
Fundamental Research Questions in Patent Text Categorization
Qu Peng, Wang Huilin
Institute of Scientific & Technical Information of China, Beijing 100038, China
Abstract  This paper examines several fundamental problems in patent text categorization: the feasibility of using terms as features for automatic categorization, the categorization of claims, and the effect of classes with closely related topics on categorization results. The experiments use two Naive Bayesian classifiers together with kNN, Rocchio, and SVM classifiers, and cross-validation is used for testing. The results show that terms outperform common features under the same settings, that training a classifier on abstracts can improve claim categorization, and that classes with closely related topics lead to low precision, which makes a hierarchical classifier design necessary. The paper provides fundamental data for patent text categorization and can serve as a reference for information analysis and other applications that use patents.
Key words: Patent; Text categorization; Text mining
Received: 08 March 2013      Published: 14 May 2013
Cite this article:
Qu Peng, Wang Huilin. Fundamental Research Questions in Patent Text Categorization. New Technology of Library and Information Service, 2013, 29(3): 38-44.

