Patent Classification Based on Multi-feature and Multi-classifier Integration
Jia Shanshan1, Liu Chang2, Sun Lianying3, Liu Xiaoan1, Peng Tao2
1 College of Intellectualized City, Beijing Union University, Beijing 100101, China
2 College of Robotics, Beijing Union University, Beijing 100101, China
3 College of Urban Rail Transit and Logistics, Beijing Union University, Beijing 100101, China
Abstract [Objective] This paper aims to automatically assign the correct International Patent Classification (IPC) codes to patent applications by integrating multiple features and multiple classifiers. [Methods] First, we extracted TF-IDF features based on the full dictionary and on information gain, as well as document-vector and topic-model vector features, from patent applications. Then, we used these features to train Naive Bayes (NB), SVM, and AdaBoost classifiers. Finally, we built a feature-class matrix and predicted the final IPC code with an F1 weight matrix. [Results] We evaluated the method on 10 patent classes from 2014 to 2016 in the field of engines and pumps. The accuracies for top prediction, all categories, and two guesses were 78.9%, 80.1%, and 91.2%, respectively. [Limitations] The training corpus is small, covering only three years of patent data. [Conclusions] The proposed method effectively improves the accuracy of patent classification in the field of engines and pumps.
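The abstract does not spell out how the F1 weight matrix combines the individual predictions, so the following is only a minimal sketch of one plausible reading: each (feature, classifier) pair votes for one class, and the vote is weighted by the F1 score that pair achieved on that class during validation. The function name, the feature and classifier labels, the IPC subclasses, and all F1 values are hypothetical and chosen purely for illustration.

```python
from collections import defaultdict


def f1_weighted_ranking(predictions, f1_weights):
    """Rank candidate classes by F1-weighted votes.

    predictions: maps a (feature, classifier) pair to its predicted class.
    f1_weights:  maps the same pair to {class: validation F1 on that class}.
    Both structures are assumptions, not the paper's exact matrices.
    """
    scores = defaultdict(float)
    for model, predicted_class in predictions.items():
        # A model's vote counts as much as its F1 on the class it predicts.
        scores[predicted_class] += f1_weights[model].get(predicted_class, 0.0)
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores, key=lambda c: (-scores[c], c))


# Hypothetical setup: 2 feature types x 2 classifiers, 2 candidate classes.
predictions = {
    ("tfidf", "NB"): "F02B",
    ("tfidf", "SVM"): "F04B",
    ("doc_vector", "NB"): "F02B",
    ("doc_vector", "SVM"): "F04B",
}
f1_weights = {
    ("tfidf", "NB"): {"F02B": 0.60, "F04B": 0.70},
    ("tfidf", "SVM"): {"F02B": 0.80, "F04B": 0.85},
    ("doc_vector", "NB"): {"F02B": 0.55, "F04B": 0.50},
    ("doc_vector", "SVM"): {"F02B": 0.75, "F04B": 0.70},
}

ranking = f1_weighted_ranking(predictions, f1_weights)
top_prediction = ranking[0]   # single best guess
two_guesses = ranking[:2]     # candidates counted under a "two guesses" measure
print(top_prediction, two_guesses)
```

In this toy case the raw vote is tied two to two, and the F1 weights break the tie in favor of the class backed by the more reliable models. Keeping the full ranking rather than only the winner is what makes a "two guesses" style evaluation possible.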
Received: 31 May 2017
Published: 28 September 2017