Hierarchical Classification Model for Invention Patents
Zhai Dongsheng, Hu Dengjin, Zhang Jie, He Xijun, Liu He
School of Economics and Management, Beijing University of Technology, Beijing 100124, China
Abstract [Objective] This paper proposes a model that uses a machine learning classification algorithm to process patent information and determine each patent's level of invention. [Methods] First, we extracted technical feature words from the patent texts. Second, we built a technical feature vector for each patent using a model trained with Word2Vec. Third, we computed patent text indicators and backward citations to build the training set. Finally, we constructed the model with a machine learning classification algorithm. [Results] We tested the proposed model on patents retrieved from the field of speech recognition technology. The ratio of advanced-level to entry-level patents was about 1:4, which is consistent with the actual situation. [Limitations] The coverage of the WordNet dictionary limits the feature extraction results. [Conclusions] The proposed model can effectively identify advanced patents and recommend them to business owners.
Received: 15 August 2017
Published: 29 December 2017