[Objective] This paper proposes a machine learning classification model that processes patent information to determine a patent's level of invention. [Methods] First, we extracted technology feature words from the patent texts. Second, we constructed the patent technology feature vectors using a model trained with Word2Vec. Third, we calculated patent text and backward-reference indicators to build the training set. Finally, we built the new model by training a machine learning classification algorithm on this set. [Results] We applied the proposed model to patents retrieved in the field of speech recognition technology. The ratio of advanced-level to entry-level patents was about 1:4, which is consistent with the actual situation. [Limitations] Feature word extraction relies on the WordNet dictionary, which limits the extraction results. [Conclusions] The proposed model can effectively identify advanced-level patents and recommend them to business owners.
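The four steps in [Methods] can be illustrated with a minimal sketch. It is not the authors' implementation. The toy word vectors stand in for embeddings trained with Word2Vec, the backward-reference count is one assumed form of the indicators, and a nearest-centroid rule is swapped in for the unspecified classification algorithm. All names (`WORD_VECTORS`, `MAX_REFS`, `patent_vector`, `train_centroids`, `classify`), values, and labels are hypothetical and imply nothing about how the indicators relate to the level of invention.

```python
from statistics import mean

# Hypothetical toy embeddings standing in for vectors trained with Word2Vec.
WORD_VECTORS = {
    "acoustic": [0.9, 0.1],
    "model": [0.8, 0.2],
    "decoder": [0.7, 0.3],
    "microphone": [0.1, 0.9],
    "housing": [0.2, 0.8],
}
MAX_REFS = 20  # assumed cap used to scale the backward-reference count to [0, 1]


def patent_vector(feature_words, backward_refs):
    """Average the vectors of the known feature words, then append the
    scaled backward-reference count as one extra indicator."""
    known = [WORD_VECTORS[w] for w in feature_words if w in WORD_VECTORS]
    if not known:
        raise ValueError("no feature word has a known vector")
    averaged = [mean(dim) for dim in zip(*known)]
    return averaged + [min(backward_refs, MAX_REFS) / MAX_REFS]


def train_centroids(samples):
    """Compute one centroid per label from (vector, label) pairs."""
    grouped = {}
    for vector, label in samples:
        grouped.setdefault(label, []).append(vector)
    return {
        label: [mean(dim) for dim in zip(*vectors)]
        for label, vectors in grouped.items()
    }


def classify(vector, centroids):
    """Return the label whose centroid is closest in squared Euclidean distance."""
    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(centroids, key=lambda label: squared_distance(vector, centroids[label]))


# Made-up training set: (feature words, backward references) -> level label.
training_set = [
    (patent_vector(["acoustic", "model"], 12), "advanced"),
    (patent_vector(["decoder", "model"], 9), "advanced"),
    (patent_vector(["microphone", "housing"], 2), "entry"),
    (patent_vector(["housing"], 1), "entry"),
]
centroids = train_centroids(training_set)
predicted = classify(patent_vector(["acoustic", "decoder"], 10), centroids)
print(predicted)
```

In practice the toy dictionary would be replaced by a trained embedding model and the centroid rule by whichever classifier performs best on the labeled training set. The structure stays the same: feature words, then vector, then indicators, then classifier.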