%A Yan Yu, Lei Chen, Jinde Jiang, Naixuan Zhao
%T Measuring Patent Similarity with Word Embedding and Statistical Features
%0 Journal Article
%D 2019
%J Data Analysis and Knowledge Discovery
%R 10.11925/infotech.2096-3467.2018.1317
%P 53-59
%V 3
%N 9
%U {https://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/abstract/article_4708.shtml}
%8 2019-09-25
%X

[Objective] This paper proposes a new method for measuring patent similarity that exploits the semantic relationships between words to improve measurement performance. [Methods] First, we used a neural network-based word vector model to capture semantic information from the words in patents. Then, we computed statistical features of the words to gauge their significance. Finally, we combined the word embeddings and statistical features to represent patent texts and measure their similarity. [Results] The accuracy of the proposed method was 13.92% higher than that of traditional methods. [Limitations] More research is needed on the strategy for selecting auxiliary patent texts. [Conclusions] Combining word embeddings and statistical features can effectively improve patent similarity measurement.
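
The three steps in [Methods] can be illustrated with a minimal sketch. The abstract does not say which statistical features or which combination rule the authors used, so this example assumes TF-IDF as the statistical feature, a TF-IDF-weighted average of word vectors as the text representation, and cosine similarity as the measure. The toy embeddings, the token lists, and the function names (`idf_weights`, `doc_vector`, `cosine`) are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter

# Toy 3-dimensional word vectors (hypothetical; a real system would load
# vectors trained by a neural word-embedding model on patent text).
embeddings = {
    "battery":   [0.9, 0.1, 0.0],
    "cell":      [0.8, 0.2, 0.1],
    "electrode": [0.7, 0.3, 0.0],
    "antenna":   [0.0, 0.2, 0.9],
    "signal":    [0.1, 0.1, 0.8],
    "device":    [0.4, 0.4, 0.4],
}

# Tokenized toy "patents" (hypothetical data).
docs = [
    ["battery", "electrode", "device"],
    ["cell", "electrode", "device"],
    ["antenna", "signal", "device"],
]

def idf_weights(corpus):
    """Smoothed inverse document frequency for every token in the corpus."""
    n = len(corpus)
    df = Counter(tok for doc in corpus for tok in set(doc))
    return {tok: math.log((1 + n) / (1 + count)) + 1.0 for tok, count in df.items()}

def doc_vector(doc, vectors, idf):
    """TF-IDF-weighted average of the word vectors of the tokens in doc."""
    dim = len(next(iter(vectors.values())))
    acc = [0.0] * dim
    total = 0.0
    for tok, tf in Counter(doc).items():
        if tok not in vectors or tok not in idf:
            continue  # skip out-of-vocabulary tokens
        weight = tf * idf[tok]
        total += weight
        for i, x in enumerate(vectors[tok]):
            acc[i] += weight * x
    return [a / total for a in acc] if total else acc

def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

idf = idf_weights(docs)
vecs = [doc_vector(d, embeddings, idf) for d in docs]
sim_battery_cell = cosine(vecs[0], vecs[1])     # two battery-related texts
sim_battery_antenna = cosine(vecs[0], vecs[2])  # battery vs. antenna text
```

In this toy setup the two battery-related texts score higher than the battery and antenna pair even though "battery" and "cell" are different tokens, because their vectors are close. The term "device", which appears in every text, receives the lowest weight.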