%A Jinzhu Zhang
%A Yiming Hu
%T Extracting Titles from Scientific References in Patents with Fusion of Representation Learning and Machine Learning
%0 Journal Article
%D 2019
%J Data Analysis and Knowledge Discovery
%R 10.11925/infotech.2096-3467.2018.0659
%P 68-76
%V 3
%N 5
%U {https://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/abstract/article_4658.shtml}
%8 2019-05-25
%X

[Objective] This paper aims to automatically identify scientific references in patents (SRPs) and then extract their titles to support in-depth data mining. [Methods] First, we used the Doc2Vec method to generate vectors for the patent citations. Second, we identified the SRPs with a support vector machine (SVM). Third, we created vectors for the metadata fields of each SRP (such as the title) and extracted the titles with an SVM. [Results] We evaluated the proposed method on patent citations from the genetic field. The accuracy of SRP recognition and title extraction reached 99.27% and 92.59%, respectively. The latter was 5.96% higher than that of traditional methods. [Limitations] Manually tagging the training set was very time-consuming, and the experimental data must meet certain format requirements. [Conclusions] The proposed method can effectively identify scientific references among patent citations and extract their titles.