New Technology of Library and Information Service  2015, Vol. 31 Issue (10): 81-87    DOI: 10.11925/infotech.1003-3513.2015.10.11
Automatic Annotation of Bibliographical References in Chinese Patent Documents
Jiang Chuntao
Department of Computer Science and Technology, Nanjing University, Nanjing 210023, China;
Patent Information and Service Center of Jiangsu Province, Nanjing 210008, China
[Objective] This paper aims to automatically annotate four types of bibliographical references in Chinese patent documents: patents, standards, papers, and other public documents such as monographs. [Methods] A pattern matching approach is used to annotate references to patents, standards, and public documents. A two-phase machine learning approach is used to annotate paper references: the first phase automatically detects the sentences that contain citation information, and the second phase extracts 6 categories of bibliographic features from those sentences. [Results] Ten-fold cross validation shows that the accuracy for annotating patents is 100%, the precision and recall for annotating standards are 92% and 94% respectively, and the precision and recall for annotating public documents are 80% and 71% respectively. For paper references, the precision and recall are 95.7% and 96.0% in phase one and 95.3% and 94.9% in phase two. [Limitations] The pattern matching approach requires manually analyzing a large number of patent documents, and the training model used by the proposed machine learning approach is relatively small. [Conclusions] The pattern matching approach achieves a performance of over 92% when annotating patents and standards, and the machine learning approach achieves 95% when annotating papers.

Received: 14 April 2015      Published: 06 April 2016
CLC Number: TP393

Jiang Chuntao. Automatic Annotation of Bibliographical References in Chinese Patent Documents. New Technology of Library and Information Service, 2015, 31(10): 81-87.

