New Technology of Library and Information Service  2009, Vol. 3 Issue (2): 102-106    DOI: 10.11925/infotech.1003-3513.2009.02.17
Automatic Extraction of Semantic Metadata from PDF Research Papers
Zhang Xiuxiu   Ma Jianxia
(The Lanzhou Branch of National Science Library, Chinese Academy of Sciences, Lanzhou 730000, China)
This paper analyzes the content streams of PDF files on the basis of the PDF file structure, and automatically extracts semantic metadata from research papers through rule-based matching and format-based locating. Experimental results show that the method effectively extracts important semantic metadata such as title and author.
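
The abstract gives no implementation details, so the following is only a minimal Python sketch of how format-based locating and rule-based matching could work together. It is not the authors' method. It assumes the text of the first page has already been pulled out of the PDF content stream into spans that carry a font size and a vertical position. The `Span` class, the function names, the largest-font heuristic for the title, and the font sizes and coordinates in the sample data are all hypothetical and chosen for illustration.

```python
import re
from dataclasses import dataclass


@dataclass
class Span:
    """One run of text from a page's content stream (hypothetical structure)."""
    text: str
    font_size: float
    y: float  # distance from the top of the page; smaller means higher up


# Rule-based matching: an e-mail pattern, used here to reject contact lines.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def extract_title(spans):
    """Format-based locating: join the spans set in the largest font size."""
    largest = max(s.font_size for s in spans)
    title_spans = sorted(
        (s for s in spans if s.font_size == largest), key=lambda s: s.y
    )
    return " ".join(s.text.strip() for s in title_spans)


def extract_authors(spans):
    """Return names from the first line below the title that is not an
    affiliation (assumed to start with a parenthesis) or an e-mail line."""
    largest = max(s.font_size for s in spans)
    title_bottom = max(s.y for s in spans if s.font_size == largest)
    below = sorted((s for s in spans if s.y > title_bottom), key=lambda s: s.y)
    for s in below:
        line = s.text.strip()
        if line and not EMAIL_RE.search(line) and not line.startswith("("):
            # Split on commas, semicolons, runs of spaces, or the word "and".
            parts = re.split(r",|;|\s{2,}|\band\b", line)
            return [p.strip() for p in parts if p.strip()]
    return []


# Illustrative input only: font sizes and positions are invented.
first_page = [
    Span("Automatic Extraction of Semantic Metadata", 18.0, 80.0),
    Span("from PDF Research Papers", 18.0, 102.0),
    Span("Zhang Xiuxiu   Ma Jianxia", 12.0, 130.0),
    Span("(The Lanzhou Branch of National Science Library)", 9.0, 148.0),
]

print(extract_title(first_page))
print(extract_authors(first_page))
```

The split between the two functions mirrors the two ideas named in the abstract: layout cues (font size, position) narrow down where a field is likely to sit, and textual rules then accept or reject candidate lines. A real extractor would need many more rules to cope with multi-line author blocks, footnote markers, and varied page layouts.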

Key words: PDF      Research paper      Semantic metadata      Automatic extraction
Received: 03 November 2008      Published: 25 February 2009


Corresponding Author: Zhang Xiuxiu
About authors: Zhang Xiuxiu, Ma Jianxia

Cite this article:

Zhang Xiuxiu, Ma Jianxia. Automatic Extraction of Semantic Metadata from PDF Research Papers. New Technology of Library and Information Service, 2009, 3(2): 102-106.

