%A Chen Junlin
%A Zhang Wende
%T Optimizing Extraction of Science Documents’ Metadata in PDF Format Based on XSLT
%0 Journal Article
%D 2007
%J Data Analysis and Knowledge Discovery
%R 10.11925/infotech.1003-3513.2007.02.04
%P 18-23
%V 2
%N 2
%U {https://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/abstract/article_365.shtml}
%8 2007-02-25
%X

This paper first introduces a format conversion tool and XSLT, the language used to write the extraction rules, and then briefly analyzes the intermediate documents produced by converting PDF to HTML. It next discusses the metadata problems found in scientific documents in PDF format, and finally presents methods for solving them.