New Technology of Library and Information Service  2010, Vol. 26 Issue (12): 76-80    DOI: 10.11925/infotech.1003-3513.2010.12.13
Research on Automatic Archiving System for Institutional Repositories
Cui Yuhong
Beijing Institute of Technology Library, Beijing 100081, China
This paper introduces an experimental system (DAAS) that automatically harvests articles by an institution's researchers and ingests their metadata into a local DSpace platform. The system implements a semi-automatic approach to populating institutional repositories (IRs), consisting of information filtering, metadata extraction, copyright verification, metadata mapping and data archiving. Building on key Nutch components, the paper describes in detail how the system parses URLs and extracts metadata from unstructured Web pages using a rule-based filter. Future research will focus on machine-learning algorithms.
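
As a rough illustration of the filtering, extraction and mapping steps named in the abstract, the following minimal Python sketch may help. It is not the DAAS implementation: the URL rules, the `citation_*` meta tag names, the `FIELD_MAP` table and the function names are all assumptions made for the example, and the rule order (first matching rule wins, no match means reject) only imitates the general style of a rule-based URL filter.

```python
import re
from html.parser import HTMLParser

# Hypothetical URL rules: "+" accepts, "-" rejects.
# The first rule that matches decides; a URL matching no rule is rejected.
URL_RULES = [
    ("-", re.compile(r"\.(gif|jpg|png|css|js)$", re.IGNORECASE)),
    ("+", re.compile(r"^https?://[^/]+/article/\d+")),
]


def accept_url(url):
    """Return True if the URL passes the rule-based filter."""
    for sign, pattern in URL_RULES:
        if pattern.search(url):
            return sign == "+"
    return False


# Hypothetical mapping from page <meta> names to DSpace-style
# Dublin Core fields (assumed for illustration only).
FIELD_MAP = {
    "citation_title": "dc.title",
    "citation_author": "dc.contributor.author",
    "citation_date": "dc.date.issued",
}


class MetaExtractor(HTMLParser):
    """Collect mapped metadata values from <meta name=... content=...> tags."""

    def __init__(self):
        super().__init__()
        self.record = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        field = FIELD_MAP.get(attr.get("name", ""))
        if field and attr.get("content"):
            # Fields such as authors may repeat, so values are kept in lists.
            self.record.setdefault(field, []).append(attr["content"])


def extract_record(html):
    """Parse an HTML page and return a dict of mapped metadata fields."""
    parser = MetaExtractor()
    parser.feed(html)
    return parser.record
```

In this sketch a harvested page would first be checked with `accept_url`, and only accepted pages would be passed to `extract_record`; the copyright verification and archiving steps are left out because the abstract gives no detail on them.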

Key words: Institutional repositories; Automatic archive; Information extraction; Nutch; DSpace
Received: 08 October 2010      Published: 07 January 2011



Cui Yuhong. Research on Automatic Archiving System for Institutional Repositories. New Technology of Library and Information Service, 2010, 26(12): 76-80.

