%A Wu Zhenxin, Zhang Zhixiong, Xie Jing, Hu Jiying
%T Developing Web Archive System of International Institutions Based on IIPC Open Source Software
%0 Journal Article
%D 2015
%J Data Analysis and Knowledge Discovery
%R 10.11925/infotech.1003-3513.2015.04.01
%P 1-9
%V 31
%N 4
%U {https://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/abstract/article_4034.shtml}
%8 2015-04-25
%X

[Objective] To develop a Web archive system for international institutions. [Methods] Building on the IIPC open source software framework, this paper applies a three-layer expansion strategy at the acquisition end, adds automatic uploading and reporting functions to the acquisition client, develops a WARC parser that analyzes the content of WARC files, and uses Solr as the indexer. [Results] The system extends acquisition, raises the automation level of the workflow by adding function modules to the acquisition client, extracts more information through the WARC parser modules, and uses Solr to enrich the index and retrieval services. [Limitations] The platform has not yet been verified against a large-scale Web archive. [Conclusions] The expanded Web archive framework is distributed, extensible, and fully automatic.