Web Page Snapshot Based on Delta Encoding and Its Visualization
He Minggui1,2 Zhou Ning1 Rong Huigui1
1(School of Information Management, Wuhan University, Wuhan 430072, China) 2(School of Journalism and Communication, Wuhan University, Wuhan 430072, China)
To trace the changes of a Web page, a search engine needs to save many snapshots of it, which increases storage usage on the server. This paper applies delta encoding to reduce the disk space the snapshots require. To help users easily understand both the global changes across all snapshots and the detailed changes between any two snapshots, it also introduces a visualization method.
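As an illustration of the general idea of delta encoding (not the paper's own algorithm, which this abstract does not describe), the following minimal sketch stores a later snapshot as copy/insert operations against an earlier one. The function names `make_delta` and `apply_delta` are hypothetical, and Python's standard `difflib` is used here only as a convenient matcher.

```python
import difflib


def make_delta(old: str, new: str) -> list:
    """Encode `new` as copy/insert operations against `old` (illustrative only)."""
    ops = []
    matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            # Text already present in the old snapshot: store only its range.
            ops.append(("copy", i1, i2))
        elif tag in ("replace", "insert"):
            # Text that is new in this snapshot: store it literally.
            ops.append(("insert", new[j1:j2]))
        # "delete" needs no operation: that range of `old` is simply not copied.
    return ops


def apply_delta(old: str, ops: list) -> str:
    """Rebuild the newer snapshot from the older one plus the delta."""
    parts = []
    for op in ops:
        if op[0] == "copy":
            parts.append(old[op[1]:op[2]])
        else:
            parts.append(op[1])
    return "".join(parts)


# Two hypothetical snapshots of the same page.
v1 = "<html><body><p>Hello</p></body></html>"
v2 = "<html><body><p>Hello, world</p></body></html>"
delta = make_delta(v1, v2)
restored = apply_delta(v1, delta)
```

Only the first snapshot and the delta need to be kept on disk; the second snapshot is reconstructed on demand. The same list of operations also marks which regions changed, which is the kind of information a visualization of snapshot differences could build on.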
何明贵,周宁,荣辉贵. 基于增量的网页快照及其可视化[J]. 现代图书情报技术, 2009, 25(5): 72-75.
He Minggui, Zhou Ning, Rong Huigui. Web Page Snapshot Based on Delta Encoding and Its Visualization. New Technology of Library and Information Service, 2009, 25(5): 72-75.