This paper analyzes the Map/Reduce algorithm and uses the open-source Hadoop software to design a highly fault-tolerant, high-performance distributed search engine that addresses the problems of large-scale data processing and storage.
Wu Baogui (吴宝贵), Ding Zhenguo (丁振国). Research of Distributed Search Engine Based on Map/Reduce [J]. New Technology of Library and Information Service (现代图书情报技术), 2007, 2(8): 52-55.