New Technology of Library and Information Service  2007, Vol. 2 Issue (8): 52-55    DOI: 10.11925/infotech.1003-3513.2007.08.12
Research of Distributed Search Engine Based on Map/Reduce
Wu Baogui, Ding Zhenguo
1 (School of Economics and Management, Xidian University, Xi’an 710071, China)
2 (School of Network Education, Xidian University, Xi’an 710071, China)
Abstract  

This paper analyzes the Map/Reduce algorithm and uses the open-source Hadoop software to design a highly fault-tolerant, high-performance distributed search engine that can cope with large-scale data processing and storage problems.
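To make the Map/Reduce model named in the abstract concrete, the following is a minimal single-process sketch of how an inverted index (the core structure of a search engine) can be built in map/reduce style. It is an illustration only, not the authors' design and not Hadoop's API: the function names `map_phase`, `shuffle`, `reduce_phase`, and `build_inverted_index`, the whitespace tokenizer, and the sample documents are all assumptions made for the example. In a real cluster the framework would run the map and reduce calls on many machines and perform the grouping step itself.

```python
from collections import defaultdict


def map_phase(doc_id, text):
    """Map step: emit one (term, doc_id) pair per token in a document."""
    for term in text.lower().split():
        yield term, doc_id


def shuffle(pairs):
    """Group values by key, standing in for the framework's shuffle stage."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(term, doc_ids):
    """Reduce step: collapse the document ids of one term into a posting list."""
    return term, sorted(set(doc_ids))


def build_inverted_index(docs):
    """Run map, shuffle, and reduce in sequence over a dict of documents."""
    pairs = [
        pair
        for doc_id, text in docs.items()
        for pair in map_phase(doc_id, text)
    ]
    grouped = shuffle(pairs)
    return dict(reduce_phase(term, ids) for term, ids in grouped.items())


# Hypothetical sample documents, used only to exercise the sketch.
docs = {
    "d1": "hadoop stores large files",
    "d2": "hadoop runs map reduce jobs",
}
index = build_inverted_index(docs)
```

Because each map call looks at a single document and each reduce call looks at a single term, both steps can be distributed and rerun independently after a node failure, which is the property a fault-tolerant design relies on.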

Key words: Map/Reduce; Distributed search engine; Hadoop
Received: 18 June 2007      Published: 25 August 2007
Classification number: G350
Corresponding Authors: Wu Baogui     E-mail: bg1011@163.com
About author: Wu Baogui, Ding Zhenguo

Cite this article:

Wu Baogui,Ding Zhenguo. Research of Distributed Search Engine Based on Map/Reduce. New Technology of Library and Information Service, 2007, 2(8): 52-55.

URL:

https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.1003-3513.2007.08.12     OR     https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2007/V2/I8/52


[1] Yang Aidong, Liu Dongsu. Hadoop Based Public Opinion Monitoring System for Micro-blogs[J]. New Technology of Library and Information Service, 2016, 32(5): 56-63.
[2] Fan Yunman, Hong Na, Qian Qing, Fang An. The Research Practices of DataBase Cloud Storage Using Hadoop/HBase for the Pharmacogenomics Data[J]. New Technology of Library and Information Service, 2015, 31(5): 73-79.
[3] Ma Bin, Yin Lifeng. A Parallel Naive Bayesian Network Public Opinion Fast Classification Algorithm Based on Hadoop Platform[J]. New Technology of Library and Information Service, 2015, 31(2): 78-84.
[4] Zhao Huaming. Research and Implementation of Textual Clustering in Distributed Environment[J]. New Technology of Library and Information Service, 2015, 31(1): 82-88.
[5] Xiao Qiang, Zhu Qinghua, Zheng Hua, Wu Kewen. Design and Implementation of Distributed Collaborative Filtering Algorithm on Hadoop[J]. New Technology of Library and Information Service, 2013, 29(1): 83-89.
[6] Kang Liyun, Wang Xiaoyue, Bai Rujiang. Analysis of MapReduce Principle and Its Main Implementation Platforms[J]. New Technology of Library and Information Service, 2012, 28(2): 60-67.
[7] Zhao Huaming. Research and Implementation of Textual Similarity in Distributed Environment[J]. New Technology of Library and Information Service, 2011, 27(7/8): 14-20.
[8] Zhang Xingwang, Li Chenhui, Qin Xiaozhu. Research and Initial Implementation of Large-scale Data Processing Based on Cloud Computing[J]. New Technology of Library and Information Service, 2011, 27(4): 17-23.
[9] Zhao Huaming. Building the Open Source Mass Data Mining Platform Based on Cloud Computing[J]. New Technology of Library and Information Service, 2010, 26(10): 76-81.
[10] Yang Daiqing, Zhang Zhixiong. A Method for Generating Co-occurrence Matrix of Mass Data Based on Hadoop[J]. New Technology of Library and Information Service, 2009, 25(4): 23-26.
[11] Liu Feng, Shi Shuicai, Xiao Shibin, Wang Hongwei. A Design of Distributed News & Weblog Search Engine Based on RSS[J]. New Technology of Library and Information Service, 2007, 2(1): 29-32.