Research on Building an Open Access Search Engine with Nutch
Cui Yuhong, Zhang Kui
Beijing Institute of Technology Library, Beijing 100081, China
Abstract: This paper studies an integrated retrieval mechanism for open access systems and uses Web crawling to build DSearch, a distributed search system based on Nutch that provides an efficient, flexible, and customizable search tool. Three key technologies are introduced: distributed cluster configuration, modification of the Chinese word segmenter, and index settings. Finally, the functions of DSearch are evaluated with selected feed lists.
Received: 12 July 2010
Published: 04 January 2011