A Method for Generating Co-occurrence Matrix of Mass Data Based on Hadoop
Yang Daiqing (1,2), Zhang Zhixiong (1)
(1) National Science Library, Chinese Academy of Sciences, Beijing 100190, China
(2) Institute of Scientific and Technical Information of China, Beijing 100038, China
Processing massive data is a focal point of information technology. This paper introduces the architecture of Hadoop, an open-source parallel computing system, analyzes the MapReduce programming framework built on Hadoop, and proposes a method for generating co-occurrence matrices from massive data through multiple MapReduce operations.
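The abstract does not spell out the individual MapReduce steps, so the following is only a minimal in-memory sketch of the general idea, not the authors' implementation. It assumes each input record is a list of terms (for example, the keywords of one article) and chains two hypothetical jobs: one that counts term occurrences and one that counts unordered term pairs. The names `run_job`, `map_terms`, `map_pairs`, and `sum_reducer` are illustrative and do not come from Hadoop or from the paper.

```python
from collections import defaultdict
from itertools import combinations


def map_terms(record):
    """Map step of job 1 (assumed): emit (term, 1) for each distinct term."""
    for term in set(record):
        yield term, 1


def map_pairs(record):
    """Map step of job 2 (assumed): emit ((a, b), 1) for each unordered pair.

    Sorting the distinct terms gives every pair one canonical key, so
    (a, b) and (b, a) end up in the same reduce group.
    """
    for a, b in combinations(sorted(set(record)), 2):
        yield (a, b), 1


def sum_reducer(values):
    """Reduce step shared by both jobs: sum the counts for one key."""
    return sum(values)


def run_job(records, mapper, reducer):
    """Simulate one MapReduce job in memory: map, shuffle by key, reduce."""
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)  # shuffle: group values by key
    return {key: reducer(values) for key, values in groups.items()}


# Toy input: three records, each a list of terms.
records = [
    ["hadoop", "mapreduce", "hdfs"],
    ["hadoop", "mapreduce"],
    ["hdfs", "hadoop"],
]

term_counts = run_job(records, map_terms, sum_reducer)  # matrix diagonal
pair_counts = run_job(records, map_pairs, sum_reducer)  # off-diagonal cells
```

In this toy run, `pair_counts[("hadoop", "mapreduce")]` is 2 because the two terms appear together in two records. On a real cluster each job would read from and write to the distributed file system, and the shuffle would be carried out by the framework rather than by a dictionary.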
Yang Daiqing, Zhang Zhixiong. A Method for Generating Co-occurrence Matrix of Mass Data Based on Hadoop [J]. New Technology of Library and Information Service (现代图书情报技术), 2009, 25(4): 23-26.