New Technology of Library and Information Service  2015, Vol. 31 Issue (2): 78-84    DOI: 10.11925/infotech.1003-3513.2015.02.11
A Parallel Naive Bayesian Network Public Opinion Fast Classification Algorithm Based on Hadoop Platform
Ma Bin1,2,3, Yin Lifeng1
1. Department of Information Science and Technology, Shandong University of Political Science and Law, Ji'nan 250014, China;
2. Key Laboratory of Forensic Evidence in Shandong Province, Ji'nan 250014, China;
3. School of Electrical Engineering, Shandong University, Ji'nan 250061, China
[Objective] This paper proposes a Network Public Opinion (NPO) classification method based on a parallel Naive Bayesian Classification Algorithm (NBCA) running in a Hadoop environment. [Context] NPO data are high-volume, widely distributed, and highly varied information assets, which makes accurate and fast classification difficult to achieve. [Methods] Exploiting the distributed storage and parallel processing features of the Hadoop platform, the NBCA is encapsulated for parallel execution: NPO documents are stored locally under HDFS and classified in parallel by MapReduce jobs. [Results] Tests of the MapReduce-encapsulated parallel NBCA show that its execution efficiency improves by 82% over the centralized method and that its classification accuracy exceeds 85%. [Conclusions] The proposed algorithm effectively improves the efficiency and capability of NPO classification.
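
The general idea of expressing Naive Bayes training as map and reduce steps can be illustrated with a minimal sketch. This is not the authors' implementation: it runs in a single Python process rather than on Hadoop, and the function names (`map_train`, `reduce_counts`, `train`, `classify`), the whitespace tokenizer, the Laplace smoothing, and the toy documents in `TOY_DOCS` are all assumptions made for illustration.

```python
import math
from collections import defaultdict
from itertools import chain


def map_train(doc):
    """Map step: emit (key, 1) pairs for one labelled document."""
    label, text = doc
    yield ("doc", label), 1                 # contributes to the class prior
    for word in text.lower().split():
        yield ("word", label, word), 1      # contributes to P(word | class)


def reduce_counts(pairs):
    """Reduce step: sum the values of all pairs that share a key."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


def train(docs):
    # On Hadoop the map calls would run in parallel over HDFS splits; here
    # they are chained sequentially to keep the sketch self-contained.
    return reduce_counts(chain.from_iterable(map_train(d) for d in docs))


def classify(counts, text):
    """Return the class with the highest log-posterior (Laplace smoothing)."""
    labels = sorted(k[1] for k in counts if k[0] == "doc")
    vocab = {k[2] for k in counts if k[0] == "word"}
    n_docs = sum(counts[("doc", c)] for c in labels)
    best_label, best_score = None, -math.inf
    for c in labels:
        total_words = sum(
            v for k, v in counts.items() if k[0] == "word" and k[1] == c
        )
        score = math.log(counts[("doc", c)] / n_docs)
        for word in text.lower().split():
            smoothed = counts.get(("word", c, word), 0) + 1
            score += math.log(smoothed / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = c, score
    return best_label


# Hypothetical toy corpus, not data from the paper.
TOY_DOCS = [
    ("sports", "team wins match"),
    ("sports", "match ends in draw"),
    ("finance", "stock market falls"),
    ("finance", "market rally lifts stock"),
]
```

Calling `train(TOY_DOCS)` yields the summed counts, and `classify(counts, "stock market")` then scores each class from those counts alone. Because the counts are plain sums, the map work can be split across nodes and merged in any order, which is what makes the algorithm a natural fit for MapReduce.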

Key words: Network Public Opinion; Hadoop; MapReduce; Naive Bayes; Classification
Received: 27 June 2014      Published: 17 March 2015
CLC number: TP391.1

Cite this article:

Ma Bin, Yin Lifeng. A Parallel Naive Bayesian Network Public Opinion Fast Classification Algorithm Based on Hadoop Platform. New Technology of Library and Information Service, 2015, 31(2): 78-84.

