Research on Review Spam Recognition
Li Xiao, Ding Shengchun
Department of Information and Management, Nanjing University of Science & Technology, Nanjing 210094, China
Abstract This paper analyses review spam from the perspective of information usefulness. Digital camera reviews are selected as the research object and used to build the data set. Eleven features are chosen from three aspects (review, reviewer and product), and an SVM model with four different kernel functions is used to identify product review spam. The parameters C and γ of the RBF kernel, which identifies spam better than the other kernels, are then optimized, raising the accuracy of review spam identification to 78.16% and the recall to 72.18%. A comparison of four feature combinations shows that combining review, reviewer and product features performs best. Finally, a comparison with Logistic Regression shows that SVM performs significantly better.
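The abstract does not name the four kernel functions or the search ranges used for C and γ. The sketch below is only an illustration under two assumptions: that the four kernels are the standard ones offered by LibSVM (linear, polynomial, RBF and sigmoid), and that C and γ are searched over an exponential grid of the kind suggested in the LibSVM practical guide. The function names and default values are hypothetical and are not taken from the paper.

```python
import math


def linear(x, y):
    """Linear kernel: the plain dot product of two feature vectors."""
    return sum(a * b for a, b in zip(x, y))


def polynomial(x, y, gamma=1.0, coef0=0.0, degree=3):
    """Polynomial kernel: (gamma * <x, y> + coef0) ** degree."""
    return (gamma * linear(x, y) + coef0) ** degree


def rbf(x, y, gamma=1.0):
    """RBF kernel: exp(-gamma * ||x - y||^2)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


def sigmoid(x, y, gamma=1.0, coef0=0.0):
    """Sigmoid kernel: tanh(gamma * <x, y> + coef0)."""
    return math.tanh(gamma * linear(x, y) + coef0)


def rbf_grid(c_exponents=range(-5, 16, 2), gamma_exponents=range(-15, 4, 2)):
    """Candidate (C, gamma) pairs on an exponential grid.

    The exponent ranges are an assumption (a common LibSVM-style default),
    not the values used by the authors.
    """
    return [(2.0 ** c, 2.0 ** g) for c in c_exponents for g in gamma_exponents]
```

In a full experiment each (C, γ) pair from `rbf_grid()` would be scored by cross-validation on the 11-dimensional feature vectors, and the pair with the best score would be kept.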
Received: 08 January 2013
Published: 29 March 2013