New Technology of Library and Information Service  2013, Vol. 29 Issue (1): 63-68    DOI: 10.11925/infotech.1003-3513.2013.01.10
Research on Review Spam Recognition
Li Xiao, Ding Shengchun
Department of Information and Management, Nanjing University of Science & Technology, Nanjing 210094, China
Abstract  This paper analyses review spam from the perspective of information usefulness. Taking digital camera reviews as the research object, the authors build a data set and select 11 features covering three aspects: the review, the reviewer and the product. SVM models with 4 different kernel functions are used to identify product review spam. The RBF kernel gives the better identification, and optimizing its parameters C and γ raises the accuracy rate of review spam identification to 78.16% and the recall rate to 72.18%. A comparison of 4 different feature combinations shows that the combination of review, reviewer and product features performs best. Finally, a comparison with Logistic Regression shows that SVM performs significantly better.
Key words: SVM, Review spam, Feature selection, Kernel function, Product review
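The kernel comparison and RBF parameter tuning mentioned in the abstract can be illustrated with a minimal sketch. The abstract names only the RBF kernel, so the other three kernels shown here (linear, polynomial, sigmoid) are an assumption based on the standard set offered by common SVM toolkits. The function names, default values and the power-of-two search ranges for C and γ are likewise hypothetical and are not taken from the paper.

```python
import math
from itertools import product


def dot(u, v):
    """Inner product of two equal-length feature vectors."""
    return sum(a * b for a, b in zip(u, v))


def linear_kernel(u, v):
    # K(u, v) = u . v
    return dot(u, v)


def polynomial_kernel(u, v, gamma=1.0, coef0=0.0, degree=3):
    # K(u, v) = (gamma * u . v + coef0) ** degree
    return (gamma * dot(u, v) + coef0) ** degree


def rbf_kernel(u, v, gamma=1.0):
    # K(u, v) = exp(-gamma * ||u - v||^2)
    sq_dist = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-gamma * sq_dist)


def sigmoid_kernel(u, v, gamma=1.0, coef0=0.0):
    # K(u, v) = tanh(gamma * u . v + coef0)
    return math.tanh(gamma * dot(u, v) + coef0)


def rbf_param_grid(c_exps=range(-5, 16, 2), gamma_exps=range(-15, 4, 2)):
    """Candidate (C, gamma) pairs on a power-of-two grid.

    The exponent ranges are assumed defaults for illustration,
    not the values searched in the paper.
    """
    return [(2.0 ** c, 2.0 ** g) for c, g in product(c_exps, gamma_exps)]
```

In a setup like the one described, each review would be represented as an 11-dimensional feature vector, and each (C, γ) pair from the grid would be scored by cross-validation with an SVM trainer to pick the best-performing setting.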
Received: 08 January 2013      Published: 29 March 2013
CLC Number: TP391

Cite this article:

Li Xiao, Ding Shengchun. Research on Review Spam Recognition. New Technology of Library and Information Service, 2013, 29(1): 63-68.

URL:

http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.1003-3513.2013.01.10     OR     http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2013/V29/I1/63
