New Technology of Library and Information Service  2013, Vol. 29 Issue (1): 63-68    DOI: 10.11925/infotech.1003-3513.2013.01.10
Research on Review Spam Recognition
Li Xiao, Ding Shengchun
Department of Information and Management, Nanjing University of Science & Technology, Nanjing 210094, China
Abstract  This paper analyses review spam from the perspective of information usefulness. Digital camera reviews are selected as the research object and used to build the data set, and 11 features are chosen from three aspects: the review, the reviewer and the product. Four different kernel functions are used in an SVM model to identify product review spam. The RBF kernel identifies spam better, and optimizing its parameters C and γ raises the accuracy rate of review spam identification to 78.16% and the recall rate to 72.18%. A comparison of four different feature combinations shows that combining review, reviewer and product features works best. Finally, a comparison with Logistic Regression shows that SVM performs significantly better.
Key words: SVM; Review spam; Feature selection; Kernel function; Product review
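The abstract mentions four SVM kernel functions, tuning of C and γ for the RBF kernel, and accuracy and recall figures, but it does not name the kernels or the search range. The sketch below is only an illustration under stated assumptions: it takes the four kernels to be the ones LibSVM offers (linear, polynomial, RBF, sigmoid), uses the exponential grid suggested in the LibSVM practical guide as a hypothetical search space, and the function names (`rbf_parameter_grid`, `accuracy_recall`) are invented for this example rather than taken from the paper.

```python
import math
from itertools import product


def dot(x, y):
    """Inner product of two equal-length feature vectors."""
    return sum(a * b for a, b in zip(x, y))


def linear_kernel(x, y):
    # K(x, y) = x . y
    return dot(x, y)


def polynomial_kernel(x, y, gamma=1.0, coef0=0.0, degree=3):
    # K(x, y) = (gamma * x . y + coef0) ** degree
    return (gamma * dot(x, y) + coef0) ** degree


def rbf_kernel(x, y, gamma=1.0):
    # K(x, y) = exp(-gamma * ||x - y||^2)
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def sigmoid_kernel(x, y, gamma=1.0, coef0=0.0):
    # K(x, y) = tanh(gamma * x . y + coef0)
    return math.tanh(gamma * dot(x, y) + coef0)


def rbf_parameter_grid(c_exponents=range(-5, 16, 2),
                       gamma_exponents=range(-15, 4, 2)):
    """Candidate (C, gamma) pairs on an exponential grid.

    Assumption: the ranges follow the LibSVM practical guide, not the paper.
    """
    return [(2.0 ** c, 2.0 ** g)
            for c, g in product(c_exponents, gamma_exponents)]


def accuracy_recall(y_true, y_pred, positive=1):
    """Accuracy over all reviews and recall for the spam (positive) class."""
    pairs = list(zip(y_true, y_pred))
    correct = sum(1 for t, p in pairs if t == p)
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    accuracy = correct / len(pairs) if pairs else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, recall
```

In a tuning loop of this kind, each (C, γ) pair from `rbf_parameter_grid()` would be scored by cross-validation and the best-scoring pair kept; the classifier training itself is left to a library such as LibSVM.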
Received: 08 January 2013      Published: 29 March 2013
CLC number: TP391
Cite this article: Li Xiao, Ding Shengchun. Research on Review Spam Recognition. New Technology of Library and Information Service, 2013, 29(1): 63-68.

URL: https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.1003-3513.2013.01.10 OR https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2013/V29/I1/63
