IMTS: Detecting Fake Reviews with Image and Text Semantics
Shi Yunmei1,2, Yuan Bo1,2, Zhang Le1,2, Lv Xueqiang1
1 Beijing Key Laboratory of Internet Culture and Digital Dissemination Research, Beijing Information Science and Technology University, Beijing 100101, China
2 School of Computer Science, Beijing Information Science and Technology University, Beijing 100101, China
[Objective] This paper proposes IMTS, a fake review detection method for Chinese e-commerce websites that integrates image information with text semantics, to address the proliferation of fake reviews posted by the "Internet Water Army" (paid posters). [Methods] First, we used a text convolutional neural network (TextCNN) and the BERT pre-trained model to extract features from the review text and obtain the corresponding feature vectors. Second, we integrated reviewer features by splicing the semantic features of the review text with the output features of the reviewer ID, which strengthens the model's capture of the overall semantic information. Third, we used a Residual Network (ResNet) to extract visual features from the pictures that users posted with their reviews. Finally, we fused the text features and the visual features to detect fake reviews. [Results] On a self-built multimodal Chinese fake review dataset, IMTS achieved 96.36% accuracy, 96.35% recall, and a 96.35% F1 score. [Limitations] The dataset is small in scale, and the text processing stage relies on the BERT pre-trained model. [Conclusions] The proposed method effectively improves the overall detection accuracy for fake reviews.
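The pipeline in [Methods] can be illustrated with a minimal sketch of feature-level fusion. Everything in it is an assumption made for illustration: the vectors are random placeholders rather than real TextCNN/BERT, reviewer-ID, or ResNet outputs, the vector sizes are arbitrary, the final multimodal fusion is taken to be plain concatenation (the abstract does not specify the fusion operator), and the classifier is a single untrained logistic unit. The names `concat_features` and `logistic_score` are hypothetical and do not come from the paper.

```python
import math
import random


def concat_features(*vectors):
    """Splice feature vectors end to end into one fused vector."""
    fused = []
    for vec in vectors:
        fused.extend(vec)
    return fused


def logistic_score(features, weights, bias):
    """Score a fused vector with one logistic unit; returns a value in (0, 1)."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return 1.0 / (1.0 + math.exp(-z))


# Random placeholders standing in for real encoder outputs (sizes are arbitrary).
rng = random.Random(0)
text_vec = [rng.uniform(-1, 1) for _ in range(6)]      # stand-in for the text encoder output
reviewer_vec = [rng.uniform(-1, 1) for _ in range(2)]  # stand-in for reviewer-ID features
image_vec = [rng.uniform(-1, 1) for _ in range(8)]     # stand-in for ResNet visual features

# Step 1: splice review text semantics with reviewer features.
text_side = concat_features(text_vec, reviewer_vec)

# Step 2: fuse the text side with the visual features (concatenation is assumed).
fused = concat_features(text_side, image_vec)

# Step 3: classify the fused vector. The weights are untrained and illustrative only.
weights = [rng.uniform(-0.5, 0.5) for _ in fused]
p_fake = logistic_score(fused, weights, bias=0.0)

print(f"fused length: {len(fused)}, P(fake): {p_fake:.3f}")
```

In a real system the placeholders would be replaced by the outputs of trained encoders, and the classifier weights would be learned jointly with them. The sketch only shows the order in which the features are combined.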
Shi Yunmei, Yuan Bo, Zhang Le, Lv Xueqiang. IMTS: Detecting Fake Reviews with Image and Text Semantics[J]. Data Analysis and Knowledge Discovery, 2022, 6(8): 84-96.