Support for Cross-Domain Methods of Identifying Fake Comments of Chinese
Gu Yan1, Zheng Kaihong1, Hu Yongjun1, Song Yishan2, Liu Dongping3
1 School of Management, Guangzhou University, Guangzhou 510006, China
2 School of Data Science, The Chinese University of Hong Kong, Shenzhen 518000, China
3 Partner & Business Enabling, Amazon Web Services GCR, Beijing 100015, China
Abstract [Objective] This paper constructs a cross-domain Chinese fake review identification model (CFEE) for multi-domain datasets. The model extracts semantic information from review texts and addresses shortcomings of traditional recognition models. [Methods] First, we established 11 rules for constructing fake review datasets and created a multi-domain dataset. Second, we designed the CFEE model to identify Chinese fake reviews across domains. The model extracts deep semantic information with the ERNIE pre-trained model and identifies hidden reviews based on the emotional attributes of the text. Finally, it projects the text information onto the word-relation dimension with a convolutional neural network and classifies reviews using the fused neural network features. [Results] The CFEE model achieved an F1 value of 91.52% on the multi-domain Chinese fake review dataset. On the single-domain datasets for mobile phones, food, clothing, and household appliances, its F1 values were 85.71%, 79.59%, 85.71%, and 85.00%, respectively. It significantly outperformed existing models. [Limitations] The manual annotation involves some subjectivity. [Conclusions] The proposed method can effectively identify Chinese fake reviews across domains.
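For readers unfamiliar with the metric reported in the results, the following is a minimal sketch of how an F1 value can be computed for a binary fake/genuine classifier. It assumes F1 is taken on the fake class as the harmonic mean of precision and recall; the abstract does not state which averaging scheme the authors used, and the function name and the label lists are hypothetical illustrations, not data from the paper.

```python
def f1_score(y_true, y_pred, positive=1):
    """Return F1 for the positive class: the harmonic mean of precision and recall.

    Assumption: label 1 marks a fake review and label 0 a genuine one.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)  # true positives
    fp = sum(1 for t, p in pairs if t != positive and p == positive)  # false positives
    fn = sum(1 for t, p in pairs if t == positive and p != positive)  # false negatives
    if tp == 0:
        # No correct positive predictions: precision or recall is zero or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels for eight reviews (1 = fake, 0 = genuine).
y_true = [1, 1, 1, 0, 0, 1, 0, 1]
y_pred = [1, 1, 0, 0, 1, 1, 0, 1]
score = f1_score(y_true, y_pred)
print(f"F1 = {score:.2%}")
```

In this toy example there are four true positives, one false positive, and one false negative, so precision and recall are both 0.8 and the F1 value is 80%.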
Received: 21 December 2022
Published: 08 January 2024
Fund: National Social Science Fund of China (18BGL236); National Key R&D Program of China (2021YFB3301801); 2nd Phase of the Ministry of Education Supply and Demand Docking Employment Education Project (20230103480)
Corresponding Author: Hu Yongjun, ORCID: 0000-0002-9395-7535, E-mail: hyjsdu96@126.com