Data Analysis and Knowledge Discovery
Unbalanced Fake Review Processing Model Based on Cost-Sensitive Learning
Liu Meiling, Shang Yue, Zhao Tiejun, Zhou Jiyun
(School of Information and Computer Engineering, Northeast Forestry University, Harbin 150006, China) (Department of Computer Science, Harbin Institute of Technology, Harbin 150001, China) (Lieber Institute, Johns Hopkins University, Baltimore, MD 21218, USA)
Abstract  

[Objective] This study aims to improve the learning of deep semantic information from review text in fake review detection and to address the severe class imbalance in this task. [Methods] Using the user behavior features and text features of the data, the model automatically learns a cost-sensitive matrix by computing inter-class separability, which strengthens its ability to learn from unbalanced data; it is further optimized with BERT as the text encoder. [Results] In extensive experiments on the YelpCHI dataset, the proposed model improves the F1 score by 18% and the AUC by 12% over existing advanced methods. [Limitations] The applicability of the proposed method to other research fields remains to be explored. [Conclusions] Computing class separability between the fake review class and the genuine review class over a feature set of user behavior features and review text features effectively improves fake review detection performance.
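
The abstract does not give the formulas used, so the following is only a minimal sketch of the general idea of deriving a cost matrix from class separability. It assumes a Fisher-style separability ratio (between-class scatter over within-class scatter) and a hypothetical scaling rule in which the minority-class penalty grows with the imbalance ratio and shrinks as the classes become easier to separate. The function names `class_separability`, `cost_matrix`, and `expected_cost` and the scaling rule are illustrative assumptions, not the authors' method.

```python
from collections import defaultdict


def class_separability(X, y):
    """Fisher-style ratio (assumed here): between-class / within-class scatter."""
    dim = len(X[0])
    groups = defaultdict(list)
    for row, label in zip(X, y):
        groups[label].append(row)
    overall_mean = [sum(row[d] for row in X) / len(X) for d in range(dim)]
    between, within = 0.0, 0.0
    for rows in groups.values():
        mean = [sum(r[d] for r in rows) / len(rows) for d in range(dim)]
        between += len(rows) * sum(
            (mean[d] - overall_mean[d]) ** 2 for d in range(dim)
        )
        within += sum((r[d] - mean[d]) ** 2 for r in rows for d in range(dim))
    # Small constant avoids division by zero when all classes are single points.
    return between / (within + 1e-12)


def cost_matrix(X, y, num_classes=2):
    """cost[i][j] is the penalty for predicting j when the true class is i.

    Hypothetical rule: the rarer the true class and the lower the
    separability, the larger the penalty. Every class in
    range(num_classes) must appear at least once in y.
    """
    counts = [sum(1 for label in y if label == c) for c in range(num_classes)]
    separability = class_separability(X, y)
    largest = max(counts)
    cost = [[0.0] * num_classes for _ in range(num_classes)]
    for i in range(num_classes):
        imbalance = largest / counts[i]
        for j in range(num_classes):
            if i != j:
                cost[i][j] = 1.0 + (imbalance - 1.0) / (1.0 + separability)
    return cost


def expected_cost(probs, y, cost):
    """Mean expected misclassification cost of predicted class probabilities."""
    total = 0.0
    for p, label in zip(probs, y):
        total += sum(cost[label][j] * p[j] for j in range(len(p)))
    return total / len(y)


# Toy data: four genuine reviews (class 0) and one fake review (class 1).
# The two columns stand in for behavior and text features.
features = [[0.0, 1.0], [0.2, 1.2], [0.1, 0.9], [0.3, 1.1], [2.0, 3.0]]
labels = [0, 0, 0, 0, 1]
toy_cost = cost_matrix(features, labels)
```

In a full model, a loss like `expected_cost` would be applied to the classifier's output probabilities (for example, on top of BERT text encodings combined with behavior features) so that errors on the rare fake class weigh more during training.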

Key words: fake review detection; class separability computation; cost-sensitive learning; unbalanced data processing
Published: 11 November 2022
ZTFLH:  TP393,G250  

Cite this article:

Liu Meiling, Shang Yue, Zhao Tiejun, Zhou Jiyun. Unbalanced Fake Review Processing Model Based on Cost-Sensitive Learning. Data Analysis and Knowledge Discovery, 2023, 7(6): 113-122.

URL:

https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.2096-3467.2022-0442     OR     https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y0/V/I/1

[1] Liu Meiling, Shang Yue, Zhao Tiejun, Zhou Jiyun. Unbalanced Fake Review Processing Model Based on Cost-Sensitive Learning[J]. Data Analysis and Knowledge Discovery, 2023, 7(6): 113-122.
[2] Zhang Yunqiu, Li Bocheng, Chen Yan. Automatic Classification with Unbalanced Data for Electronic Medical Records[J]. Data Analysis and Knowledge Discovery, 2022, 6(2/3): 233-241.
[3] Jiafen Wu, Feicheng Ma. Detecting Product Review Spam: A Survey[J]. Data Analysis and Knowledge Discovery, 2019, 3(9): 1-15.
[4] Chen Yanfang, Li Zhiyu. Research on Product Review Attribute-Based of Emotion Evaluate Review Spam Detection[J]. New Technology of Library and Information Service, 2014, 30(9): 81-90.