Unbalanced Fake Review Processing Model Based on Cost-Sensitive Learning
Liu Meiling, Shang Yue, Zhao Tiejun, Zhou Jiyun
(School of Information and Computer Engineering, Northeast Forestry University, Harbin 150006, China)
(Department of Computer Science, Harbin Institute of Technology, Harbin 150001, China)
(Lieber Institute, Johns Hopkins University, Baltimore, MD 21218, USA)
Abstract
[Objective] This study aims to improve the learning of deep semantic information from review text in fake review detection and to address the severe class imbalance in this task. [Methods] Using the user behavior features and text features of the data, the model automatically learns a cost-sensitive matrix by calculating inter-class separability, which strengthens its ability to learn from unbalanced data; BERT's text-encoding capability is then used to optimize the model further. [Results] In extensive experiments on the YelpCHI dataset, the proposed model improves the F1 value by 18% and the AUC value by 12% over existing advanced methods. [Limitations] The applicability of the proposed method to other research fields remains to be explored. [Conclusions] Using user behavior features and review text features as the feature set for computing the separability between the fake review class and the genuine review class effectively improves fake review detection performance.
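The idea of deriving a cost matrix from inter-class separability can be illustrated with a minimal sketch. It is not the authors' implementation: the Fisher-style scatter ratio, the rule that maps separability to a misclassification cost, and the function names `fisher_separability`, `learn_cost_matrix`, and `predict_min_cost` are all assumptions made for illustration, and the feature matrix `X` stands in for whatever behavior and text features are used.

```python
import numpy as np


def fisher_separability(X, y):
    """Fisher-style separability (assumed measure): between-class scatter
    divided by within-class scatter, summed over all feature dimensions."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    mean_all = X.mean(axis=0)
    s_between, s_within = 0.0, 0.0
    for c in np.unique(y):
        Xc = X[y == c]
        mean_c = Xc.mean(axis=0)
        s_between += len(Xc) * np.sum((mean_c - mean_all) ** 2)
        s_within += np.sum((Xc - mean_c) ** 2)
    return s_between / (s_within + 1e-12)  # epsilon guards against zero scatter


def learn_cost_matrix(X, y):
    """Hypothetical rule for binary labels {0, 1}.

    C[i, j] is the cost of predicting class j when the true class is i.
    Misclassifying the majority class costs 1. Misclassifying the minority
    class costs between 1 and the imbalance ratio: the harder the classes
    are to separate, the closer the cost gets to the imbalance ratio.
    """
    y = np.asarray(y)
    counts = np.bincount(y, minlength=2)
    minority = int(np.argmin(counts))
    ratio = counts.max() / counts.min()
    j_sep = fisher_separability(X, y)
    C = np.ones((2, 2)) - np.eye(2)  # zero cost on the diagonal
    C[minority, 1 - minority] = 1.0 + (ratio - 1.0) / (1.0 + j_sep)
    return C


def predict_min_cost(proba, C):
    """Pick the class with the lowest expected cost for each row of
    class probabilities (for example, a classifier's softmax output)."""
    proba = np.atleast_2d(np.asarray(proba, dtype=float))
    return np.argmin(proba @ C, axis=1)


# Toy data: four genuine reviews (label 0) and two fake reviews (label 1).
X = np.array([[0.0], [1.0], [2.0], [3.0], [1.5], [2.5]])
y = np.array([0, 0, 0, 0, 1, 1])
C = learn_cost_matrix(X, y)
labels = predict_min_cost([[0.6, 0.4], [0.9, 0.1]], C)
```

In this sketch a review with a 0.4 fake probability is still flagged as fake, because missing a fake review is costed higher than a false alarm. The same matrix could instead be used to weight a training loss; which of the two the paper does is not stated in the abstract.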
Published: 11 November 2022