IMTS: Fake review detection method by fusing image and text semantics

doi:10.11925/infotech.2096-3467.2021-1245

Data Analysis and Knowledge Discovery

Shi Yunmei, Yuan Bo, Zhang Le, Lv Xueqiang

(Beijing Key Laboratory of Internet Culture and Digital Dissemination Research, Beijing Information Science and Technology University, Beijing 100101, China) (School of Computer Science, Beijing Information Science and Technology University, Beijing 100101, China)

Abstract

[Objective] To address the proliferation of fake reviews posted by paid posters (the "Internet water army") on e-commerce websites, this paper presents IMTS, a fake review detection method for Chinese e-commerce reviews that fuses image information with text semantics.

[Methods] IMTS extracts features from the review text with both a text convolutional neural network (TextCNN) and the pre-trained BERT model, yielding the corresponding feature vectors. It then incorporates reviewer features: the semantic features of the review text are concatenated with the output features of the reviewer ID, which strengthens the model's capture of overall semantic information. Next, a residual network (ResNet) extracts visual features from the images that users post with their reviews. Finally, the text features and visual features are fused multimodally to detect fake reviews.
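
The fusion step in the method can be pictured as concatenating per-modality feature vectors before a classifier. The sketch below is only an illustration under stated assumptions: the abstract does not give dimensions, weights, or the exact fusion operator, so the vector sizes, the names `fuse_features` and `predict_fake_probability`, and the single logistic output layer are all hypothetical. Tiny hand-written vectors stand in for real TextCNN, BERT, reviewer-ID, and ResNet outputs.

```python
import math
from typing import List, Sequence


def fuse_features(*feature_vectors: Sequence[float]) -> List[float]:
    """Concatenate per-modality feature vectors into one fused vector."""
    fused: List[float] = []
    for vector in feature_vectors:
        fused.extend(vector)
    return fused


def predict_fake_probability(fused: Sequence[float],
                             weights: Sequence[float],
                             bias: float) -> float:
    """Hypothetical logistic output layer over the fused vector."""
    if len(fused) != len(weights):
        raise ValueError("weights must match the fused feature length")
    logit = sum(x * w for x, w in zip(fused, weights)) + bias
    return 1.0 / (1.0 + math.exp(-logit))


# Placeholder vectors standing in for real model outputs (assumed sizes).
textcnn_features = [0.2, -0.1]       # would come from TextCNN
bert_features = [0.5, 0.3, -0.4]     # would come from BERT
reviewer_features = [1.0]            # would come from a reviewer-ID embedding
image_features = [0.7, -0.2]         # would come from ResNet

fused_vector = fuse_features(textcnn_features, bert_features,
                             reviewer_features, image_features)
weights = [0.1] * len(fused_vector)  # untrained, illustrative weights
probability = predict_fake_probability(fused_vector, weights, bias=0.0)
```

In a trained model the weights would be learned jointly with the feature extractors; here they are fixed constants, so the resulting probability carries no meaning beyond showing the data flow.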

[Results] On a self-built multimodal Chinese fake review dataset, IMTS achieves 96.36% accuracy, 96.35% recall, and a 96.35% F1 score.
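
For readers unfamiliar with the reported metrics, the following minimal sketch shows how accuracy, recall, and F1 are computed for a binary classifier. It assumes label 1 means "fake" and treats it as the positive class; the abstract does not state which averaging scheme the authors used, so this is not necessarily their evaluation procedure. The function name `binary_metrics` and the toy labels are hypothetical.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int],
                   y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall, and F1 with label 1 as the positive class."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy labels for illustration only (1 = fake review, 0 = genuine review).
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
metrics = binary_metrics(y_true, y_pred)
```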

[Limitations] Owing to limited computing power, the dataset used in this paper is small. In addition, because the text processing stage relies on the pre-trained BERT model, the time cost is high when computing over large-scale data.

[Conclusions] Supplementing the features of fake review text through a multimodal approach and feature fusion is effective for detecting fake reviews, and it improves overall detection accuracy.

Key words: Fake review, Multimodal, Text, Image, BERT

Published: 2022-03-29

CLC number: TP393

Cite this article:

Shi Yunmei, Yuan Bo, Zhang Le, Lv Xueqiang. IMTS: Fake review detection method by fusing image and text semantics [J]. Data Analysis and Knowledge Discovery, doi:10.11925/infotech.2096-3467.2021-1245.

Link to this article:

https://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/10.11925/infotech.2096-3467.2021-1245 or https://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/Y0/V/I/1
