Data Analysis and Knowledge Discovery  2021, Vol. 5 Issue (5): 21-29    DOI: 10.11925/infotech.2096-3467.2020.0884
Detecting Social Media Fake News with Semantic Consistency Between Multi-model Contents
Zhang Guobiao1,2, Li Jie3
1School of Information Management, Wuhan University, Wuhan 430072, China
2Institute for Information Retrieval and Knowledge Mining, Wuhan University, Wuhan 430072, China
3School of Sociology, Soochow University, Suzhou 215000, China
[Objective] This study aims to detect fake news on social media at an early stage and to curb the spread of misinformation and disinformation. [Methods] Drawing on the features of news images and texts, we mapped the images to semantic tags and calculated the semantic consistency between images and texts. We then constructed a model to detect fake news and evaluated it on the FakeNewsNet dataset. [Results] The model reached an F1 score of 0.775 on the PolitiFact data and 0.879 on the GossipCop data. [Limitations] Because existing annotation methods for image semantics are limited, image contents could not be described accurately, which in turn limits the accuracy of the computed semantic consistency. [Conclusions] The proposed model can effectively detect fake news on social media.
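The consistency step described in the abstract can be illustrated with a minimal sketch. It assumes that image tags and text tokens are both embedded as word vectors, averaged, and compared with cosine similarity; the abstract does not specify the exact measure, so this is one plausible reading rather than the authors' implementation. The `TOY_VECTORS` table and the function names are hypothetical, and a real system would load pretrained embeddings and obtain tags from an image classifier.

```python
import math

# Hypothetical toy word vectors (assumption); real systems would use
# pretrained embeddings and tags predicted by an image classifier.
TOY_VECTORS = {
    "president": [0.9, 0.1, 0.0],
    "speech": [0.8, 0.2, 0.1],
    "podium": [0.7, 0.3, 0.0],
    "suit": [0.6, 0.2, 0.2],
    "beach": [0.0, 0.1, 0.9],
    "surfboard": [0.1, 0.0, 0.8],
}


def mean_vector(words):
    """Average the vectors of known words; return None if none are known."""
    known = [TOY_VECTORS[w] for w in words if w in TOY_VECTORS]
    if not known:
        return None
    dim = len(known[0])
    return [sum(vec[i] for vec in known) / len(known) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 for zero vectors)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def semantic_consistency(text_tokens, image_tags):
    """Score how well the image tags agree with the text (assumed measure)."""
    text_vec = mean_vector(text_tokens)
    image_vec = mean_vector(image_tags)
    if text_vec is None or image_vec is None:
        return 0.0
    return cosine(text_vec, image_vec)


text = ["president", "speech"]
consistent = semantic_consistency(text, ["podium", "suit"])
inconsistent = semantic_consistency(text, ["beach", "surfboard"])
print(f"matching image: {consistent:.3f}, mismatched image: {inconsistent:.3f}")
```

In this toy setup the image tagged with `podium` and `suit` scores higher against the text than the one tagged with `beach` and `surfboard`, which is the kind of signal a downstream classifier could use as an extra feature.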

Keywords: Fake News Detection; Social Media; Multi-modal Feature Fusion; Semantic Consistency; Deep Learning
Received: 08 September 2020      Published: 24 November 2020
ZTFLH:  TP393  
Fund: This work is supported by the Soochow University 2020 Humanities and Social Sciences Excellent Academic Team Project (NH33711520).
Li Jie

Zhang Guobiao, Li Jie. Detecting Social Media Fake News with Semantic Consistency Between Multi-model Contents. Data Analysis and Knowledge Discovery, 2021, 5(5): 21-29.

An Example of Image and Text Semantic Inconsistency of Fake News
Image Label Mapping Process
Social Media Fake News Detection Model Based on Multi-modal Feature Fusion
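| Split | PolitiFact Fake | PolitiFact True | GossipCop Fake | GossipCop True |
|---|---|---|---|---|
| Training set | 2,466 | 3,190 | 14,737 | 17,922 |
| Validation set | 352 | 456 | 2,105 | 2,560 |
| Test set | 705 | 912 | 4,210 | 5,121 |
| Total | 3,523 | 4,558 | 21,052 | 25,603 |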
FakeNewsNet Experimental Data
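As a quick arithmetic check on the FakeNewsNet counts, the sketch below computes each split's share of the data and the proportion of fake items per source. The counts are copied from the table; the names `SPLITS`, `split_shares`, and `fake_share` are illustrative only.

```python
# (fake, true) counts per split, copied from the FakeNewsNet data table.
SPLITS = {
    "PolitiFact": {
        "train": (2466, 3190),
        "validation": (352, 456),
        "test": (705, 912),
    },
    "GossipCop": {
        "train": (14737, 17922),
        "validation": (2105, 2560),
        "test": (4210, 5121),
    },
}


def split_shares(source):
    """Fraction of all items in each split, rounded to three decimals."""
    totals = {name: fake + true for name, (fake, true) in SPLITS[source].items()}
    grand_total = sum(totals.values())
    return {name: round(n / grand_total, 3) for name, n in totals.items()}


def fake_share(source):
    """Fraction of items labeled fake across all splits of one source."""
    fake = sum(f for f, _ in SPLITS[source].values())
    total = sum(f + t for f, t in SPLITS[source].values())
    return round(fake / total, 3)


for source in SPLITS:
    print(source, split_shares(source), "fake share:", fake_share(source))
```

Both sources come out at roughly 70% training, 10% validation, and 20% test, with fake items making up a little under half of each source.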
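| Parameter | Value |
|---|---|
| Epoch | 50 |
| Dropout | 0.4 |
| Batch_size | 32 |
| Activation function | ReLU |
| Learning rate | 0.0001 |
| Number of neurons in the image fully connected layer | 200 |
| Number of neurons in each MLP layer | 500, 200, 100 |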
Experimental Parameter Settings
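The parameter settings can be captured in a small configuration object, sketched here under stated assumptions. The values come from the table, but the field names, the input dimension of 400, and the two-class output layer are hypothetical, since the page does not describe how the feature vectors are concatenated or how the classifier head is built.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    """Values from the parameter table; field names are illustrative."""
    epochs: int = 50
    dropout: float = 0.4
    batch_size: int = 32
    activation: str = "relu"
    learning_rate: float = 1e-4
    image_fc_units: int = 200
    mlp_units: tuple = (500, 200, 100)


def mlp_layer_shapes(input_dim, cfg, num_classes=2):
    """List (in, out) sizes of the dense layers, ending in a classifier.

    input_dim and num_classes are assumptions, not values from the source.
    """
    dims = [input_dim, *cfg.mlp_units, num_classes]
    return list(zip(dims[:-1], dims[1:]))


def count_parameters(shapes):
    """Weights plus biases for a stack of fully connected layers."""
    return sum(n_in * n_out + n_out for n_in, n_out in shapes)


def steps_per_epoch(n_examples, batch_size):
    """Number of mini-batches needed to see every example once."""
    return math.ceil(n_examples / batch_size)


cfg = TrainConfig()
shapes = mlp_layer_shapes(input_dim=400, cfg=cfg)  # 400 is a placeholder
print(shapes, count_parameters(shapes))
# PolitiFact training set size from the data table: 2466 + 3190 items.
print(steps_per_epoch(2466 + 3190, cfg.batch_size))
```

Keeping the settings in one immutable object makes it easier to log them alongside results and to rerun the same configuration on both data sources.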
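| Feature type | PolitiFact Accuracy | PolitiFact Precision | PolitiFact Recall | PolitiFact F1 | GossipCop Accuracy | GossipCop Precision | GossipCop Recall | GossipCop F1 |
|---|---|---|---|---|---|---|---|---|
| Text features | 0.761 | 0.768 | 0.773 | 0.753 | 0.836 | 0.810 | 0.821 | 0.815 |
| Image features | 0.540 | 0.520 | 0.560 | 0.520 | 0.654 | 0.704 | 0.702 | 0.653 |
| Semantic consistency features | 0.520 | 0.450 | 0.524 | 0.480 | 0.564 | 0.530 | 0.545 | 0.548 |
| Text and image features | 0.782 | 0.784 | 0.813 | 0.770 | 0.857 | 0.827 | 0.838 | 0.836 |
| All features | 0.791 | 0.792 | 0.803 | 0.775 | 0.883 | 0.864 | 0.853 | 0.879 |
| EANN | 0.776 | 0.764 | 0.798 | 0.768 | 0.841 | 0.814 | 0.796 | 0.806 |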
Fake News Detection Results
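For readers who want to recompute metrics of this kind, the sketch below derives accuracy, precision, recall, and F1 from predicted and true labels using macro averaging over the two classes. The averaging scheme is an assumption: the page does not say how the reported scores were averaged, and the toy labels are invented for illustration. Under macro averaging, F1 is the mean of the per-class F1 scores, so it need not equal the harmonic mean of the averaged precision and recall.

```python
def per_class_prf(tp, fp, fn):
    """Precision, recall, and F1 for one class from its confusion counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def macro_scores(y_true, y_pred, labels=("fake", "true")):
    """Accuracy plus macro-averaged precision, recall, and F1 (assumed scheme)."""
    pairs = list(zip(y_true, y_pred))
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    rows = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        rows.append(per_class_prf(tp, fp, fn))
    n = len(labels)
    return {
        "accuracy": accuracy,
        "precision": sum(r[0] for r in rows) / n,
        "recall": sum(r[1] for r in rows) / n,
        "f1": sum(r[2] for r in rows) / n,
    }


# Invented toy labels: 4 fake and 6 true items.
y_true = ["fake"] * 4 + ["true"] * 6
y_pred = ["fake", "fake", "fake", "true"] + ["true"] * 4 + ["fake"] * 2
scores = macro_scores(y_true, y_pred)
harmonic = 2 * scores["precision"] * scores["recall"] / (
    scores["precision"] + scores["recall"]
)
print(scores, "harmonic mean of macro P and R:", round(harmonic, 4))
```

On the toy labels, the macro F1 is slightly lower than the harmonic mean of the macro precision and recall, which shows how the two quantities can differ.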
Average Value of Semantic Consistency of Each CNN Model
Examples of News Text and Image Semantic Consistency
