Data Analysis and Knowledge Discovery  2021, Vol. 5 Issue (5): 21-29    DOI: 10.11925/infotech.2096-3467.2020.0884
Detecting Social Media Fake News with Semantic Consistency Between Multi-modal Contents
Zhang Guobiao1,2, Li Jie3
1School of Information Management, Wuhan University, Wuhan 430072, China
2Institute for Information Retrieval and Knowledge Mining, Wuhan University, Wuhan 430072, China
3School of Sociology, Soochow University, Suzhou 215000, China
Abstract  

[Objective] This study aims to detect fake news on social media at an early stage and curb the spread of misinformation and disinformation. [Methods] Based on the features of news images and texts, we mapped the images to semantic tags and calculated the semantic consistency between images and texts. We then constructed a model to detect fake news and evaluated it on the FakeNewsNet dataset. [Results] The model reached F1 values of up to 0.775 on the PolitiFact data and 0.879 on the GossipCop data. [Limitations] Because existing annotation methods for image semantics are limited, we could not describe image contents accurately, which in turn limits the accuracy of the calculated semantic consistency. [Conclusions] The constructed model can effectively detect fake news on social media.
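The abstract does not give the formula used for semantic consistency. As a minimal sketch of the general idea only, the snippet below scores how well a set of image tags matches the words of a news text by averaging, over the tags, the best cosine similarity to any text word. The function names, the scoring rule, and the toy three-dimensional vectors are all assumptions made for illustration; they are not taken from the paper, which would use learned word embeddings and tags predicted by an image classifier.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def semantic_consistency(image_tags, text_tokens, vectors):
    """Hypothetical score: mean over image tags of the best match among text tokens."""
    tags = [t for t in image_tags if t in vectors]
    tokens = [w for w in text_tokens if w in vectors]
    if not tags or not tokens:
        return 0.0  # nothing to compare
    best = [max(cosine(vectors[t], vectors[w]) for w in tokens) for t in tags]
    return sum(best) / len(best)


# Toy vectors, invented for this example (a stand-in for real word embeddings).
TOY_VECTORS = {
    "dog": [0.9, 0.1, 0.0],
    "puppy": [0.8, 0.2, 0.1],
    "election": [0.0, 0.1, 0.9],
    "vote": [0.1, 0.0, 0.8],
}

consistent = semantic_consistency(["dog"], ["puppy"], TOY_VECTORS)
inconsistent = semantic_consistency(["dog"], ["election", "vote"], TOY_VECTORS)
print(consistent > inconsistent)
```

Under this scoring rule, an image tagged "dog" attached to a text about an election receives a much lower score than the same image attached to a text about a puppy, which is the kind of signal the model feeds to its classifier alongside the text and image features.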

Key words: Fake News Detection; Social Media; Multi-modal Feature Fusion; Semantic Consistency; Deep Learning
Received: 08 September 2020      Published: 24 November 2020
ZTFLH:  TP393  
Fund: This work was supported by the Soochow University 2020 Humanities and Social Sciences Excellent Academic Team Project (NH33711520).
Corresponding Authors: Li Jie     E-mail: allison_lijie@163.com

Cite this article:

Zhang Guobiao, Li Jie. Detecting Social Media Fake News with Semantic Consistency Between Multi-modal Contents. Data Analysis and Knowledge Discovery, 2021, 5(5): 21-29.

URL:

http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.2096-3467.2020.0884     OR     http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2021/V5/I5/21

Figure: An Example of Image and Text Semantic Inconsistency of Fake News
Figure: Image Label Mapping Process
Figure: Social Media Fake News Detection Model Based on Multi-modal Feature Fusion
Split             PolitiFact          GossipCop
                  Fake     True       Fake      True
Training set      2,466    3,190      14,737    17,922
Validation set      352      456       2,105     2,560
Test set            705      912       4,210     5,121
Total             3,523    4,558      21,052    25,603
Table: FakeNewsNet Experimental Data
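The page does not state the split ratio, but the counts in the table are consistent with roughly a 7:1:2 division into training, validation, and test sets. The short check below recomputes the proportions from the table values; the dictionary layout and the function name are chosen here for illustration.

```python
# Counts copied from the FakeNewsNet experimental data table.
COUNTS = {
    ("PolitiFact", "fake"): {"train": 2466, "val": 352, "test": 705},
    ("PolitiFact", "true"): {"train": 3190, "val": 456, "test": 912},
    ("GossipCop", "fake"): {"train": 14737, "val": 2105, "test": 4210},
    ("GossipCop", "true"): {"train": 17922, "val": 2560, "test": 5121},
}


def split_ratios(splits):
    """Return each split's share of the class total, rounded to two decimals."""
    total = sum(splits.values())
    return {name: round(n / total, 2) for name, n in splits.items()}


ratios = {key: split_ratios(splits) for key, splits in COUNTS.items()}
for key, value in ratios.items():
    print(key, value)
```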
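Parameter                                               Value
Epoch                                                   50
Dropout                                                 0.4
Batch_size                                              32
Activation function                                     ReLU
Learning rate                                           0.0001
Number of neurons in the image fully connected layer    200
Number of neurons in each MLP layer                     500, 200, 100
Table: Experimental Parameter Settings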
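For readers who want to reuse these settings, one way to carry them in code is a small configuration object. The sketch below is an assumption-laden illustration, not the authors' implementation: the class and function names are invented here, and the input width (400) and the two-unit output layer are hypothetical, since the page states neither.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HyperParams:
    """Values taken from the experimental parameter settings table."""
    epochs: int = 50
    dropout: float = 0.4
    batch_size: int = 32
    activation: str = "relu"
    learning_rate: float = 1e-4
    image_fc_units: int = 200
    mlp_units: tuple = (500, 200, 100)


def mlp_layer_shapes(input_dim, params, n_classes=2):
    """List the (inputs, outputs) shape of each dense layer in the MLP.

    input_dim and n_classes are assumptions; the source does not give them.
    """
    dims = [input_dim, *params.mlp_units, n_classes]
    return list(zip(dims[:-1], dims[1:]))


params = HyperParams()
shapes = mlp_layer_shapes(input_dim=400, params=params)  # 400 is hypothetical
print(shapes)
```

Freezing the dataclass keeps the published values from being changed by accident during experiments; variants can be created with `dataclasses.replace`.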
Feature type                    PolitiFact                             GossipCop
                                Accuracy  Precision  Recall  F1        Accuracy  Precision  Recall  F1
Text features                   0.761     0.768      0.773   0.753     0.836     0.810      0.821   0.815
Image features                  0.540     0.520      0.560   0.520     0.654     0.704      0.702   0.653
Semantic consistency features   0.520     0.450      0.524   0.480     0.564     0.530      0.545   0.548
Text and image features         0.782     0.784      0.813   0.770     0.857     0.827      0.838   0.836
All features                    0.791     0.792      0.803   0.775     0.883     0.864      0.853   0.879
EANN                            0.776     0.764      0.798   0.768     0.841     0.814      0.796   0.806
Table: Fake News Detection Results
Figure: Average Value of Semantic Consistency of Each CNN Model
Figure: Examples of News Text and Image Semantic Consistency