Data Analysis and Knowledge Discovery  2022, Vol. 6 Issue (2/3): 376-384     https://doi.org/10.11925/infotech.2096-3467.2021.0964
Identifying “Fake News” with Multi-Perspective Evidence Fusion*
Li Baozhen, Chen Ke
School of Information Engineering, Nanjing Auditing University, Nanjing 211815, China
Abstract

[Objective] This paper proposes a multi-perspective evidence fusion model for fake news classification. It aims to address the one-sided evidence and inaccurate classification that affect traditional single-perspective approaches. [Methods] We introduce the subjective logic model together with a measure of classification uncertainty for each perspective. Building on Dempster-Shafer evidence theory, we fuse the evidence from multiple perspectives using different weights, which yields the overall evidence and an overall measure of classification uncertainty. [Results] Experiments on two public data sets show that the accuracy and F1 score of the proposed model are markedly higher than those of traditional fake news classification models. [Limitations] The fused multi-perspective evidence contains some noise, which can occasionally reduce the accuracy of the results. [Conclusions] Multi-perspective evidence fusion can effectively improve the accuracy and robustness of fake news identification.

Keywords: Fake News; Multi-perspective; Evidence Fusion; Uncertainty
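The fusion step described in the Methods can be illustrated with a minimal sketch. It assumes that each perspective outputs a subjective-logic opinion (one belief mass per class plus an uncertainty mass, summing to 1), applies Shafer-style discounting as a stand-in for the per-perspective weights, and merges two opinions with the reduced Dempster combination rule. The weight values, the discounting scheme, and the function names `discount` and `combine` are assumptions for illustration; this page does not specify how the paper computes or applies its weights.

```python
from typing import List, Tuple

# An opinion is (belief mass per class, uncertainty mass); the masses sum to 1.
Opinion = Tuple[List[float], float]


def discount(opinion: Opinion, weight: float) -> Opinion:
    """Shafer-style discounting (an assumed weighting scheme, not taken from the paper).

    Beliefs are scaled by `weight` in [0, 1]; the removed mass becomes uncertainty.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    beliefs, _ = opinion
    scaled = [weight * b for b in beliefs]
    return scaled, 1.0 - sum(scaled)


def combine(first: Opinion, second: Opinion) -> Opinion:
    """Reduced Dempster rule for opinions with singleton beliefs plus uncertainty."""
    b1, u1 = first
    b2, u2 = second
    if len(b1) != len(b2):
        raise ValueError("both opinions must cover the same classes")
    n = len(b1)
    # Conflict: mass that the two perspectives assign to different classes.
    conflict = sum(b1[i] * b2[j] for i in range(n) for j in range(n) if i != j)
    if conflict >= 1.0:
        raise ValueError("opinions are in total conflict and cannot be combined")
    scale = 1.0 / (1.0 - conflict)
    beliefs = [scale * (b1[k] * b2[k] + b1[k] * u2 + b2[k] * u1) for k in range(n)]
    uncertainty = scale * u1 * u2
    return beliefs, uncertainty


# Hypothetical opinions from two perspectives over two classes (indices 0 and 1).
view_a: Opinion = ([0.7, 0.1], 0.2)
view_b: Opinion = ([0.2, 0.3], 0.5)

# Hypothetical weights: perspective A is trusted more than perspective B.
fused_beliefs, fused_uncertainty = combine(discount(view_a, 0.9), discount(view_b, 0.6))
predicted_class = max(range(len(fused_beliefs)), key=fused_beliefs.__getitem__)
print(fused_beliefs, fused_uncertainty, predicted_class)
```

In this toy case the more confident, more heavily weighted perspective dominates: the fused opinion favours class 0, and the fused uncertainty is lower than that of either discounted input.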
Received: 2021-08-31      Published: 2022-04-14
CLC number: TP391
Funding: *National Natural Science Foundation of China (71673122); *National Natural Science Foundation of China (72074117); one of the research outcomes of the Postgraduate Research & Practice Innovation Program of Jiangsu Province (SJCX21_0889)
Corresponding author: Li Baozhen, ORCID: 0000-0002-6160-1390     E-mail: bzli@nau.edu.cn
Cite this article:
Li Baozhen, Chen Ke. Identifying “Fake News” with Multi-Perspective Evidence Fusion. Data Analysis and Knowledge Discovery, 2022, 6(2/3): 376-384.
Link to this article:
https://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/10.11925/infotech.2096-3467.2021.0964      or      https://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/Y2022/V6/I2/3/376
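Fig.1  Flowchart of multi-perspective evidence fusion classification for fake news identification
Fig.2  Typical examples of the Dirichlet distribution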
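A small sketch may help connect the Dirichlet distribution to the uncertainty measure. It assumes the mapping commonly used in subjective logic and evidential deep learning: non-negative evidence e_k for each of K classes gives Dirichlet parameters alpha_k = e_k + 1, belief masses b_k = e_k / S, and uncertainty u = K / S, where S is the sum of the alpha_k. Whether the paper uses exactly this parameterisation is not stated on this page, and the function name `opinion_from_evidence` is hypothetical.

```python
from typing import List, Tuple


def opinion_from_evidence(evidence: List[float]) -> Tuple[List[float], float, List[float]]:
    """Map per-class evidence to belief masses, uncertainty, and expected probabilities.

    Assumed mapping: alpha_k = e_k + 1, S = sum(alpha), b_k = e_k / S, u = K / S.
    """
    if not evidence or any(e < 0 for e in evidence):
        raise ValueError("evidence must be a non-empty list of non-negative values")
    num_classes = len(evidence)
    alpha = [e + 1.0 for e in evidence]      # Dirichlet concentration parameters
    strength = sum(alpha)                    # Dirichlet strength S
    beliefs = [e / strength for e in evidence]
    uncertainty = num_classes / strength
    expected_prob = [a / strength for a in alpha]  # mean of the Dirichlet
    return beliefs, uncertainty, expected_prob


# Illustrative evidence values (made up): none at all, then strong evidence for class 0.
print(opinion_from_evidence([0.0, 0.0]))
print(opinion_from_evidence([8.0, 0.0]))
```

With no evidence the uncertainty is 1 and the expected class probabilities are uniform; with evidence (8, 0) the uncertainty drops to 0.2 while the belief mass for class 0 rises to 0.8.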
Table 1  Data set statistics

| Statistic | PolitiFact | GossipCop |
|---|---|---|
| Real news articles | 400 | 500 |
| Fake news articles | 400 | 500 |
| Total | 800 | 1,000 |
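Table 2  Comparison of the MPEFC model with the baselines

| Model | PolitiFact AUC | PolitiFact F1 | GossipCop AUC | GossipCop F1 |
|---|---|---|---|---|
| TF-IDF+SVM | 0.67 | 0.58 | 0.83 | 0.87 |
| Doc2Vec+Bagging | 0.63 | 0.61 | 0.79 | 0.82 |
| Keras+LSTM | 0.82 | 0.81 | 0.65 | 0.70 |
| MPEFC | 0.99 | 0.98 | 0.99 | 0.99 |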
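Table 3  Examples of news samples misclassified by the MPEFC model

| Data set | News sample | Predicted label | True label |
|---|---|---|---|
| GossipCop | Selena Gomez Is Going To Keep Her Blonde Hair | 0 | 1 |
| GossipCop | Selena Gomez and Justin Bieber baby news | 1 | 0 |
| PolitiFact | JUST IN: John Kerry Facing Prison | 1 | 0 |
| PolitiFact | American News is under construction | 1 | 0 |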
Fig.3  Accuracy of the MPEFC model and the baseline models on data sets of different sizes