Data Analysis and Knowledge Discovery  2023, Vol. 7 Issue (11): 79-87     https://doi.org/10.11925/infotech.2096-3467.2022.1144
Research Paper
Identifying Fake News with External Knowledge and User Interaction Features
Liu Shuai¹, Fu Lifang²
¹College of Engineering, Northeast Agricultural University, Harbin 150038, China
²College of Letters and Science, Northeast Agricultural University, Harbin 150038, China
Abstract

[Objective] This paper proposes a multidimensional-data classification model that incorporates external knowledge features and user interaction features to improve the efficiency and accuracy of detecting fake news spreading on social media. [Methods] First, we extracted the background knowledge of the news text. Then, we introduced external knowledge through the Wikipedia knowledge graph to check whether the news content is consistent with the existing knowledge system. Third, we analyzed user interactions along the propagation chain according to the “similarity effect” from psychology. Finally, we modified the edge weights of the graph convolutional network so that they better reflect the mutual influence between users. The resulting model fuses external knowledge, news content, propagation chain features, and user interaction relationships. [Results] We evaluated the model on two public datasets, Twitter15 and Twitter16, against five comparable models; its accuracy reached 0.901 and 0.927, respectively. [Limitations] We did not consider other features, such as the knowledge information hidden in the additional news content or language expression. The model’s interpretability also needs further improvement. [Conclusions] By integrating news content, external knowledge, and user interaction characteristics of the propagation chain, the proposed model can effectively improve the accuracy of fake news identification.
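
The edge-weight idea in the Methods summary can be illustrated with a small sketch. The abstract does not give the paper's actual weighting formula, so this is only an assumption-laden illustration: edge weights are taken to be the cosine similarity between two users' feature vectors (a stand-in for the "similarity effect"), followed by the standard symmetric-normalized GCN propagation step. All function names and the toy feature vectors are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def similarity_adjacency(features, edges):
    """Weighted adjacency matrix with self-loops of weight 1.

    Assumption: an edge's weight is the (non-negative) cosine similarity
    of the two users it connects; the paper's own formula may differ.
    """
    n = len(features)
    adj = [[0.0] * n for _ in range(n)]
    for i in range(n):
        adj[i][i] = 1.0
    for i, j in edges:
        w = max(cosine(features[i], features[j]), 0.0)
        adj[i][j] = adj[j][i] = w
    return adj

def normalize(adj):
    """Symmetric normalization D^{-1/2} A D^{-1/2} used by standard GCNs."""
    deg = [sum(row) for row in adj]  # >= 1 thanks to the self-loops
    n = len(adj)
    return [[adj[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]

def gcn_layer(adj_norm, features, weight):
    """One propagation step: ReLU(A_hat @ X @ W), written with plain lists."""
    n, d_in, d_out = len(features), len(weight), len(weight[0])
    agg = [[sum(adj_norm[i][k] * features[k][f] for k in range(n))
            for f in range(d_in)] for i in range(n)]
    return [[max(sum(agg[i][f] * weight[f][o] for f in range(d_in)), 0.0)
             for o in range(d_out)] for i in range(n)]

# Toy propagation chain: users 0 and 1 are similar, user 2 is not.
users = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
edges = [(0, 1), (1, 2)]
adj = similarity_adjacency(users, edges)
hidden = gcn_layer(normalize(adj), users, [[1.0, 0.0], [0.0, 1.0]])
```

In this toy chain the edge between the two similar users receives a larger weight than the edge to the dissimilar user, so the similar neighbor contributes more to the aggregated representation than it would under a binary adjacency matrix.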

Key words: Fake News Detection; Feature Engineering; Online Social Media; Knowledge Graph
Received: 2022-11-01      Published: 2023-04-28
Chinese Library Classification (ZTFLH):  G250 TP393
Corresponding author: Fu Lifang, ORCID: 0000-0003-2298-2378, E-mail: lifangfu@neau.edu.cn
Cite this article:
Liu Shuai, Fu Lifang. Identifying Fake News with External Knowledge and User Interaction Features[J]. Data Analysis and Knowledge Discovery, 2023, 7(11): 79-87.
Link to this article:
https://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/10.11925/infotech.2096-3467.2022.1144      or      https://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/Y2023/V7/I11/79
Fig.1  Structural framework of the MFND model
Fig.2  Knowledge extraction results
Fig.3  Entity linking results
Item              Twitter15    Twitter16
Source tweets     742          412
Labeled true      372          205
Labeled fake      370          207
Users             190,868      115,036
Table 1  Structure of the datasets
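Stage                 Parameter                        Value
Preprocessing         Users per tweet                  40
                      Maximum text length              30
                      Knowledge embedding dimension    300
Feature extraction    CNN output dimension             32
                      Filter size                      3
                      BERT output dimension            32
                      GRU output dimension             32
                      GCN output dimension             32
                      Number of GCN layers             2
Training              Optimizer                        Adam
                      Epochs                           100
                      Learning rate                    0.001
Table 2  Parameter settings of the model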
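
The settings in Table 2 could be collected in a single configuration object, as in the sketch below. The field names and the `pad_or_truncate` helper are hypothetical; only the values come from the table. The assumption that "users per tweet" and "maximum text length" are enforced by truncating long inputs and padding short ones is ours, since the page does not describe the preprocessing in detail.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class MFNDConfig:
    """Hyperparameters from Table 2; field names are illustrative only."""
    # Preprocessing stage
    users_per_tweet: int = 40
    max_text_length: int = 30
    knowledge_embedding_dim: int = 300
    # Feature extraction stage
    cnn_output_dim: int = 32
    filter_size: int = 3
    bert_output_dim: int = 32
    gru_output_dim: int = 32
    gcn_output_dim: int = 32
    gcn_layers: int = 2
    # Training stage
    optimizer: str = "Adam"
    epochs: int = 100
    learning_rate: float = 0.001

def pad_or_truncate(items, length, pad_value):
    """Return exactly `length` items: cut extras, fill the tail with pad_value.

    Assumed preprocessing step, not confirmed by the source.
    """
    return list(items[:length]) + [pad_value] * max(0, length - len(items))

cfg = MFNDConfig()
# A short text is padded up to the maximum text length.
tokens = pad_or_truncate(["breaking", "news"], cfg.max_text_length, "<pad>")
# A long propagation chain (55 user ids here) is cut to the first 40 users.
chain = pad_or_truncate(list(range(55)), cfg.users_per_tweet, -1)
```

Freezing the dataclass keeps the reported settings from being changed by accident during an experiment run.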
Model     Twitter15                                    Twitter16
          F1-score  Recall  Precision  Accuracy        F1-score  Recall  Precision  Accuracy
DTC       0.495     0.481   0.496      0.495           0.562     0.537   0.575      0.561
SVM-TS    0.519     0.519   0.520      0.520           0.692     0.691   0.693      0.693
mGRU      0.510     0.515   0.515      0.555           0.556     0.562   0.560      0.661
CSI       0.717     0.687   0.699      0.699           0.630     0.631   0.632      0.661
GCAN      0.825     0.830   0.826      0.877           0.759     0.763   0.759      0.908
MFND      0.890     0.886   0.894      0.901           0.879     0.878   0.880      0.927
Table 3  Experimental results
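
To make the margins in Table 3 easier to read, the following sketch computes the absolute gain of MFND over GCAN, the strongest baseline in the table, for each metric. The scores are copied from the table; the dictionary layout and the `gains` helper are illustrative names, not part of the paper.

```python
# Reported scores from Table 3, in the order listed in METRICS.
METRICS = ("F1-score", "Recall", "Precision", "Accuracy")
RESULTS = {
    "Twitter15": {
        "GCAN": (0.825, 0.830, 0.826, 0.877),
        "MFND": (0.890, 0.886, 0.894, 0.901),
    },
    "Twitter16": {
        "GCAN": (0.759, 0.763, 0.759, 0.908),
        "MFND": (0.879, 0.878, 0.880, 0.927),
    },
}

def gains(dataset, baseline="GCAN", model="MFND"):
    """Absolute difference (model minus baseline) per metric, rounded to 3 places."""
    base, ours = RESULTS[dataset][baseline], RESULTS[dataset][model]
    return {m: round(o - b, 3) for m, b, o in zip(METRICS, base, ours)}

for name in RESULTS:
    print(name, gains(name))
```

On these figures the accuracy gain is 0.024 on Twitter15 and 0.019 on Twitter16, while the F1-score gain is larger (0.065 and 0.120), which suggests the improvement is more visible in the balance between precision and recall than in raw accuracy.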
Fig.4  Results of the ablation experiments
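Copyright © 2015 Editorial Office of Data Analysis and Knowledge Discovery
Address: No. 33 Beisihuan Xilu, Zhongguancun, Haidian District, Beijing 100190, China
Tel/Fax: (010)82626611-6626, 82624938
E-mail: jishu@mail.las.ac.cn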