Detecting Mis/Dis-information from Social Media with Semantic Enhancement
Wang Hao1,2, Gong Lijuan1,2, Zhou Zeyu1,2, Fan Tao1,2, Wang Yongsheng1,2
1School of Information Management, Nanjing University, Nanjing 210023, China
2Jiangsu Key Laboratory of Data Engineering and Knowledge Service, Nanjing 210233, China
Abstract [Objective] This paper builds an automated detection model to identify mis/dis-information on social media, aiming to balance speed and accuracy when processing massive data. [Methods] Classification models are the mainstream technique for detecting mis/dis-information, but most of them cannot extract deep semantic features from texts. We therefore took BFID (BERT False-Information-Detection), a model that uses text features only, as the benchmark and proposed two new methods that enhance it with fused semantic features. [Results] We evaluated the new models on data from Sina Weibo. The accuracy of BFID-SEN (BFID-Sentiment), which fuses sentiment features, increased by about 1.59 percentage points, and the accuracy of BFID-IMG (BFID-Image), which fuses image features, improved by 0.78 percentage points. [Limitations] The benefit of fused semantic enhancement is limited by the small corpus, the limited sentiment categories, and the limited multimodal disinformation training datasets. [Conclusions] The proposed methods identify false information from social media more effectively.
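The abstract does not say how the auxiliary features are fused with the text features. As a rough illustration only, the sketch below assumes the simplest option, fusion by concatenation followed by a logistic classification head. The vectors, weights, and the function names `fuse` and `predict_proba` are hypothetical stand-ins, not the BFID-SEN or BFID-IMG implementation.

```python
import math


def fuse(text_vec, extra_vec):
    """Concatenate text features with auxiliary (e.g. sentiment) features.

    Assumption: fusion by simple concatenation; the paper may fuse differently.
    """
    return list(text_vec) + list(extra_vec)


def predict_proba(features, weights, bias):
    """Logistic head over the fused vector: probability of the 'false' class."""
    if len(features) != len(weights):
        raise ValueError("features and weights must have the same length")
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical toy values, chosen only to make the example runnable.
text_vec = [0.2, -0.1, 0.4]            # stand-in for a BERT sentence embedding
sentiment_vec = [0.9, 0.1]             # stand-in for sentiment-class scores
weights = [0.5, -0.3, 0.8, 1.2, -0.7]  # stand-in for trained classifier weights

fused = fuse(text_vec, sentiment_vec)
p_false = predict_proba(fused, weights, bias=-0.5)
print(len(fused), round(p_false, 3))
```

In this sketch the text-only baseline corresponds to classifying `text_vec` alone, while the enhanced variant classifies the longer fused vector, so the classifier can weigh sentiment (or image) evidence alongside the text.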
Received: 31 August 2022
Published: 09 November 2022
Fund: National Natural Science Foundation of China (72074108); Fundamental Research Funds for the Central Universities (010814370113)
Corresponding Author: Zhou Zeyu, ORCID: 0000-0003-2757-2992, E-mail: mf20140111@smail.nju.edu.cn