Data Analysis and Knowledge Discovery  2017, Vol. 1 Issue (7): 44-51     https://doi.org/10.11925/infotech.2096-3467.2017.0479
Special Issue of the First Symposium on "Data Analysis and Knowledge Discovery" (I)
Detecting Online Rumors with Sentiment Analysis*
Shou Huanrong, Deng Shuqing, Xu Jian
School of Information Management, Sun Yat-Sen University, Guangzhou 510006, China
Abstract

[Objective] This paper proposes a method that uses sentiment analysis to automatically identify rumors in specific domains. [Methods] We first define high-quality and low-quality information sources. Assuming that information from high-quality sources is more reliable, we use lexicon-based sentiment analysis to quantify the difference in sentiment that the two kinds of sources express toward a given object. Information from a low-quality source is judged to be a rumor if this difference exceeds a preset threshold. [Results] We applied the method to two domains, "food and wellness" and "medicine and health". It correctly identified 23 of 30 suspected rumor cases, an accuracy of 76.67%. For rumor prediction, the F-value was 83.34%, recall 71.42%, and precision 100%; for non-rumor prediction, the F-value was 72.73%, recall 100%, and precision 57.14%. [Limitations] Data from the different information sources were not extracted automatically, and the number of manually collected rumors per case was limited. [Conclusions] The proposed sentiment-analysis-based method is effective for identifying specific types of rumors.

Keywords: Sentiment Analysis; Sentiment Lexicon; Rumor Detection; Rumor Identification
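Received: 2017-05-27      Published: 2017-07-12
CLC number: G350
Funding: *This paper is one of the research outcomes of the National Social Science Fund of China project "Sentiment Analysis of User Reviews and Its Application in Competitive Intelligence Services" (No. 11CTQ022) and the Guangdong Province Science and Technology Special Project "Content-Based Scientific and Technical Literature Analysis Service Platform" (No. 2016B030303003).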
Cite this article:
Shou Huanrong, Deng Shuqing, Xu Jian. Detecting Online Rumors with Sentiment Analysis[J]. Data Analysis and Knowledge Discovery, 2017, 1(7): 44-51.
Link to this article:
https://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/10.11925/infotech.2096-3467.2017.0479      or      https://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/Y2017/V1/I7/44
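Figure: Overall framework of the sentiment-analysis-based rumor identification method
Figure: Workflow for computing the sentiment value of suspected rumor text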
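The figures themselves are not reproduced on this page, so the following is only a minimal sketch of the idea stated in the abstract: score texts with a sentiment lexicon, compare high-quality and low-quality sources, and flag a rumor when the difference exceeds a threshold. The toy English lexicon, the negation rule, the whitespace tokenization, the function names and the threshold value are all assumptions made for illustration; they are not the authors' lexicon, rules or parameters.

```python
from statistics import mean

# Hypothetical toy lexicon: word -> polarity in [-1, 1].
# The lexicon and weights actually used in the paper are not given here.
LEXICON = {"safe": 1.0, "healthy": 0.8, "beneficial": 0.6,
           "harmful": -0.8, "toxic": -1.0, "deadly": -1.0}
NEGATORS = {"not", "no", "never"}


def sentiment_score(text: str) -> float:
    """Average polarity of the lexicon words in a whitespace-tokenized text.

    A negator directly before a sentiment word flips its sign.
    Returns 0.0 when the text contains no lexicon word.
    """
    tokens = text.lower().split()
    scores = []
    for i, tok in enumerate(tokens):
        if tok in LEXICON:
            polarity = LEXICON[tok]
            if i > 0 and tokens[i - 1] in NEGATORS:
                polarity = -polarity
            scores.append(polarity)
    return mean(scores) if scores else 0.0


def is_rumor(high_quality_texts, low_quality_texts, threshold=1.0):
    """Flag the low-quality claim as a rumor when the mean sentiment of the
    two source groups toward the same object differs by more than threshold.
    The threshold of 1.0 is an arbitrary illustrative value."""
    high = mean(sentiment_score(t) for t in high_quality_texts)
    low = mean(sentiment_score(t) for t in low_quality_texts)
    return abs(high - low) > threshold


# Made-up example texts about the same object.
high = ["msg is safe in normal amounts", "msg is not harmful"]
low = ["msg is toxic and deadly"]
print(is_rumor(high, low))
```

For Chinese text, a word segmenter and a Chinese sentiment lexicon would take the place of `split()` and the toy dictionary; which ones the authors used is not stated on this page.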
                        Actually a rumor    Actually not a rumor
Predicted rumor         15                  0
Predicted not a rumor   6                   8
Table: Rumor identification results
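                        Actually a rumor    Actually not a rumor
Predicted rumor         A                   B
Predicted not a rumor   C                   D
Table: Evaluation matrix for rumor detection classification performance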
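The page gives the cell labels but not the formulas built on them. Assuming the standard definitions of precision, recall and F1 for each class (an assumption, since the paper's own equations are not reproduced here), the six metrics can be written in terms of A, B, C and D as:

```latex
\begin{align}
P_r &= \frac{A}{A+B}, & R_r &= \frac{A}{A+C}, & F1_r &= \frac{2\,P_r R_r}{P_r + R_r},\\
P_n &= \frac{D}{C+D}, & R_n &= \frac{D}{B+D}, & F1_n &= \frac{2\,P_n R_n}{P_n + R_n}.
\end{align}
```

Here the subscript r denotes the rumor class and n the non-rumor class.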
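Metric                          Value
Pr  (precision, rumor)          100%
Rr  (recall, rumor)             71.42%
F1r (F-value, rumor)            83.34%
Pn  (precision, non-rumor)      57.14%
Rn  (recall, non-rumor)         100%
F1n (F-value, non-rumor)        72.73%
Table: Evaluation results for rumor detection classification performance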
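As a cross-check, the reported metrics can be recomputed from the four cells of the rumor identification results table. The sketch below assumes the standard precision, recall and F1 definitions; the helper name `prf` is made up for this example.

```python
def prf(tp: int, fp: int, fn: int):
    """Return (precision, recall, F1) for one class, guarding against
    division by zero."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Cells taken from the rumor identification results table.
A, B, C, D = 15, 0, 6, 8

rumor = prf(tp=A, fp=B, fn=C)      # rumor class
non_rumor = prf(tp=D, fp=C, fn=B)  # non-rumor class

for name, (p, r, f) in (("rumor", rumor), ("non-rumor", non_rumor)):
    print(f"{name}: P={p:.2%} R={r:.2%} F1={f:.2%}")
# rumor: P=100.00% R=71.43% F1=83.33%
# non-rumor: P=57.14% R=100.00% F1=72.73%
```

The recomputed rumor recall and F1 (71.43% and 83.33%) differ from the reported 71.42% and 83.34% only in the last digit, which looks like a rounding difference. Note also that the four cells sum to 29, while the abstract refers to 30 suspected cases; the page does not explain the gap.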