Data Analysis and Knowledge Discovery  2021, Vol. 5 Issue (2): 32-42     https://doi.org/10.11925/infotech.2096-3467.2020.1027
Identifying Leaders and Dissemination Paths of Public Opinion
Xu Yabin1,2, Sun Qiutian2
1Beijing Key Laboratory of Internet Culture and Digital Dissemination Research, Beijing Information Science and Technology University, Beijing 100101, China
2School of Computer, Beijing Information Science and Technology University, Beijing 100101, China
Abstract

[Objective] This study aims to support effective supervision of social networks and, to some extent, to control and intervene in the spread and evolution of public opinion. [Methods] We propose OLMT, an opinion leader mining model that combines topological-potential-based popularity, dissemination force, and attention degree, so that opinion leaders are identified from more angles and more objectively. We also modify the Transformer model to build MF-Transformer, a model for predicting dissemination behavior in social networks; its high parallelism and attention mechanism allow more efficient and accurate prediction of opinion leaders' retweeting behavior. [Results] By combining the opinion leader mining results with the behavior prediction results, the method predicts the key dissemination paths formed by opinion leaders during the spread of public opinion. The recall and precision of the predictions reached 92.17% and 99.07%, respectively, clearly higher than those of the compared methods. [Limitations] The experiments used Sina Weibo datasets for specific public opinion events; datasets from Twitter and other platforms were not examined. [Conclusions] The proposed opinion leader mining model and dissemination behavior prediction model identify opinion leaders more accurately and effectively predict the key paths in the spread of public opinion.

Key words: Public Opinion; Opinion Leader; Social Network Behavior Prediction; Dissemination Paths Identification
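
The combination step described in the abstract, joining the mined opinion leaders with the predicted retweeting behavior to obtain key dissemination paths, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `key_paths`, the edge representation (upstream user, downstream user), and the rule that a path ends at the first node without a leader successor are all assumptions made for illustration.

```python
from collections import defaultdict


def key_paths(source, leaders, predicted_retweets):
    """Enumerate simple paths that start at `source` and pass only through opinion leaders.

    `predicted_retweets` is an iterable of (upstream, downstream) pairs, meaning the
    downstream user is predicted to retweet the upstream user (hypothetical format).
    """
    allowed = set(leaders) | {source}
    graph = defaultdict(list)
    for upstream, downstream in predicted_retweets:
        # Keep only edges whose two endpoints are the source or opinion leaders.
        if upstream in allowed and downstream in allowed:
            graph[upstream].append(downstream)

    paths = []

    def dfs(node, path):
        extended = False
        for nxt in graph[node]:
            if nxt not in path:  # skip cycles so every path is simple
                extended = True
                dfs(nxt, path + [nxt])
        # Record a path only when it cannot be extended and has at least one edge.
        if not extended and len(path) > 1:
            paths.append(path)

    dfs(source, [source])
    return paths


edges = [("A", "B"), ("B", "C"), ("A", "D"), ("D", "E"), ("C", "A")]
print(key_paths("A", {"B", "C", "D"}, edges))
```

In this toy input, user E is not an opinion leader, so the branch through D stops at D rather than continuing to E.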
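Received: 2020-10-21      Published: 2020-11-24
CLC number (ZTFLH):  TP393
Funding: *National Natural Science Foundation of China (61672101); Open Project of the Beijing Key Laboratory of Internet Culture and Digital Dissemination Research (ICDDXN004); Open Project of the Key Laboratory of Information Network Security, Ministry of Public Security (C18601)
Corresponding author: Xu Yabin ORCID:0000-0003-2727-3773     E-mail: xyb@bistu.edu.cn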
Cite this article:
Xu Yabin, Sun Qiutian. Identifying Leaders and Dissemination Paths of Public Opinion. Data Analysis and Knowledge Discovery, 2021, 5(2): 32-42.
Link to this article:
https://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/10.11925/infotech.2096-3467.2020.1027      or      https://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/Y2021/V5/I2/32
Fig.1  Overall framework of the MF-Transformer-based dissemination behavior prediction model
Fig.2  Implementation mechanism of Self-Attention
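
As background for Fig.2, the following is a minimal NumPy sketch of standard scaled dot-product self-attention as used in the original Transformer. It does not reproduce the modifications that turn the Transformer into MF-Transformer, which are not described on this page; the function name, the matrix shapes, and the random inputs are assumptions chosen only for illustration.

```python
import numpy as np


def self_attention(x, w_q, w_k, w_v):
    """Standard scaled dot-product self-attention over a sequence of feature vectors.

    x: (n, d_in) input rows; w_q, w_k, w_v: (d_in, d_k) projection matrices.
    Returns the attended outputs (n, d_k) and the attention weights (n, n).
    """
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])
    # Subtract the row maximum before exponentiating for numerical stability.
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights


rng = np.random.default_rng(0)
x = rng.normal(size=(4, 15))  # 4 samples with 15 features each (illustrative)
w_q, w_k, w_v = (rng.normal(size=(15, 8)) for _ in range(3))
out, attn = self_attention(x, w_q, w_k, w_v)
print(out.shape, attn.sum(axis=-1))
```

Each row of the attention weights sums to 1, so every output row is a convex combination of the value vectors.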
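| Name | Meaning | How it is obtained |
| --- | --- | --- |
| userfols | Number of followers of the user | Crawled from the user's basic profile |
| userinfl | User influence | Computed by the opinion leader mining algorithm OLMT |
| usercati | User activity | Weighted sum of the numbers of original posts, retweets, and comments |
| proauth | Verification status of the upstream user | Crawled; verified users are recorded as 1, unverified users as 0 |
| averretw | Average number of times the upstream user is retweeted | Total number of retweets of the upstream user's historical posts divided by the number of historical posts |
| commmoti | Dissemination enthusiasm of the user | Number of posts the user recently retweeted divided by the total number of posts the user published |
| textleng | Text length | Length of the post text whose retweeting is to be predicted |
| whetherpict | Whether a picture is included | Determined from the post text; 1 if included, 0 otherwise |
| whetherurl | Whether a URL is included | Determined from the post text; 1 if included, 0 otherwise |
| publishtime | Time slot in which the text was published | Crawled publication time; for example, 5:23 is recorded as 5 |
| publishacti | User activity in the publication time slot | Weighted sum of the numbers of original posts, retweets, and comments within the time slot |
| textinte | User's interest in the post text | See Section 4.2 for the calculation |
| forwardfreq | Frequency of retweeting the upstream user | Number of the upstream user's posts retweeted by the user divided by the total number of posts the user retweeted |
| transpower | Dissemination force of the source | See Formula (2) for the calculation |
| intersimi | Interest similarity with the upstream user | See Section 4.2 for the calculation |

Table 1  Dissemination behavior features and how they are obtained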
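
Several features in Table 1 are simple weighted sums or ratios, and a short Python sketch may make their definitions concrete. The weights in `activity`, the zero-denominator fallback in `safe_ratio`, and all function and field names are assumptions for illustration; the page does not give the weights or the authors' code.

```python
from dataclasses import dataclass


@dataclass
class PostCounts:
    original_posts: int
    retweets: int
    comments: int


def activity(counts, weights=(0.5, 0.3, 0.2)):
    """Weighted sum used for usercati / publishacti (the weights are hypothetical)."""
    w_orig, w_retw, w_comm = weights
    return (w_orig * counts.original_posts
            + w_retw * counts.retweets
            + w_comm * counts.comments)


def safe_ratio(numerator, denominator):
    """Ratio used for averretw, commmoti, and forwardfreq; returns 0.0 when undefined."""
    return numerator / denominator if denominator else 0.0


def publish_hour(hhmm):
    """publishtime: keep only the hour, so '5:23' becomes 5."""
    return int(hhmm.split(":")[0])


counts = PostCounts(original_posts=10, retweets=20, comments=30)
features = {
    "usercati": activity(counts),
    "averretw": safe_ratio(120, 40),    # total times retweeted / historical posts
    "commmoti": safe_ratio(5, 50),      # recent retweets / total posts published
    "forwardfreq": safe_ratio(3, 20),   # retweets of upstream user / all retweets
    "publishtime": publish_hour("5:23"),
}
print(features)
```

Returning 0.0 for an empty denominator is one possible convention for users with no history; other choices, such as dropping the sample, are equally plausible.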
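Fig.3  Accuracy comparison of different mining algorithms
Fig.4  Coverage comparison of different mining algorithms
Fig.5  Performance comparison of dissemination prediction models
Fig.6  Relationship between path length and the number of paths and coverage capability
Fig.7  Prediction results of public opinion dissemination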