Data Analysis and Knowledge Discovery  2022, Vol. 6 Issue (2/3): 396-408    DOI: 10.11925/infotech.2096-3467.2021.0800
Predicting Public Opinion Reversal Based on Evolution Analysis of Events and Improved KE-SMOTE Algorithm
Wang Nan1,2, Li Hairong1, Tan Shuru3
1School of Management Science and Information Engineering, Jilin University of Finance and Economics, Changchun 130117, China
2Institute of Economic Information Management, Jilin University of Finance and Economics, Changchun 130117, China
3College of Information Science and Engineering, Guilin University of Technology, Guilin 541006, China
Abstract  

[Objective] This paper aims to accurately predict the reversal of online public opinion. [Methods] First, we extracted features of public opinion events from their evolution characteristics and their development before the reversal point. Then, we used the improved KE-SMOTE algorithm, with an automatic optimization process, to balance an event set whose positive and negative samples were highly skewed. We then built a neural network ensemble learning model on the balanced event set. Finally, we tested the model on 30 trending public opinion events from 2021, discussed the causes of the inconsistent predictions, and proposed countermeasures and suggestions for avoiding public opinion reversal. [Results] The proposed model reached a prediction accuracy of 99.7% on the test sets, and all reversal events were predicted. [Limitations] As the interval between the occurrence and the reversal of public opinion events becomes much shorter, more research is needed to examine the proposed model with smaller data sets. [Conclusions] The new model can accurately identify public opinion reversal events in advance.

Key words: Public Opinion Reversal; KE-SMOTE Algorithm; Neural Network; Ensemble Learning; Countermeasure Research
Received: 05 August 2021      Published: 14 April 2022
ZTFLH:  G353  
Fund: 13th Five-year Plan for the Science and Technology Research Project of Education Department of Jilin Province (JJKH20210131KJ); 13th Five-year Plan for the Key Fund Project of Education Science of Jilin Province (ZD20024); National Natural Science Foundation of China (61702213)
Corresponding Author: Li Hairong, ORCID: 0000-0002-4884-7783     E-mail: 1070929014@qq.com

Cite this article:

Wang Nan, Li Hairong, Tan Shuru. Predicting Public Opinion Reversal Based on Evolution Analysis of Events and Improved KE-SMOTE Algorithm. Data Analysis and Knowledge Discovery, 2022, 6(2/3): 396-408.

URL:

https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.2096-3467.2021.0800     OR     https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2022/V6/I2/3/396

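Feature | Assignment basis
x1 Duration (days) | 0≤x1<5: 1; 5≤x1<10: 2; 10≤x1<15: 3; 15≤x1<20: 4; 20≤x1<40: 5; x1≥40: 6
x2 Event scale | Domestic, local: 1; nationwide: 2; international: 3
x3 Time lag between the event's occurrence and its first report (days) | x3<1: 1; 1≤x3<2: 2; 2≤x3<3: 3; 3≤x3<4: 4; x3≥4: 5
x4 Age of the person involved | 0≤x4<20: 1; 20≤x4<40: 2; 40≤x4<60: 3; 60≤x4<80: 4; x4≥80: 5
x5 Platform of first release | Weibo: 1; WeChat: 2; online media: 3
x6 Number of reposts | 0≤x6<5000: 1; 5000≤x6<10000: 2; 10000≤x6<15000: 3; 15000≤x6<20000: 4; 20000≤x6<25000: 5; 25000≤x6<30000: 6; x6≥30000: 7
x7 Number of comments | 0≤x7<5000: 1; 5000≤x7<10000: 2; 10000≤x7<15000: 3; 15000≤x7<20000: 4; 20000≤x7<25000: 5; 25000≤x7<30000: 6; x7≥30000: 7
x8 Number of likes | 0≤x8<5000: 1; 5000≤x8<10000: 2; 10000≤x8<15000: 3; 15000≤x8<20000: 4; 20000≤x8<25000: 5; 25000≤x8<30000: 6; x8≥30000: 7
x9 Number of posts | 0≤x9<100: 1; 100≤x9<200: 2; 200≤x9<300: 3; 300≤x9<400: 4; 400≤x9<500: 5; 500≤x9<600: 6; x9≥600: 7
x10 Influence index | 0≤x10<20: 1; 20≤x10<40: 2; 40≤x10<60: 3; 60≤x10<80: 4; 80≤x10≤100: 5
x11 Identity type of the person involved | Doctor: 1; woman: 2; police officer: 3; university student: 4; courier: 5; other: 6
x12 Whether the event is closely connected to real life | Yes: 1; no: 0
x13 Whether some social sentiment lies behind the event | Yes: 1; no: 0
x14 Whether information about the event was obscured | Yes: 1; no: 0
x15 Whether the person involved belongs to a vulnerable group | Yes: 1; no: 0
x16 Whether the person involved was doxxed and subjected to online abuse | Yes: 1; no: 0
x17 Whether the event is controversial | Yes: 1; no: 0
x18 Whether the event triggered offline events | Yes: 1; no: 0
x19 Whether netizens' views show a stereotyping effect | Yes: 1; no: 0
x20 Whether agenda setting took place | Yes: 1; no: 0
x21 Whether the content has many explosive points | Yes: 1; no: 0
x22 Whether secondary public opinion was generated | Yes: 1; no: 0
x23 Whether netizens made clearly biased prejudgments about the event | Yes: 1; no: 0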
Feature Assignment Basis
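The interval-based rows of the feature table amount to a lookup over bin edges. The sketch below shows one way such an encoding could be written; the function names and the sample event are hypothetical and are not taken from the paper, and only the bin edges are copied from the table.

```python
from bisect import bisect_right

# Lower bounds of every bin after the first, copied from the feature table.
DURATION_EDGES = [5, 10, 15, 20, 40]                      # x1 (days): codes 1..6
COUNT_EDGES = [5000, 10000, 15000, 20000, 25000, 30000]   # x6, x7, x8: codes 1..7


def encode_interval(value, edges):
    """Return the 1-based code of the half-open bin that contains value."""
    if value < 0:
        raise ValueError("value must be non-negative")
    # bisect_right counts how many edges are <= value, so a value equal to
    # an edge falls into the bin that starts at that edge.
    return bisect_right(edges, value) + 1


def encode_flag(answer):
    """Encode a yes/no feature (x12 to x23) as 1 or 0."""
    return 1 if answer else 0


# Hypothetical event: lasted 12 days, 27,400 reposts, and is controversial.
example = {
    "x1": encode_interval(12, DURATION_EDGES),
    "x6": encode_interval(27400, COUNT_EDGES),
    "x17": encode_flag(True),
}
print(example)  # {'x1': 3, 'x6': 6, 'x17': 1}
```

Using half-open bins keeps boundary values unambiguous: a duration of exactly 5 days receives code 2, matching the 5≤x1<10 row.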
Iteration Times, CH Value and Cluster Number in the Process of Cluster Number Optimization
Comparison of Sample Distribution Before and After Balancing
Single Neural Network (Individual Learner 1)
Single Neural Network (Individual Learner 2)
Construction of Improved KE-SMOTE Classification Model
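The abstract describes balancing a highly skewed event set with an improved KE-SMOTE algorithm, but this page does not give the details of that algorithm. As a rough illustration of the general idea only, the sketch below implements plain SMOTE-style interpolation between minority samples. It is not the authors' KE-SMOTE; the function name, parameters, and sample points are all assumptions made for the example.

```python
import random


def smote_like_oversample(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between a randomly
    chosen minority sample and one of its k nearest minority neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance.
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to other
        synthetic.append([a + gap * (b - a) for a, b in zip(base, other)])
    return synthetic


# Hypothetical two-feature minority class (for example, reversal events).
minority = [[1.0, 2.0], [2.0, 3.0], [3.0, 1.0], [2.5, 2.5]]
new_points = smote_like_oversample(minority, n_new=6, k=2, seed=1)
print(len(new_points))  # 6
```

Every synthetic point lies on a segment between two existing minority samples, so it stays inside the region the minority class already occupies. With ordinal codes like those in the feature table, the interpolated values are generally not integers, which a real pipeline would have to account for.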
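Metric | Meaning | Formula
Accuracy | Share of all observations that the classification model predicts correctly | $\text{Accuracy} = \frac{TP + TN}{TP + FP + FN + TN}$
Precision | Share of correct predictions among all results the model predicts as positive | $\text{Precision} = \frac{TP}{TP + FP}$
Recall | Share of correct predictions among all results whose true value is positive | $\text{Recall} = \frac{TP}{TP + FN}$
Specificity | Share of correct predictions among all results whose true value is negative | $\text{Specificity} = \frac{TN}{TN + FP}$
F1-Score | Weighted harmonic mean of Precision and Recall, assuming the two are equally important | $F_1\text{-Score} = \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}$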
Partial Indexes of the Classification Model Evaluation
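The five formulas in the table can be computed directly from confusion-matrix counts. The following is a minimal sketch; the function names and the counts passed to it are hypothetical and chosen only for illustration, while the Precision and Recall values in the last lines are the ones reported for Model1 and Model2 on this page.

```python
def safe_div(numerator, denominator):
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def f1_score(precision, recall):
    """Harmonic mean of precision and recall (both weighted equally)."""
    return safe_div(2 * precision * recall, precision + recall)


def classification_metrics(tp, fp, fn, tn):
    """Compute the metrics defined in the table from raw counts."""
    precision = safe_div(tp, tp + fp)
    recall = safe_div(tp, tp + fn)
    return {
        "accuracy": safe_div(tp + tn, tp + fp + fn + tn),
        "precision": precision,
        "recall": recall,
        "specificity": safe_div(tn, tn + fp),
        "f1": f1_score(precision, recall),
    }


# Hypothetical counts, not taken from the paper.
metrics = classification_metrics(tp=8, fp=2, fn=0, tn=90)
print(metrics["accuracy"], metrics["precision"])  # 0.98 0.8

# Consistency check against the reported Precision and Recall values.
print(round(f1_score(0.5409, 1.00), 4))  # 0.7021 (Model1)
print(round(f1_score(0.9672, 1.00), 4))  # 0.9833 (Model2)
```

The last two lines reproduce the F1-Score values listed for Model1 and Model2, which confirms that the reconstructed formula matches the reported figures.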
Metric | Accuracy | Precision | Recall | Specificity | F1-Score
Model1 | 0.9582 | 0.5409 | 1.00 | 0.9560 | 0.7021
Model2 | 0.9970 | 0.9672 | 1.00 | 0.9967 | 0.9833
Comparison of Evaluation of Index Values
ROC Curve of Integrated Neural Network Classification Model
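No. | Event | Predicted probability (Model1) | Predicted probability (Model2) | Predicted class (Model1) | Predicted class (Model2) | True class
1 | Female Huolala passenger dies after falling from vehicle | 1 | 0.94163 | 1 | 1 | 1
2 | Hangzhou pepper spray incident | 1 | 0.95304 | 1 | 1 | 1
3 | Student at Chengdu No. 49 Middle School dies after falling from a school building | 1 | 0.99622 | 1 | 1 | 1
4 | Shouqi Yueche ride-hailing platform incident | 1 | 0.98282 | 1 | 1 | 1
5 | Ma Jinyu incident | 1 | 0.98547 | 1 | 1 | 1
6 | Tesla catches fire after hitting a tree in Guangzhou | 1 | 0.45239 | 1 | 0 | 0
7 | School in Henan publishes paper on cooked eggs reverting to raw and hatching chicks | 1 | 0.99819 | 1 | 1 | 0
8 | Factory fire in Jinshan District, Shanghai, kills 8 | 0.99968 | 0.40535 | 1 | 0 | 0
9 | Antivirus software [founder] dies in Barcelona prison | 0.99959 | 0.37400 | 1 | 0 | 0
10 | Central Academy of Fine Arts teacher Xu Tianhua: sexual assault of an underage female student | 1 | 0.98585 | 1 | 1 | 0
11 | Tan Yaxue hotpot restaurant publicly apologizes for leaking Xiao Zhan's whereabouts | 0.99961 | 0.40074 | 1 | 0 | 0
12 | Shenzhou-12 crewed spacecraft launches successfully | 0.99972 | 0.41538 | 1 | 0 | 0
13 | Jiangxi reports on cheating at test sites in the junior-college-to-bachelor's upgrade exam | 1 | 0.40589 | 1 | 0 | 0
14 | Party secretary of Fudan University's School of Mathematical Sciences killed | 0.99988 | 0.44031 | 1 | 0 | 0
15 | Honor of Kings accused of infringing the rights of minors | 0.99999 | 0.46044 | 1 | 0 | 0
16 | Air China responds to fans chasing a celebrity in an aircraft cabin | 0.99991 | 0.45995 | 1 | 0 | 0
17 | Indecent video of Heilongjiang University of Science and Technology student circulated | 1 | 0.98327 | 1 | 1 | 0
18 | Male applicant in Changsha teacher recruitment reaches interview with a score of 4 | 1 | 0.99024 | 1 | 1 | 0
19 | Li Xiaoqiu, deputy director of the Inner Mongolia Department of Culture and Tourism, dies by suicide | 0.99997 | 0.48350 | 1 | 0 | 0
20 | Woman in Shanghai injures 5 in knife attack | 0.99980 | 0.44844 | 1 | 0 | 0
21 | Taxi driver in Xi'an who died suddenly in the car is still ticketed | 0.99987 | 0.58631 | 1 | 0 | 0
22 | Postdoctoral researcher in Wuhan dies by suicide over a predatory loan scheme | 0.99996 | 0.48030 | 1 | 0 | 0
23 | 15-year-old girl in Chongqing dies in a fall at school | 0.99987 | 0.24227 | 1 | 0 | 0
24 | Bilibili recruitment controversy | 0.99989 | 0.37452 | 1 | 0 | 0
25 | RYB Kindergarten teacher posts photo of a boy smelling feet | 0.99996 | 0.56876 | 1 | 0 | 0
26 | Taiwan Railways train derails | 0.99998 | 0.47761 | 1 | 0 | 0
27 | 15-year-old girl in Heilongjiang kills her mother and hides the body in cold storage | 0.99645 | 0.48270 | 1 | 0 | 0
28 | Five 10-year-old children in Hebei bullied at school | 0.99978 | 0.43032 | 1 | 0 | 0
29 | Female auxiliary police officer in Jiangsu extorts multiple public officials | 0.99737 | 0.41983 | 1 | 0 | 0
30 | Online reports that an intellectually disabled girl in Henan married a middle-aged man | 0.99996 | 0.60533 | 1 | 0 | 0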
Prediction Results of Public Opinion Events (2021)
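To see how the two models differ on these 30 events, the predicted and true classes in the table can be tallied into confusion-matrix counts. The sketch below does this; the label lists are transcribed from the table rows in order, while the helper name and the resulting counts are a reader's tally rather than figures reported by the authors.

```python
def confusion_counts(predicted, actual):
    """Return (tp, fp, fn, tn) for binary labels, where 1 means reversal."""
    tp = sum(1 for p, a in zip(predicted, actual) if p == 1 and a == 1)
    fp = sum(1 for p, a in zip(predicted, actual) if p == 1 and a == 0)
    fn = sum(1 for p, a in zip(predicted, actual) if p == 0 and a == 1)
    tn = sum(1 for p, a in zip(predicted, actual) if p == 0 and a == 0)
    return tp, fp, fn, tn


# Labels for events 1 to 30, transcribed from the table.
true_class = [1] * 5 + [0] * 25
model1_class = [1] * 30
model2_class = [1, 1, 1, 1, 1, 0, 1, 0, 0, 1,
                0, 0, 0, 0, 0, 0, 1, 1, 0, 0,
                0, 0, 0, 0, 0, 0, 0, 0, 0, 0]

model1_counts = confusion_counts(model1_class, true_class)
model2_counts = confusion_counts(model2_class, true_class)
print(model1_counts)  # (5, 25, 0, 0)
print(model2_counts)  # (5, 4, 0, 21)
```

Both models flag all five reversal events. Model1 also flags every non-reversal event, whereas Model2 misclassifies four of them (events 7, 10, 17, and 18) and agrees with the true class on 26 of the 30 events.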
Objective Characteristics of Public Opinion Reversal and Public Opinion Non-Reversal Events
Public Opinion Reversal and Public Opinion Non-Reversal Events with Subjective Characteristics of 1