Data Analysis and Knowledge Discovery  2024, Vol. 8 Issue (2): 131-142    DOI: 10.11925/infotech.2096-3467.2022.1316
Identifying Trending Events Based on Time Series Anomaly Detection
Yang Xinyi (1,2), Ma Haiyun (1,2), Zhu Hengmin (3)
(1) School of Information Management, Nanjing University, Nanjing 210023, China
(2) Jiangsu Key Laboratory of Data Engineering and Knowledge Service, Nanjing 210023, China
(3) School of Management, Nanjing University of Posts and Telecommunications, Nanjing 210003, China
Abstract  

[Objective] This study aims to discover topics in online discussion and to identify the real-world events that trigger public discussion, which supports timely responses and risk reduction. [Methods] First, we constructed a co-word network and detected communities in it, each representing a topic. Second, we calculated document topic vectors from the overlap between each document's words and each topic community's words. Third, we built topic popularity time series from the documents' timestamps. Finally, we decomposed each topic popularity time series with STL (seasonal-trend decomposition based on Loess) and applied the 3σ rule to detect anomalies. We identified the real-world events behind each anomaly by examining high-frequency words and highly correlated documents at the anomalous time points. [Results] We tested the model on Sina Weibo posts about the heavy rainstorm in Henan and discovered topics in three categories: disaster situation, emergency management, and social response. Anomaly detection and analysis show that disaster situation topics received the most public attention, with rainfall warnings and flood control actions as the trending events. In emergency management, rescue and relief efforts and accident investigations stimulated discussion. In social response, stories of mutual aid among victims and of public donations attracted attention. [Limitations] The dataset is relatively small, so the anomaly detection threshold had to be set manually. Larger datasets will need an automatic method. [Conclusions] Anomaly detection on topic time series can identify trending events on social platforms. In crisis response, government agencies need to cover rescue, prevention, and recovery; issue timely warnings; publish information on disaster relief and accident investigations to address public concerns; and guide public opinion in a positive direction by promoting rescue, mutual aid, and donation activities.
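
The final step of the method (decompose a popularity series, then flag points by the 3σ rule) can be sketched in a few lines. The sketch below is an illustration under stated assumptions, not the authors' implementation: it does not implement STL itself but substitutes a simple additive decomposition (centered moving-average trend plus per-weekday seasonal means), it assumes the 3σ rule is applied to the remainder component, and the function names, the 7-day period, and the sample series are all hypothetical. A library STL implementation could replace the first two functions.

```python
from statistics import mean, pstdev


def moving_average_trend(series, window=7):
    """Centered moving average; the window shrinks at the series edges."""
    half = window // 2
    n = len(series)
    return [mean(series[max(0, i - half):min(n, i + half + 1)]) for i in range(n)]


def seasonal_component(detrended, period=7):
    """Mean detrended value per position in the cycle, centered on zero.

    Assumes the series is at least one full period long.
    """
    cycle = [mean(detrended[p::period]) for p in range(period)]
    offset = mean(cycle)
    return [cycle[i % period] - offset for i in range(len(detrended))]


def three_sigma_anomalies(series, period=7, k=3.0):
    """Return indices whose remainder deviates from its mean by more than k sigma."""
    trend = moving_average_trend(series, period)
    detrended = [x - t for x, t in zip(series, trend)]
    seasonal = seasonal_component(detrended, period)
    remainder = [d - s for d, s in zip(detrended, seasonal)]
    mu, sigma = mean(remainder), pstdev(remainder)
    if sigma == 0:
        return []  # a perfectly regular series has no anomalies
    return [i for i, r in enumerate(remainder) if abs(r - mu) > k * sigma]


# Hypothetical daily popularity for one topic: flat baseline, one burst on day 10.
popularity = [10.0] * 28
popularity[10] = 60.0
print(three_sigma_anomalies(popularity))  # prints [10]
```

The multiplier `k` is the manually set threshold mentioned under Limitations; lowering it flags more days, raising it flags fewer.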

Key words: Anomaly Detection; Topic Popularity; Time Series; Community Detection; Online Social Media
Received: 11 December 2022      Published: 10 April 2023
ZTFLH:  G202  
Fund: National Social Science Fund of China (20ATQ006)
Corresponding Author: Ma Haiyun, ORCID: 0000-0003-4827-8469, E-mail: 1281570663@qq.com

Cite this article:

Yang Xinyi, Ma Haiyun, Zhu Hengmin. Identifying Trending Events Based on Time Series Anomaly Detection. Data Analysis and Knowledge Discovery, 2024, 8(2): 131-142.

URL:

https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.2096-3467.2022.1316     OR     https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2024/V8/I2/131

The Anomaly Detection Model of Topic Popularity Time Series
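Topic | Category | Topic keywords
#0 Emergency management deployment | Emergency management | emergency, operations, impact, measures, departments, station, situation, units, work, management, conditions, equipment, railway, assessment, deployment
#1 Charitable donations | Social response | entertainment, Sichuan, supplies, disaster, compassion, concern, Red Cross, disaster area, disaster situation, family, federation, celebrities, hometown, energy, society
#2 Fire and rescue | Emergency management | rescue, firefighting, forces, orange, firefighters, China, flood, disaster relief, rescue team, people, the public, nationwide, government, aid, heroes
#3 Rainfall reports | Disaster situation | warning, areas, forecast, localized, heavy rainfall, meteorological observatory, heavy precipitation, strong wind, severe convection, rainfall, precipitation amount, Henan Province, local, Luoyang, Xuchang
#4 Zhengzhou metro disaster | Disaster situation | metro, line number, Zhengzhou, passengers, victims, Shakou, carriage, Road Station, tunnel, name list, Line 5, Haitan, dad, scene, father
#5 Road transport conditions | Disaster situation | standing water, Xinxiang, vehicles, roads, entire route, expressway, road conditions, East Station, traffic, trucks, Jiaotong, Zhongyuan, emergency call, Shangdeng, service
#6 Metro inspection | Emergency management | train, Zhengzhou City, section, Longkou, depot, area, phenomenon, main line, water wall, all-out, depot line, encounter, extreme, incident, headquarters, drought relief
#7 Stories of mutual aid among victims | Social response | going home, life, Zhumadian, elderly, professional, body, Metropolis Daily, technology, production, floodwater, hospital, young man, engineering, metropolis, clothes
#8 Flood control reports | Disaster situation | sandbags, screening hall, signal, livestream, garage, stay safe, Feidian, Daxiang, elevated road, moving cars, Mengma, car owners, private cars, doors and windows, flood control
#9 Stories of social assistance | Social response | Hongxing, Dahe, front line, posting on Weibo, museum, crisis, campaign, flood and epidemic, blogger said, headlines, article, veterans, Shuguang, Wang Gang, apartment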
The Topics and Their Keywords
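
Keyword communities of this kind feed the second and third steps of the method: a document's topic vector comes from its word overlap with each community, and daily popularity comes from aggregating those vectors by date. The following is a minimal sketch under assumptions the abstract does not confirm: the overlap is taken as a raw count of shared words and normalized to sum to 1, popularity is the daily sum of topic weights, and the keyword subsets (English renderings of a few keywords from topics #3 and #4), the dates, and the documents are hypothetical.

```python
from collections import defaultdict


def topic_vector(doc_words, topics):
    """Share of a document's keyword hits that falls in each topic community."""
    words = set(doc_words)
    hits = {name: len(words & keywords) for name, keywords in topics.items()}
    total = sum(hits.values())
    # Assumption: normalize by total hits; a document with no hits gets all zeros.
    return {name: (h / total if total else 0.0) for name, h in hits.items()}


def popularity_series(docs, topics):
    """Sum topic weights per day, returning {topic: {date: popularity}}."""
    series = {name: defaultdict(float) for name in topics}
    for date, doc_words in docs:
        for name, weight in topic_vector(doc_words, topics).items():
            series[name][date] += weight
    return series


# Hypothetical keyword subsets and already-segmented documents.
topics = {
    "#3": {"warning", "heavy rainfall", "observatory"},
    "#4": {"metro", "passengers", "tunnel"},
}
docs = [
    ("07-20", ["metro", "passengers", "warning"]),
    ("07-20", ["tunnel", "metro"]),
    ("07-21", ["heavy rainfall", "observatory"]),
]
series = popularity_series(docs, topics)
print(dict(series["#4"]))
```

Each per-topic series produced this way is the input that a decomposition and anomaly detection step would then consume.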
Comparison of Topics' Popularities
STL Decomposition of Topic Popularity Time Series
Anomaly Detection of Topic Popularity Time Series (#4)
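Topic | Anomalous time point | Event
#4 Zhengzhou metro disaster | July 23 | List of Zhengzhou metro victims
#2 Fire and rescue | July 26 | China Fire and Rescue aids Henan
#6 Metro inspection | July 27 | Details of the Zhengzhou Metro Line 5 "7.20 incident" released
#2 Fire and rescue | August 4 | Tribute to those who headed into the storm
#1 Charitable donations | August 6 | Donations by entertainers and entrepreneurs
#6 Metro inspection | August 10 | Zhengzhou Metro Line 3 postpones no-load trial runs
#0 Emergency management deployment | August 12 | Zhengzhou Metro thank-you letter
#3 Rainfall reports | August 13 | Henan Provincial Meteorological Observatory continues to issue yellow rainstorm warning
#5 Road transport conditions | August 14 | 15 expressways closed in Henan
#4 Zhengzhou metro disaster | August 19 | Li Keqiang inspects the disaster site on Zhengzhou Metro Line 5
#3 Rainfall reports | August 21 | Rainstorm about to hit Zhengzhou again
#9 Stories of social assistance | August 21 | Hongxing Erke donates 1 million yuan to the Henan Museum
#6 Metro inspection | August 22 | Huaihe Road section of Zhengzhou's Jingguang Road Tunnel closed in both directions
#8 Flood control reports | August 23 | 60 seconds: how hard the people of Henan worked to cope with the rainstorm
#7 Stories of mutual aid among victims | August 26 | Young Henan man rescues electrocuted driver from the rainstorm in 15 seconds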
Anomaly Time Points of Topic Popularity Time Series and Real Events
Abnormal Time Points and Real Events of Trends Topics
Abnormal Time Points and Real Events of Emergency Management Topics
Abnormal Time Points and Real Events of Social Reaction Topics