%A Ding Shengchun,Gong Silan,Li Hongmei %T A New Method to Detect Bursty Events from Micro-blog Posts Based on Bursty Topic Words and Agglomerative Hierarchical Clustering Algorithm %0 Journal Article %D 2016 %J Data Analysis and Knowledge Discovery %R 10.11925/infotech.1003-3513.2016.07.03 %P 12-20 %V 32 %N 7-8 %U {https://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/abstract/article_4247.shtml} %8 2016-08-25 %X

[Objective] This paper proposes a new method for detecting bursty events in massive volumes of micro-blog posts accurately, efficiently, and in real time, in order to provide decision-making information for the emergency management of public opinion. [Methods] First, we introduced a reference time window mechanism and designed an algorithm that processes word frequency, document frequency, hashtags, and word frequency growth rates. Second, we used this dynamic-threshold-based algorithm to extract bursty words. Third, we transformed the micro-blog texts into feature vectors over the bursty words. Finally, we detected bursty events with an agglomerative hierarchical clustering algorithm. [Results] The detection method reached an accuracy of 80% when compared with real-world cases, which indicates that the proposed method is feasible and effective. [Limitations] Owing to the limited data and scale of the current study, we could not describe the detected emergencies automatically. More research is needed to analyze users' emotions and the semantic relationships among bursty events. [Conclusions] Our study fills gaps left by previous research and improves the efficiency of retrieving bursty events from micro-blog posts.
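
The four steps in [Methods] can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the scoring formula, the threshold rule, or the linkage criterion, so the smoothed growth rate, the `hashtag_boost` factor, the mean-plus-`k`-standard-deviations threshold, the average-linkage cosine similarity, and every function and parameter name below are assumptions chosen for illustration. Posts are assumed to be pre-tokenized lists of strings, with hashtags kept as tokens that start with `#`.

```python
from collections import Counter
from math import sqrt
from statistics import mean, pstdev


def bursty_words(current, reference, k=0.0, hashtag_boost=2.0):
    """Pick words whose score exceeds a dynamic threshold (assumed rule)."""
    # Document frequency in the current and the reference time window.
    cur_df = Counter(w for post in current for w in set(post))
    ref_df = Counter(w for post in reference for w in set(post))
    scores = {}
    for word, df in cur_df.items():
        growth = (df + 1) / (ref_df[word] + 1)  # add-one smoothed growth rate
        boost = hashtag_boost if word.startswith("#") else 1.0
        scores[word] = growth * boost
    # Assumed dynamic threshold: mean + k * std of the scores in this window.
    threshold = mean(scores.values()) + k * pstdev(scores.values())
    return {w for w, s in scores.items() if s > threshold}


def cosine(a, b):
    """Cosine similarity of two binary vectors stored as non-empty sets."""
    return len(a & b) / sqrt(len(a) * len(b))


def agglomerate(vectors, min_sim=0.5):
    """Average-linkage agglomerative clustering over {post_index: word_set}."""
    clusters = [[i] for i in vectors]

    def linkage(c1, c2):
        return mean(cosine(vectors[i], vectors[j]) for i in c1 for j in c2)

    while len(clusters) > 1:
        best, a, b = max(
            (linkage(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        if best < min_sim:  # stop once the closest pair is too dissimilar
            break
        merged = clusters.pop(b)  # b > a, so index a stays valid
        clusters[a].extend(merged)
    return clusters


def detect_events(current, reference, k=0.0, min_sim=0.5):
    """Return clusters of post indices, each treated as one candidate event."""
    bursty = bursty_words(current, reference, k=k)
    # Binary feature vector per post, restricted to the bursty words;
    # posts that contain no bursty word are dropped.
    vectors = {i: frozenset(post) & bursty for i, post in enumerate(current)}
    vectors = {i: v for i, v in vectors.items() if v}
    return agglomerate(vectors, min_sim=min_sim)


# Toy data, invented purely for illustration.
reference_window = [["weather", "nice"], ["traffic", "jam"], ["weather", "cold"]]
current_window = [
    ["earthquake", "sichuan", "#quake"],
    ["earthquake", "sichuan", "rescue"],
    ["weather", "nice"],
    ["earthquake", "#quake", "rescue"],
    ["concert", "tickets"],
    ["concert", "tickets", "sold"],
]
print(detect_events(current_window, reference_window))
```

Because the threshold is recomputed from each window's own score distribution, it adapts to overall posting volume, which is one plausible reading of "dynamic threshold". Stopping the merge loop at `min_sim` rather than at a fixed cluster count avoids having to know the number of events in advance.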