A Survey of Burst Topic Detection Towards Social Text Stream Data
Le Xiaoqiu1, Hong Na2
1. National Science Library, Chinese Academy of Sciences, Beijing 100190, China;
2. Institute of Medical Information, Chinese Academy of Medical Sciences, Beijing 100020, China
Abstract Social text streams carry rich contextual information and involve very large numbers of participants who communicate through informal streams. Detecting burst topics in this kind of data therefore calls for suitable methods. This paper reviews the concepts and characteristics of social text stream data and the forms in which burst topics are presented. It also summarizes the main research ideas and basic procedures of burst topic detection for social text stream data along three dimensions: textual content, social, and temporal. Finally, it discusses the principal approaches that use social features, such as user participation, social context, and community structure evolution, for burst topic detection.
Received: 25 September 2012
Published: 24 January 2013