Abstract: This paper introduces the definition of correlated bursty topic patterns and examines the key issues in mining them: detecting bursty topics, locating the bursty period of a bursty topic, and discovering correlated bursty topics. It then analyzes methods for mining correlated bursty topics from text collections, synchronous text streams, and asynchronous text streams.
Huang Yongwen (黄永文). Review on Mining Methods of Correlated Bursty Topic Patterns[J]. New Technology of Library and Information Service (现代图书情报技术), 2012, (10): 28-34.