Review on Mining Methods of Correlated Bursty Topic Patterns
Huang Yongwen
National Science Library, Chinese Academy of Sciences, Beijing 100190, China
Abstract This paper introduces the definition of correlated bursty topic patterns and examines the key issues in mining them: detecting bursty topics, locating the bursty period of a bursty topic, and discovering correlated bursty topics. Finally, it analyzes methods for mining correlated bursty topics from text collections, synchronous text streams, and asynchronous text streams.
Received: 25 September 2012
Published: 24 January 2013