[1] Xie W, Zhu F, Jiang J, et al. TopicSketch: Real-Time Bursty Topic Detection from Twitter [C]. In: Proceedings of the 13th International Conference on Data Mining, Dallas, Texas, USA. IEEE, 2013: 837-846.
[2] Dean J, Ghemawat S. MapReduce: Simplified Data Processing on Large Clusters [J]. Communications of the ACM, 2008, 51(1): 107-113.
[3] Hadoop [EB/OL]. [2014-07-15]. http://hadoop.apache.org/.
[4] Allan J, Carbonell J, Doddington G, et al. Topic Detection and Tracking Pilot Study Final Report [C]. In: Proceedings of the DARPA Broadcast News Transcription and Understanding Workshop, 1998: 194-218.
[5] Hofmann T. Probabilistic Latent Semantic Analysis [C]. In: Proceedings of the 15th Conference on Uncertainty in Artificial Intelligence. Morgan Kaufmann Publishers Inc., 1999: 289-296.
[6] Blei D M, Ng A Y, Jordan M I. Latent Dirichlet Allocation [J]. The Journal of Machine Learning Research, 2003, 3: 993-1022.
[7] 李文波, 孙乐, 张大鲲. 基于 Labeled-LDA 模型的文本分类新算法 [J]. 计算机学报, 2008, 31(4): 620-627. (Li Wenbo, Sun Le, Zhang Dakun. Text Classification Based on Labeled-LDA Model [J]. Chinese Journal of Computers, 2008, 31(4): 620-627.)
[8] Wang X, Zhai C, Hu X, et al. Mining Correlated Bursty Topic Patterns from Coordinated Text Streams [C]. In: Proceedings of the 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. New York, USA: ACM, 2007: 784-793.
[9] Lin C X, Zhao B, Mei Q, et al. PET: A Statistical Model for Popular Events Tracking in Social Communities [C]. In: Proceedings of the 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. New York, USA: ACM, 2010: 929-938.
[10] Dubrawski A. Detection of Events in Multiple Streams of Surveillance Data [A].//Infectious Disease Informatics and Biosurveillance [M]. Springer US, 2011: 145-171.
[11] Diao Q, Jiang J, Zhu F, et al. Finding Bursty Topics from Microblogs [C]. In: Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics, Jeju Island, Korea. 2012: 536-544.
[12] 周刚, 邹鸿程, 熊小兵, 等. MB-SinglePass: 基于组合相似度的微博话题检测 [J]. 计算机科学, 2012, 39(10): 198-202. (Zhou Gang, Zou Hongcheng, Xiong Xiaobing, et al. MB-SinglePass: Microblog Topic Detection Based on Combined Similarity [J]. Computer Science, 2012, 39(10): 198-202.)
[13] 郭跇秀, 吕学强, 李卓. 基于突发词聚类的微博突发事件检测方法 [J]. 计算机应用, 2014, 34(2): 486-490. (Guo Yixiu, Lv Xueqiang, Li Zhuo. Bursty Topics Detection Approach on Chinese Microblog Based on Burst Words Clustering [J]. Journal of Computer Applications, 2014, 34(2): 486-490.)
[14] 王勇, 肖诗斌, 郭跇秀, 等. 中文微博突发事件检测研究[J]. 现代图书情报技术, 2013(2): 57-62. (Wang Yong, Xiao Shibin, Guo Yixiu, et al. Research on Chinese Micro-blog Bursty Topics Detection [J]. New Technology of Library and Information Service, 2013(2): 57-62.)
[15] 邱云飞, 程亮. 微博突发话题检测方法研究 [J]. 计算机工程, 2012, 38(9): 288-290. (Qiu Yunfei, Cheng Liang. Research on Sudden Topic Detection Method for Microblog [J]. Computer Engineering, 2012, 38(9): 288-290.)
[16] Kleinberg J. Bursty and Hierarchical Structure in Streams [C]. In: Proceedings of the 8th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. New York: ACM, 2002: 91-101.
[17] Ihler A, Hutchins J, Smyth P. Adaptive Event Detection with Time-Varying Poisson Processes [C]. In: Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. New York: ACM, 2006: 207-216.
[18] Nakahara T, Hamuro Y. Detecting Topics from Twitter Posts During TV Program Viewing [C]. In: Proceedings of the 13th International Conference on Data Mining, Dallas, Texas, USA. IEEE, 2013: 714-719.
[19] Zhang L, Jia Y, Zhou B, et al. Detecting Real-Time Burst Topics in Microblog Streams: How Sentiment Can Help [C]. In: Proceedings of the 22nd International Conference on World Wide Web Companion. 2013: 781-782.
[20] Koike D, Takahashi Y, Utsuro T, et al. Time Series Topic Modeling and Bursty Topic Detection of Correlated News and Twitter [C]. In: Proceedings of the 6th International Joint Conference on Natural Language Processing (IJCNLP), Nagoya, Japan. 2013: 917-921.
[21] He D, Parker D S. Topic Dynamics: An Alternative Model of Bursts in Streams of Topics [C]. In: Proceedings of the 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. New York, USA: ACM, 2010: 443-452.
[22] 李锐, 王斌. 文本处理中的 MapReduce 技术 [J]. 中文信息学报, 2012, 26(4): 9-20. (Li Rui, Wang Bin. MapReduce in Text Processing [J]. Journal of Chinese Information Processing, 2012, 26(4): 9-20.)
[23] Das A S, Datar M, Garg A, et al. Google News Personalization: Scalable Online Collaborative Filtering [C]. In: Proceedings of the 16th International Conference on World Wide Web. New York: ACM, 2007: 271-280.
[24] Choi H, Lee K H, Lee Y J. Parallel Labeling of Massive XML Data with MapReduce [J]. Journal of Supercomputing, 2013, 67(2): 408-437.
[25] 刘滔, 雷霖, 陈荦, 等. 基于 MapReduce 的中文词性标注 CRF 模型并行化训练研究 [J]. 北京大学学报: 自然科学版, 2013, 49(1): 147-152. (Liu Tao, Lei Lin, Chen Luo, et al. A Parallel Training Research of Chinese Part-of-Speech Tagging CRF Model Based on MapReduce [J]. Acta Scientiarum Naturalium Universitatis Pekinensis, 2013, 49(1): 147-152.)
[26] What is Apache Mahout? [EB/OL]. [2014-09-27]. http://mahout.apache.org/.
[27] Nallapati R, Cohen W, Lafferty J. Parallelized Variational EM for Latent Dirichlet Allocation: An Experimental Evaluation of Speed and Scalability [C]. In: Proceedings of the 7th IEEE International Conference on Data Mining Workshops, Omaha, Nebraska, USA. IEEE, 2007: 349-354.
[28] Zhai K, Boyd-Graber J, Asadi N. Using Variational Inference and MapReduce to Scale Topic Modeling [OL]. Eprint arXiv, 2011. arXiv: 1107.3765.