Abstract: This paper presents a new method for automatically inferring the evolution of LDA topics based on seminal documents. The semantic distribution of the seminal documents guides the model for the following time slice and links topics between consecutive time slices. Experiments on the NIPS dataset and on Chinese newswire on the NPC and CPPCC show that the method not only recovers correct evolutions in various forms, but also avoids linking related topics that have no evolutionary relationship.
单斌, 李芳. 基于种子文档LDA话题的演化研究[J]. 现代图书情报技术, 2011, 27(7/8): 104-109.
Shan Bin, Li Fang. Topic Evolution Based on Seminal Document and Topic Model. New Technology of Library and Information Service, 2011, 27(7/8): 104-109.