[Objective] The topics identified by an LDA model include many irrelevant results, which reduce the accuracy of evolution analysis. This paper constructs topic evolution paths by filtering out noise topics and calculating the relevance between the remaining topics. [Methods] First, we filtered out irrelevant topics based on their probability of appearing across all documents and on the word propensity distribution of each topic. Then, we calculated the Jensen-Shannon divergence to identify related topics. Finally, we constructed the topic evolution paths based on the correlation between topics. [Results] We examined the proposed method with scientific literature on “machine learning” and identified five types of evolution path: rebirth, extinction, succession, division, and merger. [Limitations] The threshold values were estimated and therefore involve some subjective factors. [Conclusions] The proposed method avoids the interference of noise topics and identifies relevant topics in adjacent time intervals, which helps discover the evolution of discipline topics more accurately.
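The relatedness step in [Methods] can be illustrated with a minimal sketch. It assumes each topic is a probability distribution over a shared vocabulary, uses base-2 logarithms so that the Jensen-Shannon divergence lies in [0, 1], and links two topics from adjacent time intervals when their divergence falls below a threshold. The function names, the toy distributions, and the threshold of 0.1 are hypothetical choices for illustration, not values taken from the study.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) in bits; terms with p_i == 0 contribute nothing."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence between two word distributions (base 2)."""
    # The mixture m is positive wherever p or q is positive,
    # so the KL terms below never divide by zero.
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


def link_topics(prev_topics, next_topics, threshold):
    """Return (i, j, jsd) for topic pairs whose divergence is below threshold."""
    links = []
    for i, p in enumerate(prev_topics):
        for j, q in enumerate(next_topics):
            d = js_divergence(p, q)
            if d < threshold:
                links.append((i, j, d))
    return links


# Toy topic-word distributions over a 4-word vocabulary (made-up numbers).
window_t = [[0.7, 0.2, 0.1, 0.0], [0.0, 0.1, 0.2, 0.7]]
window_t1 = [[0.6, 0.3, 0.1, 0.0], [0.25, 0.25, 0.25, 0.25]]

# The threshold is an assumed value; the abstract notes that such
# thresholds are estimated and partly subjective.
links = link_topics(window_t, window_t1, threshold=0.1)
for i, j, d in links:
    print(f"topic {i} (t) -> topic {j} (t+1), JSD = {d:.3f}")
```

Under these assumptions, path types could then be read from the link structure: a topic in the earlier interval with no outgoing link would be a candidate for extinction, one with several outgoing links for division, and a later topic with several incoming links for merger.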