Data Analysis and Knowledge Discovery  2018, Vol. 2 Issue (1): 64-75    DOI: 10.11925/infotech.2096-3467.2017.1114
Analyzing Topic Evolution with Topic Filtering and Relevance
Jiabin Qu1,2, Shiyan Ou1
1(School of Information Management, Nanjing University, Nanjing 210023, China)
2(Yantai University Library, Yantai 264005, China)
[Objective] The topics identified by the LDA model include many irrelevant ones, which reduce the accuracy of evolution analysis. This paper constructs topic evolution paths by filtering out noise topics and calculating the relevance between topics. [Methods] First, we filtered out irrelevant topics based on their probability of appearing across all documents and on the word propensity distribution of each topic. Then, we calculated the Jensen-Shannon divergence to identify related topics. Finally, we constructed the topic evolution paths based on the correlation between topics. [Results] We examined the effectiveness of the proposed method with scientific literature on “machine learning”, which yielded five evolution paths: rebirth, extinction, succession, division and merger. [Limitations] The threshold values were estimated with some subjectivity. [Conclusions] The proposed method avoids the interference of noise topics and identifies related topics in adjacent time intervals, which helps reveal the evolution of discipline topics more accurately.
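The relevance step in [Methods] can be illustrated with a minimal sketch. It assumes each topic is a probability distribution over a shared vocabulary, and that two topics from adjacent time intervals count as related when their Jensen-Shannon divergence falls at or below a cutoff. The function names, the base-2 logarithm, and the `threshold` parameter are assumptions made for illustration; the abstract does not state the values or the exact procedure the authors used.

```python
import math

def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q) in bits.

    Terms with p_i == 0 contribute nothing, by the usual convention.
    """
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def js_divergence(p, q):
    """Jensen-Shannon divergence between two topic-word distributions.

    With a base-2 logarithm the value lies in [0, 1]: 0 for identical
    distributions, 1 for distributions with no words in common.
    """
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]  # mixture distribution
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)

def related_topics(prev_topics, next_topics, threshold):
    """Link topics from two adjacent time intervals.

    Returns (i, j, divergence) for every pair whose divergence is at or
    below the threshold. The threshold is a hypothetical parameter here.
    """
    links = []
    for i, p in enumerate(prev_topics):
        for j, q in enumerate(next_topics):
            d = js_divergence(p, q)
            if d <= threshold:
                links.append((i, j, d))
    return links

# Toy distributions over a three-word vocabulary (illustrative only).
prev_topics = [[0.7, 0.2, 0.1], [0.1, 0.1, 0.8]]
next_topics = [[0.6, 0.3, 0.1], [0.0, 0.1, 0.9]]
links = related_topics(prev_topics, next_topics, threshold=0.1)
```

Under this reading, a topic in the earlier interval with several links would be a candidate for division, and a topic in the later interval with several links a candidate for merger; how the authors map links to path types is not specified in the abstract.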

Key words: Discipline Topics Evolution; Topic Filtering; LDA Topic Model; Evolution Analysis
Received: 07 November 2017      Published: 05 February 2018

Jiabin Qu, Shiyan Ou. Analyzing Topic Evolution with Topic Filtering and Relevance. Data Analysis and Knowledge Discovery, 2018, 2(1): 64-75.

