Data Analysis and Knowledge Discovery  2020, Vol. 4 Issue (6): 22-34    DOI: 10.11925/infotech.2096-3467.2019.1155
Evolution Analysis of Hot Topics with Trend-Prediction
Yue Lixin1, Liu Ziqiang2,3, Hu Zhengyin2,3
1School of Information Resource Management, Renmin University of China, Beijing 100872, China
2Chengdu Library of Chinese Academy of Sciences, Chengdu 610041, China
3Department of Library, Information and Archives Management, School of Economics and Management, University of Chinese Academy of Sciences, Beijing 100190, China
[Objective] This paper constructs mathematical and content prediction models based on the external and internal characteristics of academic articles, aiming to analyze the evolution of hot research topics. [Methods] We used an LDA model to identify topics and construct their time series. We then determined the hot topics using mean values and linear regression fitting. Finally, we predicted the trends of these topics with an ARIMA model, based on topic intensity, and a Word2Vec model, based on topic content. [Results] We evaluated the models in an empirical study of stem cell research in the United States, identifying hot topics and predicting their development trends. [Limitations] Because the Word2Vec model analyzes trends in topic content from single words, the results may be ambiguous to interpret. [Conclusions] The proposed method provides better prediction results than methods based on manual interpretation.

Key words: Trend Prediction; Hot Topics; ARIMA Model; Word2Vec; Topic Evolution
Received: 22 October 2019      Published: 07 July 2020
ZTFLH:  G350  
Yue Lixin,Liu Ziqiang,Hu Zhengyin. Evolution Analysis of Hot Topics with Trend-Prediction. Data Analysis and Knowledge Discovery, 2020, 4(6): 22-34.

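Model         Autocorrelation function (ACF)    Partial autocorrelation function (PACF)
AR(p)         Tails off                         Cuts off after lag p
MA(q)         Cuts off after lag q              Tails off
ARMA(p, q)    Tails off after lag q             Tails off after lag p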
Determination of Model Parameters
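The identification rules in the table above can be checked numerically. The following is a minimal sketch, not the authors' code: `sample_acf` and `pacf_from_acf` are hypothetical helper names, and the PACF is obtained with the standard Durbin-Levinson recursion. It is fed the theoretical ACF of an AR(1) process with coefficient 0.5 (an illustrative value, not taken from the paper) to show the PACF cutting off after lag p = 1.

```python
def sample_acf(series, max_lag):
    """Sample autocorrelations r_0..r_max_lag of a numeric sequence."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    c0 = sum(d * d for d in dev)
    return [
        sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
        for k in range(max_lag + 1)
    ]


def pacf_from_acf(acf):
    """Partial autocorrelations for lags 1..len(acf)-1 (Durbin-Levinson).

    `acf` must start with r_0 = 1.
    """
    max_lag = len(acf) - 1
    if max_lag < 1:
        return []
    prev = [acf[1]]          # coefficients phi_{1,1}
    pacf = [acf[1]]
    for k in range(2, max_lag + 1):
        num = acf[k] - sum(prev[j] * acf[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(prev[j] * acf[j + 1] for j in range(k - 1))
        phi_kk = num / den
        # Update phi_{k,1..k-1}, then append phi_{k,k}.
        prev = [prev[j] - phi_kk * prev[k - 2 - j] for j in range(k - 1)] + [phi_kk]
        pacf.append(phi_kk)
    return pacf


# Theoretical ACF of an AR(1) process with coefficient 0.5: r_k = 0.5 ** k.
ar1_acf = [0.5 ** k for k in range(5)]
ar1_pacf = pacf_from_acf(ar1_acf)
print(ar1_pacf)  # first value 0.5, remaining lags 0: the PACF cuts off after lag 1
```

On a real topic-intensity series, `sample_acf` would supply the input to `pacf_from_acf`, and the cut-off would be judged against a significance band rather than against exact zeros.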
Schematic Diagram of CBOW Model and Skip-Gram Model
Annual Distribution of Papers
Determination of Optimal Number of Topics
Topic No. Topic Words
Topic1 acute|intestinal|hematopoietic|kinase|term|epithelial|
Topic2 pathway|cancer|embryonic|hematopoietic|virus|cell|
Topic3 regulation|marrow|hematopoietic|embryonic|biology|
Topic4 resistance|cell|effect|hematopoietic|cancer|imaging|
Topic5 cell|cancer|breast|pancreatic|new|hematopoietic|
…… ……
Research Topics in the Field of Stem Cells in the United States (Partial)
Time Series of Topics in the Stem Cell Field (2000-2018)
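The abstract states that hot topics are determined from the topic time series by mean values and linear regression fitting, but this page does not give the exact criterion. As a rough sketch under an assumed rule (a topic counts as hot when its mean intensity exceeds a threshold and its fitted slope is positive), the idea might look like the following. The function names, the threshold choice, and the two toy series are all hypothetical and are not data from the study.

```python
def ols_slope(xs, ys):
    """Least-squares slope of ys regressed on xs."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    return sxy / sxx


def is_hot(years, intensity, mean_threshold):
    """Assumed rule: above-threshold mean intensity AND a rising linear trend."""
    mean_intensity = sum(intensity) / len(intensity)
    return mean_intensity > mean_threshold and ols_slope(years, intensity) > 0


# Hypothetical topic-intensity series (illustrative values only).
years = [2000, 2001, 2002, 2003, 2004]
topics = {
    "topic_a": [0.02, 0.03, 0.05, 0.06, 0.08],  # rising
    "topic_b": [0.06, 0.05, 0.04, 0.03, 0.02],  # falling
}

# Assumed threshold: the average of the per-topic mean intensities.
threshold = sum(sum(v) / len(v) for v in topics.values()) / len(topics)
hot_topics = [name for name, v in topics.items() if is_hot(years, v, threshold)]
print(hot_topics)  # ['topic_a']
```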
Model             BIC        Model             BIC
ARIMA(0, 0, 1)    -77.88     ARIMA(1, 2, 0)    -111.36
ARIMA(0, 0, 2)    -80.42     ARIMA(1, 2, 1)    -113.48
ARIMA(0, 1, 1)    -127.98    ARIMA(1, 2, 2)    -107.37
ARIMA(0, 1, 2)    -119.46    ARIMA(2, 0, 0)    -133.28
ARIMA(0, 2, 1)    -110.23    ARIMA(2, 0, 1)    -140.00
ARIMA(0, 2, 2)    -109.99    ARIMA(2, 0, 2)    -125.84
ARIMA(1, 0, 0)    -136.18    ARIMA(2, 1, 0)    -130.73
ARIMA(1, 0, 1)    -139.63    ARIMA(2, 1, 1)    -127.64
ARIMA(1, 0, 2)    -132.44    ARIMA(2, 1, 2)    -116.59
ARIMA(1, 1, 0)    -136.13    ARIMA(2, 2, 0)    -112.62
ARIMA(1, 1, 1)    -129.06    ARIMA(2, 2, 1)    -115.63
ARIMA(1, 1, 2)    -117.46    ARIMA(2, 2, 2)    -103.39
Determination of Model Parameters
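Under the usual convention that a lower BIC indicates a better trade-off between fit and complexity, the candidate orders in the table above can be ranked mechanically. The sketch below simply transcribes the reported BIC values and picks the minimum. It illustrates the selection step only and is not a statement of which model the authors finally adopted; the variable names are hypothetical.

```python
# BIC values transcribed from the table above, keyed by ARIMA order (p, d, q).
bic_by_order = {
    (0, 0, 1): -77.88,  (0, 0, 2): -80.42,
    (0, 1, 1): -127.98, (0, 1, 2): -119.46,
    (0, 2, 1): -110.23, (0, 2, 2): -109.99,
    (1, 0, 0): -136.18, (1, 0, 1): -139.63, (1, 0, 2): -132.44,
    (1, 1, 0): -136.13, (1, 1, 1): -129.06, (1, 1, 2): -117.46,
    (1, 2, 0): -111.36, (1, 2, 1): -113.48, (1, 2, 2): -107.37,
    (2, 0, 0): -133.28, (2, 0, 1): -140.00, (2, 0, 2): -125.84,
    (2, 1, 0): -130.73, (2, 1, 1): -127.64, (2, 1, 2): -116.59,
    (2, 2, 0): -112.62, (2, 2, 1): -115.63, (2, 2, 2): -103.39,
}

# Select the order with the smallest BIC.
best_order = min(bic_by_order, key=bic_by_order.get)
best_bic = bic_by_order[best_order]
print(best_order, best_bic)  # (2, 0, 1) -140.0

# The three closest competitors, for a sense of how tight the ranking is.
ranked = sorted(bic_by_order.items(), key=lambda item: item[1])
print(ranked[:3])
```

BIC values computed on differently differenced series (different d) are not strictly comparable, so in practice d is often fixed first by a stationarity check and the minimum is then taken over p and q only.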
Model Test Results
Prediction of Hot Topic Intensity Evolution Trend
Trend Forecast of Hot Topic Content
Stem Cell Research Hotspots Based on VOSviewer
