Evolution Analysis of Hot Topics with Trend-Prediction
Yue Lixin1, Liu Ziqiang2,3, Hu Zhengyin2,3
1School of Information Resource Management, Renmin University of China, Beijing 100872, China
2Chengdu Library of Chinese Academy of Sciences, Chengdu 610041, China
3Department of Library, Information and Archives Management, School of Economics and Management, University of Chinese Academy of Sciences, Beijing 100190, China
[Objective] This paper constructs mathematical and content prediction models based on the external and internal characteristics of academic articles, aiming to analyze the evolution of trending research topics. [Methods] Using an LDA model, we identified topics and constructed their time series. We then determined the popular topics using mean values and linear regression fitting. Finally, we predicted trends for these topics with ARIMA and Word2Vec models, based on topic intensity and topic content respectively. [Results] We evaluated the models in an empirical study of stem cell research in the United States, identifying popular topics and predicting their development trends. [Limitations] Because the Word2Vec model analyzes trends in topic content at the level of single words, interpreting the results may be ambiguous. [Conclusions] The proposed method can provide better prediction results than methods based on manual interpretation.
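The hot-topic selection step in [Methods] can be illustrated with a minimal sketch. The abstract does not give the exact rule, so the criterion below (mean intensity above the overall mean and a positive fitted slope) is an assumption made for illustration, and the topic names and intensity values are hypothetical.

```python
from statistics import mean


def slope(values):
    """Least-squares slope of values against time steps 0, 1, 2, ..."""
    n = len(values)
    t_mean = (n - 1) / 2
    v_mean = mean(values)
    num = sum((t - t_mean) * (v - v_mean) for t, v in enumerate(values))
    den = sum((t - t_mean) ** 2 for t in range(n))
    return num / den


def hot_topics(series):
    """Select topics by mean value and linear trend.

    Assumed rule (not taken from the paper): a topic is "hot" when its
    mean intensity exceeds the mean over all topics and its fitted
    slope is positive.
    """
    overall = mean(mean(v) for v in series.values())
    return sorted(
        name for name, v in series.items()
        if mean(v) > overall and slope(v) > 0
    )


# Hypothetical yearly topic intensities, e.g. averaged LDA topic weights.
topic_series = {
    "topic_a": [0.10, 0.12, 0.15, 0.18, 0.22],  # strong and rising
    "topic_b": [0.20, 0.18, 0.15, 0.12, 0.10],  # strong but declining
    "topic_c": [0.02, 0.03, 0.03, 0.04, 0.05],  # rising but weak
}

print(hot_topics(topic_series))
```

With these made-up series only `topic_a` passes both conditions. A forecasting model such as ARIMA would then be fitted to the intensity series of the selected topics.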
Yue Lixin, Liu Ziqiang, Hu Zhengyin. Evolution Analysis of Hot Topics with Trend-Prediction[J]. Data Analysis and Knowledge Discovery, 2020, 4(6): 22-34.