[Objective] This paper uses text mining to automatically identify research topics in a large body of scientific literature and to detect emerging trends. [Methods] First, we used the LDA model to estimate both the prevalence and the content of topics in articles published by the top ten computer science journals in China. Second, we traced the evolution of the major topics using the articles' publication dates. [Results] We extracted 18 topics from 29,621 computer science papers and identified 7 trending topics and 6 less popular ones. [Limitations] Our study did not include papers that Chinese authors published overseas. [Conclusions] The proposed method could help researchers follow the evolution of computer science research and identify emerging trends.
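The second step of the method, tracing topics over publication dates, can be illustrated with a minimal sketch. It assumes that a topic model has already produced a topic-proportion vector for each paper, and it labels a topic by the least-squares slope of its yearly mean share. The function names, the toy data, and the slope threshold are all hypothetical choices made for illustration; the abstract does not say how trending and less popular topics were actually determined.

```python
from collections import defaultdict


def yearly_prevalence(doc_topics, years):
    """Average each topic's share over the documents published in each year."""
    sums = {}
    counts = defaultdict(int)
    for theta, year in zip(doc_topics, years):
        if year not in sums:
            sums[year] = [0.0] * len(theta)
        for k, share in enumerate(theta):
            sums[year][k] += share
        counts[year] += 1
    return {y: [s / counts[y] for s in sums[y]] for y in sorted(sums)}


def trend_slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def classify_topics(doc_topics, years, threshold=0.01):
    """Label each topic by the slope of its yearly mean share.

    The threshold is an illustrative assumption, not a value from the paper.
    """
    prevalence = yearly_prevalence(doc_topics, years)
    year_axis = list(prevalence)
    n_topics = len(prevalence[year_axis[0]])
    labels = []
    for k in range(n_topics):
        slope = trend_slope(year_axis, [prevalence[y][k] for y in year_axis])
        if slope > threshold:
            labels.append("rising")
        elif slope < -threshold:
            labels.append("declining")
        else:
            labels.append("stable")
    return labels


# Toy document-topic proportions (hypothetical; each row sums to 1).
toy_theta = [
    [0.2, 0.5, 0.3], [0.2, 0.5, 0.3],
    [0.3, 0.4, 0.3], [0.3, 0.4, 0.3],
    [0.4, 0.3, 0.3], [0.4, 0.3, 0.3],
]
toy_years = [2010, 2010, 2011, 2011, 2012, 2012]

print(classify_topics(toy_theta, toy_years))
```

In practice the proportion vectors would come from a fitted LDA model rather than from hand-written rows, and a real corpus would call for a more careful trend test than a fixed slope cutoff.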
Yang Haixia, Gao Baojun, Sun Hanlin. Extracting Topics of Computer Science Literature with LDA Model[J]. New Technology of Library and Information Service, 2016, 32(11): 20-26.