[Objective] This study extracts richer semantic information from scientific and technical literature in order to identify emerging trends in the documents of funded projects. [Methods] We propose a trend detection method that combines the Dynamic Topic Model (DTM) with text feature analysis. We first estimate the topic probability distribution of the funded projects, then construct an emerging-topic detection formula based on the text features of those projects, and finally apply the method to NSF-funded projects in the graphene field. [Results] The proposed method identified emerging trends among the funded projects and provided information to support technology innovation. [Limitations] The funded-project documents were examined only in terms of funding amount, funding duration, and funded topic. [Conclusions] The proposed method can effectively identify emerging trends in funded projects.
Xu Lulu, Wang Xiaoyue, Bai Rujiang, Zhou Yanting. Detecting Emerging Trends of Funds Based on DTM Model and Text Analytics: Case Study of NSF Graphene Field[J]. Data Analysis and Knowledge Discovery, 2018, 2(3): 87-97.
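The abstract does not state the detection formula itself, so the following is only a minimal sketch of the general idea, under stated assumptions: each project already has a topic probability vector (for example from a DTM or any other topic model), each project's contribution is weighted by funding amount times funding duration, and a topic is treated as "emerging" when its weighted share rises over the years. The field names (`year`, `amount`, `years`, `topic_probs`), the weighting scheme, and the slope-based ranking are illustrative assumptions, not the authors' formula.

```python
from collections import defaultdict


def weighted_topic_share(projects, num_topics):
    """Aggregate per-project topic probabilities into per-year topic shares.

    Assumption: each project is weighted by funding amount * duration,
    and every year has a positive total weight.
    """
    totals = defaultdict(lambda: [0.0] * num_topics)
    weights = defaultdict(float)
    for p in projects:
        w = p["amount"] * p["years"]
        weights[p["year"]] += w
        for k, prob in enumerate(p["topic_probs"]):
            totals[p["year"]][k] += w * prob
    return {
        year: [v / weights[year] for v in totals[year]]
        for year in sorted(totals)
    }


def trend_slope(series):
    """Least-squares slope of a series against its index 0, 1, 2, ..."""
    n = len(series)
    mean_x = (n - 1) / 2
    mean_y = sum(series) / n
    num = sum((i - mean_x) * (y - mean_y) for i, y in enumerate(series))
    den = sum((i - mean_x) ** 2 for i in range(n))
    return num / den if den else 0.0


def rank_emerging(projects, num_topics):
    """Rank topic indices by how fast their weighted share grows."""
    shares = weighted_topic_share(projects, num_topics)
    years = list(shares)  # already sorted by year
    slopes = [
        trend_slope([shares[y][k] for y in years]) for k in range(num_topics)
    ]
    ranking = sorted(range(num_topics), key=lambda k: slopes[k], reverse=True)
    return ranking, slopes


# Toy data (hypothetical): two topics, one project per year.
toy_projects = [
    {"year": 2014, "amount": 100.0, "years": 3, "topic_probs": [0.8, 0.2]},
    {"year": 2015, "amount": 100.0, "years": 3, "topic_probs": [0.5, 0.5]},
    {"year": 2016, "amount": 100.0, "years": 3, "topic_probs": [0.2, 0.8]},
]
ranking, slopes = rank_emerging(toy_projects, num_topics=2)
print(ranking, slopes)
```

In this toy input the second topic's share grows each year, so it ranks first. A real pipeline would replace the hand-written `topic_probs` with the output of a fitted topic model and would likely need smoothing for years with few projects.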