Data Analysis and Knowledge Discovery  2018, Vol. 2 Issue (3): 87-97    DOI: 10.11925/infotech.2096-3467.2017.1085
Detecting Emerging Trends of Funds Based on DTM Model and Text Analytics: Case Study of NSF Graphene Field
Xu Lulu, Wang Xiaoyue, Bai Rujiang, Zhou Yanting
Institute of Scientific & Technical Information, Shandong University of Technology, Zibo 255049, China
Abstract  

[Objective] This study aims to extract more semantic information from science and technology literature in order to identify emerging trends from fund project documents. [Methods] First, we proposed a new trend detection method based on the DTM model and text analytics. Then, we identified the topic probability distribution of the fund projects and constructed a new topic detection formula based on their text features. Finally, we detected emerging trends in NSF-funded graphene research. [Results] The proposed method identified emerging trends of fund projects and provided information for technology innovation. [Limitations] We examined the fund project documents only from the perspectives of funding amount, funding duration, and topic. [Conclusions] The proposed method could effectively identify emerging trends of fund projects.

Key words: Emerging Trend; Fund Project; DTM Model; Characteristic Analysis; Detection Formula
Received: 01 November 2017      Published: 03 April 2018
ZTFLH:  G250  

Cite this article:

Xu Lulu,Wang Xiaoyue,Bai Rujiang,Zhou Yanting. Detecting Emerging Trends of Funds Based on DTM Model and Text Analytics: Case Study of NSF Graphene Field. Data Analysis and Knowledge Discovery, 2018, 2(3): 87-97.

URL:

http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.2096-3467.2017.1085     OR     http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2018/V2/I3/87

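| Text feature | Content | Analysis | Designed indicator | How it is obtained |
|---|---|---|---|---|
| Title; Abstract | Project title; abstract text | The textual information of the funded project. It mainly reveals topical information such as the project's methods, process, and framework, and can be obtained through topic detection model analysis. | Topic intensity | Topics in the text content can be detected with the DTM model |
| StartDate; EndDate | Approval date of the project; completion date of the project | The approval and completion dates of the funded project, i.e., the project's execution period. The longer the funding period, the more important the project. | Funding duration | Computed from the project's start and end dates |
| Awarded Amount | Funding amount of the project | The funding amount, which covers researchers' research expenditures. The larger the amount, the greater the research significance and value that the funding agency's review experts attribute to the topic. | Funding intensity | Obtained by analyzing the awarded-amount indicator of the funded projects |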
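| Topic | Period t | Period t+1 | Topic characteristics | Topic category |
|---|---|---|---|---|
| Topic 1 | Below the average topic level | Above the average topic level | Research interest in the topic grows rapidly and trends upward; the funding amount rises markedly, and the number of research institutions and outputs increases | Emerging topic |
| Topic 2 | Above the average topic level | Above the average topic level | Research interest stays high; funding amounts and research institutions are numerous, and funding durations are long | Hot topic |
| Topic 3 | Above the average topic level | Below the average topic level | The topic developed well early on, with large funding amounts and many research institutions, but it is gradually declining and shows signs of aging | Declining topic |
| Topic 4 | Below the average topic level | Below the average topic level | The research level stays below the average, but the funding amount has risen somewhat, outputs are gradually accumulating, and the topic has great development potential | Potential emerging topic |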
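The four-way classification in the table above can be read as a simple rule on whether a topic sits above or below the average topic level in two consecutive periods. The following is a minimal sketch of that rule. The function name `classify_topic`, the use of one numeric score per period, and the treatment of a score equal to the average as "not above" are assumptions made for illustration; they are not taken from the article.

```python
def classify_topic(score_t, score_t1, avg_t, avg_t1):
    """Classify a topic from its level in period t and period t+1.

    Each score is compared with the average topic level of the same period.
    A score equal to the average is treated as "not above"; the source table
    does not say how ties are handled, so this is an assumption.
    """
    above_t = score_t > avg_t
    above_t1 = score_t1 > avg_t1
    if not above_t and above_t1:
        return "emerging topic"
    if above_t and above_t1:
        return "hot topic"
    if above_t and not above_t1:
        return "declining topic"
    return "potential emerging topic"


# Hypothetical scores, with the average level set to 1.0 in both periods.
print(classify_topic(0.7, 1.2, 1.0, 1.0))
```

Only the above/below comparison is modelled here; the qualitative characteristics in the table (funding growth, number of institutions, output) would need additional indicators.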
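| Topic type | Stage | Topic characteristics | Indicator characteristics |
|---|---|---|---|
| Emerging trend | Potential stage | Research interest rises markedly; few papers, few citations, and relatively few funded projects; the development trend is clear | Feature elements such as funding amount and duration are below the average across topics in the same period |
| Emerging trend | Breakthrough stage | Research interest rises slightly and levels off; more papers and more funded projects; theoretically foundational papers appear; the development trend slows | Feature elements such as funding amount and duration are above the average across topics in the same period |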
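| Directorate | Organization | Number of projects |
|---|---|---|
| Engineering | Electrical, Communications and Cyber Systems | 101 |
| Engineering | Civil, Mechanical and Manufacturing Innovation | 94 |
| Engineering | Industrial Innovation and Partnerships | 42 |
| Engineering | Emerging Frontiers Office | 8 |
| Engineering | Engineering Education and Centers | 13 |
| Computer & Information Science & Engineering | Computing and Communication Foundations | 9 |
| Computer & Information Science & Engineering | Computer and Network Systems | 7 |
| Computer & Information Science & Engineering | Advanced Infrastructure | 3 |
| Mathematical & Physical Sciences | Materials Research | 216 |
| Mathematical & Physical Sciences | Chemistry | 45 |
| Mathematical & Physical Sciences | Physics | 14 |
| Mathematical & Physical Sciences | Astronomical Sciences | 1 |
| Mathematical & Physical Sciences | Mathematical Sciences | 12 |
| Biological Sciences | Biological Infrastructure | 5 |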
| Award Number | Start Date | State | Award Instrument | End Date | Awarded Amount | Topic |
|---|---|---|---|---|---|---|
| 0747684 | 02/01/2008 | MN | Standard Grant | 01/31/2014 | $423,486.00 | Topic1 |
| 0748910 | 02/01/2008 | CA | Continuing Grant | 07/31/2013 | $511,207.00 | Topic8 |
| 0756958 | 08/01/2008 | UT | Continuing Grant | 07/31/2011 | $75,000.00 | Topic2 |
| 0802216 | 07/01/2008 | AZ | Standard Grant | 09/30/2011 | $315,243.00 | Topic1 |
| 0805220 | 06/15/2008 | OH | Continuing Grant | 05/31/2011 | $422,811.00 | Topic5 |
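The funding duration indicator is described as being computed from each project's start and end dates, and the funding intensity from the awarded amount. A minimal sketch of both steps on two of the sample rows above might look as follows. The helper names `funding_duration_days` and `parse_amount` are hypothetical, and measuring duration in days is an assumption, since the article does not state the unit it uses.

```python
from datetime import datetime

# Two sample rows copied from the table above:
# (award number, start date, end date, awarded amount)
awards = [
    ("0747684", "02/01/2008", "01/31/2014", "$423,486.00"),
    ("0756958", "08/01/2008", "07/31/2011", "$75,000.00"),
]


def funding_duration_days(start, end):
    """Days between start and end dates given as MM/DD/YYYY strings."""
    fmt = "%m/%d/%Y"
    return (datetime.strptime(end, fmt) - datetime.strptime(start, fmt)).days


def parse_amount(text):
    """Convert a string such as '$423,486.00' to a float."""
    return float(text.replace("$", "").replace(",", ""))


for number, start, end, amount in awards:
    print(number, funding_duration_days(start, end), parse_amount(amount))
```

Grouping these per-project values by the Topic column and by year would then give per-topic duration and amount figures to compare against the average across topics.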
Topic 2008 2009 2010 2011 2012 2013 2014 2015 2016
Topic0 0.536806 0.385843 0.701674 0.660604 1.065147 1.033578 1.946747 1.308421
Topic1 0.679784 0.818186 0.658238 1.208104 1.047232 1.070715 1.442418 1.041759 1.504981
Topic2 1.121281 0.818186 0.801484 0.699304 0.811001 1.070715 0.96507 1.182918 0.58696
Topic3 0.916347 0.979223 1.070715 1.033578 1.845305 1.218901
Topic4 0.658238 0.787752 0.84818 0.903286 0.979223 0.94795 0.819842
Topic5 0.679784 2.794199 1.407899 0.913776 0.653303 0.84818 1.442418 0.970248 0.952771
Topic6 1.373507 2.794199 1.201312 1.308421 0.750168 0.84818 0.710484 0.755196 0.674185
Topic7 0.916347 0.653303 0.703973 1.033578 1.493767 1.208104
Topic8 1.373507 2.794199 1.201312 1.308421 1.845305 2.767026 1.507944 1.041759 2.190124
Topic9 1.121281 1.116123 1.201312 1.208104 1.918824 0.84818 0.710484 0.755196 0.548143
Topic 2008 2009 2010 2011 2012 2013 2014 2015 2016
Topic0 -0.46319 -0.61416 -0.29833 -0.3394 0.06515 0.03358 0.94675 0.30842
Topic1 -0.32022 -0.18181 -0.34176 0.2081 0.04723 0.07071 0.44242 0.04176 0.50498
Topic2 0.121281 -0.18181 -0.19852 -0.3007 -0.189 0.07071 -0.03493 0.18292 -0.41304
Topic3 -0.08365 -0.02078 0.07071 0.03358 0.8453 0.2189
Topic4 -0.34176 -0.21225 -0.15182 -0.09671 -0.02078 -0.05205 -0.18016
Topic5 -0.32022 1.7942 0.4079 -0.08622 -0.3467 -0.15182 0.44242 -0.02975 -0.04723
Topic6 0.37351 1.7942 0.20131 0.30842 -0.24983 -0.15182 -0.28952 -0.2448 -0.32582
Topic7 -0.08365 -0.3467 -0.29603 0.03358 0.49377 0.2081
Topic8 0.37351 1.7942 0.20131 0.30842 0.8453 1.76703 0.50794 0.04176 1.19012
Topic9 0.12128 0.11612 0.20131 0.2081 0.91882 -0.15182 -0.28952 -0.2448 -0.45186
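The values in the second table appear to equal the corresponding values in the first table minus 1 (for example, 0.679784 becomes -0.32022 for Topic1 in 2008), which is consistent with measuring each topic against an average level normalized to 1. The sketch below reproduces that step for Topic1 and flags the years in which the deviation turns from negative to positive, which corresponds to the below-average to above-average pattern of an emerging topic. The baseline of 1 and the helper names are inferred from the numbers, not stated in the source.

```python
# Topic1 values for 2008-2016, copied from the first table above.
years = list(range(2008, 2017))
topic1 = [0.679784, 0.818186, 0.658238, 1.208104, 1.047232,
          1.070715, 1.442418, 1.041759, 1.504981]

BASELINE = 1.0  # assumed average topic level (inferred, not stated in the source)


def deviations(values, baseline=BASELINE):
    """Return each value's deviation from the baseline."""
    return [v - baseline for v in values]


def crossings_upward(years, devs):
    """Years t+1 in which the deviation turns from negative to positive."""
    return [years[i + 1] for i in range(len(devs) - 1)
            if devs[i] < 0 and devs[i + 1] > 0]


devs = deviations(topic1)
for year, d in zip(years, devs):
    print(year, round(d, 5))
print("below-to-above transitions:", crossings_upward(years, devs))
```

Rows with fewer than nine values cannot be processed this way without knowing which years are missing, so the sketch uses a complete row only.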
Topic    Weighted topic-word list
Topic0: optical    optical(53)|graphene(21)|radiation(20)|light(18)|investigate(16)|terahertz(14)|infrared(14)|electrical(13)|nanomanufacturing(12)|layer(12)
Topic1: detection    detection(36)|water(25)|sensor(21)|system(20)|lead(17)|sensors(14)|based(14)|physics(13)|electrons(11)
Topic3: energy    energy(28)|material(26)|chemistry(23)|program(19)|infrared(19)|project(18)|faculty(14)|chemical(13)|stem(13)|precursor(11)
Topic7: chemistry    chemistry(23)|graphene(19)|nanoscale(19)|electrons(18)|properties(17)|separation(17)|nano(16)|surface(16)|storage(15)|energies(14)