Data Analysis and Knowledge Discovery  2018, Vol. 2 Issue (8): 1-9    DOI: 10.11925/infotech.2096-3467.2018.0251
Visualizing Appropriation of Research Funding with t-SNE Algorithm
Ting Chen1,2,3, Guopeng Li3, Xiaomei Wang3
1National Science Library, Chinese Academy of Sciences, Beijing 100190, China
2University of Chinese Academy of Sciences, Beijing 100049, China
3Institutes of Science and Development, Chinese Academy of Sciences, Beijing 100190, China
Abstract  

[Objective] This paper designs a method for visualizing the appropriation of research funding, aiming to present the distribution of funded projects more effectively. [Methods] First, we retrieved 4,669 funded projects from NSF's Information and Intelligent Systems (IIS) division. Second, we added topic tags to these projects using a clustering algorithm and human interpretation. Third, we extracted high-dimensional text features from the application documents with the TF-IDF and LSA models. Fourth, we used the t-SNE algorithm to project the high-dimensional features into two- or three-dimensional space for visualization. Finally, we checked the visualization results against the pre-classified topic labels. [Results] The proposed method produced maps of the funded projects in both two- and three-dimensional space. [Limitations] The algorithm parameters must be adjusted manually, and the method still needs to be evaluated on documents of projects funded by other agencies. [Conclusions] The proposed method can generate maps of funded projects, which makes it a helpful tool for science management.
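
The feature-extraction and projection steps described in [Methods] can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the use of scikit-learn, the toy `docs` list, the helper name `embed`, and all parameter values (`lsa_dims`, `perplexity`, `seed`) are assumptions made for the example.

```python
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.manifold import TSNE

# Hypothetical stand-ins for project documents (not data from the paper).
docs = [
    "robot motion planning and control for autonomous vehicles",
    "robot perception and grasping with deep learning",
    "autonomous robot navigation in unknown environments",
    "database query optimization for large scale data",
    "distributed database systems and transaction processing",
    "data management and indexing for scientific databases",
    "natural language parsing and machine translation",
    "speech recognition and natural language understanding",
    "language models for text summarization",
]


def embed(texts, lsa_dims=5, out_dims=2, perplexity=3.0, seed=0):
    """TF-IDF -> LSA (truncated SVD) -> t-SNE; returns an (n_docs, out_dims) array."""
    # Sparse, high-dimensional TF-IDF term weights.
    tfidf = TfidfVectorizer(stop_words="english").fit_transform(texts)
    # LSA: truncated SVD of the TF-IDF matrix gives dense latent features.
    lsa = TruncatedSVD(n_components=lsa_dims, random_state=seed).fit_transform(tfidf)
    # t-SNE projects the LSA features to 2 or 3 dimensions for plotting.
    # Perplexity must be smaller than the number of documents and is
    # typically tuned by hand.
    tsne = TSNE(n_components=out_dims, perplexity=perplexity,
                init="random", random_state=seed)
    return tsne.fit_transform(lsa)


coords_2d = embed(docs, out_dims=2)  # one (x, y) point per project
coords_3d = embed(docs, out_dims=3)  # one (x, y, z) point per project
```

Plotting `coords_2d` as a scatter plot and coloring each point by its topic tag is one way to check visually whether projects with the same label end up close together.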

Key words: Research Awards; Funding Map; LSA; t-SNE; Visualization
Received: 07 March 2018      Published: 08 September 2018

Cite this article:

Ting Chen, Guopeng Li, Xiaomei Wang. Visualizing Appropriation of Research Funding with t-SNE Algorithm. Data Analysis and Knowledge Discovery, 2018, 2(8): 1-9.

URL:

http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.2096-3467.2018.0251     OR     http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2018/V2/I8/1
