Data Analysis and Knowledge Discovery  2018, Vol. 2 Issue (3): 98-106    DOI: 10.11925/infotech.2096-3467.2017.1058
Visualizing Document Correlation Based on LDA Model
Li Wang, Lixue Zou, Xiwen Liu
National Science Library, Chinese Academy of Sciences, Beijing 100190, China
University of Chinese Academy of Sciences, Beijing 100049, China
[Objective] This paper constructs a machine-learning-based data analysis model for identifying topics in scientific research. [Methods] First, we clustered the documents with the Latent Dirichlet Allocation (LDA) model. Then, we used Python modules to examine the correlations among publication year, institution, and research type. Finally, we revealed and visualized the key research areas for each year or institution. [Results] We analyzed 101,813 papers and patents on graphene industry research. The proposed method completed topic identification, correlation analysis, and visualization in about two minutes. [Limitations] More research is needed to explore network analysis issues. [Conclusions] Machine learning offers enormous potential for intelligence studies, especially for large-volume text analytics and visualization.

Keywords: LDA Model; Data Analysis; Machine Learning; Python; Data Visualization
Received: 24 October 2017      Published: 03 April 2018

Li Wang, Lixue Zou, Xiwen Liu. Visualizing Document Correlation Based on LDA Model. Data Analysis and Knowledge Discovery, 2018, 2(3): 98-106.
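
The pipeline summarized in the abstract (LDA topic assignment, then correlation of topics with a metadata field such as year) might look roughly like the sketch below. The abstract does not name the Python modules the authors used, so this is a minimal standard-library illustration and not their implementation: the collapsed Gibbs sampler, the function names `fit_lda` and `topic_by_group`, the hyperparameters, and the toy records are all assumptions made for the example.

```python
import random
from collections import Counter, defaultdict


def fit_lda(docs, n_topics, n_iter=100, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA (illustrative, not the authors' code).

    docs is a list of token lists. Returns one list of topic
    proportions per document.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    n_words = len(vocab)
    doc_topic = [[0] * n_topics for _ in docs]            # tokens of doc d in topic k
    topic_word = [[0] * n_words for _ in range(n_topics)]  # word w assigned to topic k
    topic_total = [0] * n_topics                          # all tokens in topic k
    assignments = []

    # Random initial topic for every token.
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            k = rng.randrange(n_topics)
            z_d.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assignments.append(z_d)

    # Resample each token's topic from its conditional distribution.
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                wi = word_id[w]
                k = assignments[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][wi] -= 1
                topic_total[k] -= 1
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][wi] + beta)
                    / (topic_total[t] + n_words * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][wi] += 1
                topic_total[k] += 1

    # Smoothed per-document topic proportions.
    theta = []
    for d, doc in enumerate(docs):
        denom = len(doc) + n_topics * alpha
        theta.append([(doc_topic[d][t] + alpha) / denom for t in range(n_topics)])
    return theta


def topic_by_group(theta, groups):
    """Count documents per (group, dominant topic), e.g. group = year."""
    table = defaultdict(Counter)
    for proportions, group in zip(theta, groups):
        dominant = max(range(len(proportions)), key=proportions.__getitem__)
        table[group][dominant] += 1
    return {g: dict(c) for g, c in table.items()}


# Hypothetical toy records standing in for tokenized papers and patents.
docs = [
    ["graphene", "battery", "electrode", "battery"],
    ["graphene", "electrode", "capacitor"],
    ["graphene", "oxide", "membrane", "water"],
    ["membrane", "water", "filtration", "oxide"],
]
years = [2015, 2015, 2016, 2016]

theta = fit_lda(docs, n_topics=2)
table = topic_by_group(theta, years)
for year in sorted(table):
    print(year, table[year])
```

The resulting year-by-topic counts are the kind of table that could then be passed to a plotting library, for example as a heatmap, to show which topics dominate in each year. The same cross-tabulation applies to institution or document type by swapping the `groups` argument.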

