New Technology of Library and Information Service  2014, Vol. 30 Issue (7): 41-47    DOI: 10.11925/infotech.1003-3513.2014.07.06
Study on Keyword Extraction with LDA and TextRank Combination
Gu Yijun1, Xia Tian2,3
1. School of Cyber Security, People's Public Security University of China, Beijing 100038, China;
2. Key Laboratory of Data Engineering and Knowledge Engineering, MOE, Renmin University of China, Beijing 100872, China;
3. School of Information Resource Management, Renmin University of China, Beijing 100872, China
[Objective] To extract keywords by combining the internal structural information of a single document with the topic information shared across documents. [Methods] LDA is used to model topics and to calculate the topic influence of candidate keywords. The TextRank algorithm is then improved so that the importance of candidate words is transferred unevenly according to topic influence and word adjacency relations. A probability transition matrix is built from these weights and used in the iterative calculation that extracts the keywords. [Results] LDA and TextRank are combined effectively, and keyword extraction results improve significantly when the data set exhibits a strong topic distribution. [Limitations] The combined method requires costly multi-document topic analysis. [Conclusions] A document's keywords are associated with both the document itself and the collection of related documents; combining these two aspects is an effective way to improve keyword extraction.
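
The [Methods] step can be illustrated with a minimal sketch. It is not the authors' exact formulation: the mixing weight `alpha`, the co-occurrence `window`, the damping factor, and the function names are all assumptions made for illustration, and the `influence` scores are supplied by hand where the paper would derive them from an LDA model. The sketch builds a word adjacency graph, turns it into a probability transition matrix in which each word passes more of its score to neighbors with higher topic influence, and iterates as in TextRank.

```python
from collections import defaultdict


def build_graph(tokens, window=2):
    """Undirected co-occurrence graph over candidate words.

    Two distinct words are adjacent if they fall inside the same sliding
    window of `window` consecutive tokens (an assumed setting).
    """
    neighbors = defaultdict(set)
    for i, w in enumerate(tokens):
        for j in range(i + 1, min(i + window, len(tokens))):
            if tokens[j] != w:
                neighbors[w].add(tokens[j])
                neighbors[tokens[j]].add(w)
    return neighbors


def transition_matrix(neighbors, influence, alpha=0.5):
    """Row-stochastic matrix P[u][v]: share of u's score passed to v.

    Mixes a uniform share (plain TextRank) with a share proportional to
    the topic influence of v. `alpha` is a hypothetical mixing weight.
    """
    P = {}
    for u, nbrs in neighbors.items():
        total = sum(influence.get(v, 0.0) for v in nbrs)
        uniform = 1.0 / len(nbrs)
        P[u] = {}
        for v in nbrs:
            # Fall back to the uniform share if no neighbor has influence.
            topical = influence.get(v, 0.0) / total if total > 0 else uniform
            P[u][v] = (1 - alpha) * uniform + alpha * topical
    return P


def rank(P, damping=0.85, iterations=50):
    """Iterate the damped random walk defined by P for a fixed number of steps."""
    words = list(P)
    score = {w: 1.0 / len(words) for w in words}
    for _ in range(iterations):
        new = {w: (1 - damping) / len(words) for w in words}
        for u in words:
            for v, p in P[u].items():
                new[v] += damping * score[u] * p
        score = new
    return score


# Toy usage: influence values are made up, standing in for LDA output.
tokens = ["keyword", "extraction", "topic", "model", "graph", "model"]
influence = {"keyword": 0.4, "extraction": 0.3, "topic": 0.2, "model": 0.1}
scores = rank(transition_matrix(build_graph(tokens), influence, alpha=0.5))
top = sorted(scores, key=scores.get, reverse=True)[:3]
```

With `alpha=0` the transfer is uniform and the sketch reduces to plain TextRank on the adjacency graph; raising `alpha` lets topic influence dominate the uneven transfer.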

Key words: Keyword extraction; LDA; TextRank; Graph model
Received: 07 February 2014      Published: 20 October 2014
CLC number: TP393

Cite this article:

Gu Yijun, Xia Tian. Study on Keyword Extraction with LDA and TextRank Combination. New Technology of Library and Information Service, 2014, 30(7): 41-47.

