New Technology of Library and Information Service  2014, Vol. 30 Issue (3): 73-79    DOI: 10.11925/infotech.1003-3513.2014.03.11
Research on Keyphrase Extraction from Scholarly Article Outline
He Yuanbiao1,2, Le Xiaoqiu1, Zhang Fan1,2
1 National Science Library, Chinese Academy of Sciences, Beijing 100190, China;
2 University of Chinese Academy of Sciences, Beijing 100049, China
[Objective] Scholarly article outlines are succinct and hierarchical. This paper proposes a method for extracting important and meaningful phrases from such outlines. [Methods] Candidate phrases are first identified by combining linguistic rules with terminology dictionaries. A tf-idf value is then calculated based on the syntactic dependencies between phrases, and a hierarchical feature is quantified from the hierarchical structure of the outline. Finally, the tf-idf value and the hierarchical feature are combined to rank the candidate phrases and select the keyphrases. [Results] Experiments show that the F-score of candidate phrase identification reaches 89.57%, and the F-score of candidate phrase selection reaches 36.89%. [Limitations] The phrase extraction rules are incomplete, and the weights used in the tf-idf calculation are set empirically, so the results are not optimal. [Conclusions] The method can effectively extract keyphrases from outlines and is suitable for keyphrase extraction from hierarchically structured text.
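
The ranking step described under [Methods] can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no formulas, so the sketch assumes plain smoothed tf-idf (without the dependency-based weighting the paper uses), a hypothetical hierarchical feature equal to 1/level of the shallowest heading containing the phrase, and a linear combination controlled by an assumed weight `alpha`. The function name, input layout, and sample data are all illustrative.

```python
import math
from collections import Counter


def rank_keyphrases(outline, doc_freq, n_docs, alpha=0.5, top_k=3):
    """Rank candidate phrases from an outline (illustrative sketch only).

    outline  : list of (level, [candidate phrases]); level 1 is the top heading.
    doc_freq : assumed background document frequency of each phrase.
    n_docs   : assumed size of the background collection.
    alpha    : assumed mixing weight between tf-idf and the hierarchical feature.
    """
    counts = Counter(p for _, phrases in outline for p in phrases)
    if not counts:
        return []
    total = sum(counts.values())

    # Plain smoothed tf-idf; the paper instead weights tf-idf by syntactic
    # dependencies between phrases, which is not reproduced here.
    tfidf = {
        p: (c / total) * (math.log((1 + n_docs) / (1 + doc_freq.get(p, 0))) + 1)
        for p, c in counts.items()
    }
    max_tfidf = max(tfidf.values())

    # Hypothetical hierarchical feature: phrases in shallower headings score higher.
    hier = {}
    for level, phrases in outline:
        for p in phrases:
            hier[p] = max(hier.get(p, 0.0), 1.0 / level)

    # Assumed linear combination of the normalized tf-idf and the hierarchy score.
    scores = {
        p: alpha * tfidf[p] / max_tfidf + (1 - alpha) * hier[p] for p in counts
    }
    return sorted(scores, key=lambda p: (-scores[p], p))[:top_k]


# Made-up outline and background frequencies, for illustration only.
outline = [
    (1, ["keyphrase extraction"]),
    (2, ["candidate phrase", "keyphrase extraction"]),
    (3, ["experiment setup"]),
]
doc_freq = {"keyphrase extraction": 2, "candidate phrase": 5, "experiment setup": 40}
ranking = rank_keyphrases(outline, doc_freq, n_docs=100)
print(ranking)
```

In this toy input, a phrase that appears in a top-level heading and is rare in the background collection is ranked first, which matches the intuition behind combining the two features.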

Key words: Candidate phrases identification; Candidate phrases selection; Syntactic dependencies; Hierarchical feature
Received: 26 September 2013      Published: 15 April 2014
He Yuanbiao, Le Xiaoqiu, Zhang Fan. Research on Keyphrase Extraction from Scholarly Article Outline. New Technology of Library and Information Service, 2014, 30(3): 73-79.

