New Technology of Library and Information Service  2016, Vol. 32 Issue (3): 67-73    DOI: 10.11925/infotech.1003-3513.2016.03.09
A Scientific Research Object Labeling System Based on Active Learning
He Huixin, Liu Lijuan
Tongfang Knowledge Network Technology Co., Ltd. (Beijing), Beijing 100192, China
[Objective] This study aims to identify instances of the research object attribute in paper titles, maximizing recognition accuracy with only a limited number of labeled samples. [Methods] First, we analyzed the grammatical features of scientific research objects using a conditional random field (CRF) sequence labeling algorithm. Second, we recognized and extracted research objects using a small number of labeled samples. Finally, we introduced an active learning iterative labeling system that draws on unlabeled data to improve recognition accuracy. [Results] The proposed method made efficient use of the unlabeled data and raised the accuracy of research object recognition to 78.3%. [Limitations] The algorithm needs further optimization to improve its efficiency. [Conclusions] The proposed method performed well at identifying research object attributes, which lays the foundation for further mining the knowledge system and structure of science and technology literature.
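
The iterative labeling loop described under [Methods] can be illustrated with a minimal sketch. The abstract does not state the sample selection criterion or the CRF feature set, so everything below is an assumption made for illustration: a toy frequency-based tagger stands in for the CRF, least-confidence sampling is assumed as the query strategy, and the names `train`, `predict`, `active_learning`, and `oracle` are hypothetical.

```python
from collections import Counter, defaultdict


def train(labeled):
    """Count how often each token carries each tag (toy stand-in for CRF training)."""
    counts = defaultdict(Counter)
    for tokens, tags in labeled:
        for tok, tag in zip(tokens, tags):
            counts[tok][tag] += 1
    return counts


def predict(model, tokens):
    """Return (tags, confidence); confidence is the lowest per-token probability."""
    tags, confidence = [], 1.0
    for tok in tokens:
        seen = model.get(tok)
        if not seen:
            # Unseen token: default to the outside tag and treat it as a coin flip.
            tags.append("O")
            confidence = min(confidence, 0.5)
            continue
        tag, n = seen.most_common(1)[0]
        tags.append(tag)
        confidence = min(confidence, n / sum(seen.values()))
    return tags, confidence


def active_learning(labeled, pool, oracle, rounds, batch_size):
    """Assumed loop: retrain, query the least confident titles, add their labels."""
    labeled, pool = list(labeled), list(pool)
    for _ in range(rounds):
        if not pool:
            break
        model = train(labeled)
        # Least-confidence sampling: lowest-confidence titles come first.
        pool.sort(key=lambda toks: predict(model, toks)[1])
        queried, pool = pool[:batch_size], pool[batch_size:]
        # The oracle (a human annotator in practice) supplies the gold tags.
        labeled.extend((toks, oracle(toks)) for toks in queried)
    return train(labeled), labeled
```

In this sketch each round spends the annotation budget on the titles the current model is least sure about, which is one common way an active learning system makes use of unlabeled data; a real system would replace `train` and `predict` with a CRF and its marginal probabilities.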

Key words: Scientific literature; Research objects; Conditional Random Fields; Iterative labeling system; Active learning
Received: 13 October 2015      Published: 12 April 2016

He Huixin, Liu Lijuan. A Scientific Research Object Labeling System Based on Active Learning. New Technology of Library and Information Service, 2016, 32(3): 67-73.

