New Technology of Library and Information Service  2014, Vol. 30 Issue (7): 34-40    DOI: 10.11925/infotech.1003-3513.2014.07.05
Application of Machine Learning with Limited Corpus to Identify Structure of Scientific Abstracts Automatically
Bai Guangzu1,3, He Yuanbiao2,3, Ma Jianxia1, Liu Jianhuaz2,3, Zou Yimin4
1. Lanzhou Library, Chinese Academy of Sciences, Lanzhou 730000, China;
2. National Science Library, Chinese Academy of Sciences, Beijing 100190, China;
3. University of Chinese Academy of Sciences, Beijing 100049, China;
4. College of Economics & Management, Zhejiang Normal University, Jinhua 321004, China
[Objective] This study aims to identify the structural components of scientific abstracts automatically by classifying abstract sentences using machine learning with a limited number of training samples. [Methods] The paper designs a variety of text features to represent the sentences of scientific abstracts, extracts these features with natural language processing techniques, uses them to train a Naive Bayes model and Support Vector Machines, and then applies the trained models to identify abstract structure automatically. [Results] Experiments show that the method achieves comparable or even better recognition accuracy than previous methods while using a smaller training corpus. [Limitations] Sentences labeled "METHOD" lack distinctive feature words and core verbs, which results in lower recognition accuracy for this class. [Conclusions] The method is an effective approach to automatically recognizing the structure of academic abstracts with a limited corpus.
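
The pipeline described under [Methods] (represent each sentence by text features, then train a classifier on labeled sentences) can be illustrated with a minimal sketch. It is not the authors' implementation: the two features used here (bag-of-words tokens and a coarse sentence-position bucket), every label name other than "METHOD", and the toy training sentences are assumptions made for illustration. Only a small multinomial Naive Bayes classifier is shown; the Support Vector Machine part is omitted.

```python
import math
import re
from collections import Counter, defaultdict


def sentence_features(sentence, index, total):
    """Return word tokens plus a coarse position token (hypothetical features)."""
    tokens = re.findall(r"[a-z]+", sentence.lower())
    # Position bucket: 0 = start, 1 = middle, 2 = end of the abstract.
    bucket = min(2, (3 * index) // max(total, 1))
    return tokens + [f"__pos{bucket}__"]


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, feature_lists, labels):
        self.class_counts = Counter(labels)
        self.feature_counts = defaultdict(Counter)
        self.vocab = set()
        for feats, label in zip(feature_lists, labels):
            self.feature_counts[label].update(feats)
            self.vocab.update(feats)
        return self

    def predict(self, feats):
        n = sum(self.class_counts.values())
        v = len(self.vocab)
        best_label, best_score = None, -math.inf
        for label, count in self.class_counts.items():
            total = sum(self.feature_counts[label].values())
            score = math.log(count / n)  # log prior
            for f in feats:
                if f in self.vocab:  # features never seen in training are ignored
                    score += math.log((self.feature_counts[label][f] + 1) / (total + v))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy labeled abstracts, invented for this sketch (not data from the paper).
abstracts = [
    [("This study aims to identify the structure of abstracts.", "OBJECTIVE"),
     ("We extract text features and train a classifier.", "METHOD"),
     ("Experiments show that accuracy improves.", "RESULT"),
     ("The approach is effective for small corpora.", "CONCLUSION")],
    [("The paper aims to classify abstract sentences.", "OBJECTIVE"),
     ("We design features and train a model.", "METHOD"),
     ("Results show higher accuracy than baselines.", "RESULT"),
     ("We conclude that the method is effective.", "CONCLUSION")],
]

train_x, train_y = [], []
for abstract in abstracts:
    for i, (text, label) in enumerate(abstract):
        train_x.append(sentence_features(text, i, len(abstract)))
        train_y.append(label)

model = NaiveBayes().fit(train_x, train_y)

# Classify the first sentence of an unseen four-sentence abstract.
print(model.predict(sentence_features("This work aims to label sentences.", 0, 4)))
```

With so little training data the position token and a handful of cue words carry most of the decision, which is consistent with the limitation noted above: a class without distinctive cue words is harder to separate.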

Key words: Science abstract; Structure identifying; Machine learning
Received: 08 October 2013      Published: 20 October 2014
CLC number: G356.7

Cite this article:

Bai Guangzu, He Yuanbiao, Ma Jianxia, Liu Jianhuaz, Zou Yimin. Application of Machine Learning with Limited Corpus to Identify Structure of Scientific Abstracts Automatically. New Technology of Library and Information Service, 2014, 30(7): 34-40.

