New Technology of Library and Information Service  2013, Vol. 29 Issue (9): 54-59    DOI: 10.11925/infotech.1003-3513.2013.09.09
An Automatic Term Extraction System of Improved C-value Based on Effective Word Frequency
Xiong Liyan, Tan Long, Zhong Maosheng
School of Information Engineering, East China Jiaotong University, Nanchang 330013, China
Abstract  Existing methods for automatic Chinese term extraction focus on the high-frequency characteristics and unithood indicators of terms, but lack effective ways to handle low-frequency terms and termhood indicators. To address these problems, this paper introduces a background corpus into the C-value method and proposes the concepts of word field distribution degree and effective word frequency. Terms are then extracted automatically by calculating the EC-value (Effective C-value) of each candidate term, and the extraction of low-frequency terms is improved by combining this with term cluster recognition and mining. A term extraction experiment in the computer field shows that the proposed EC-value method measures the termhood of terms more effectively and improves the extraction performance for low-frequency terms.
Key words: Automatic term extraction; EC-value; Effective word frequency; Term cluster
Received: 17 June 2013      Published: 27 September 2013
CLC Number: TP391.1
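
For orientation, the baseline C-value score that the abstract says is being improved can be sketched as follows. This is a minimal sketch of the commonly cited C-value formula, not the paper's EC-value: the abstract does not define effective word frequency or word field distribution degree, so neither is implemented here. The names `is_nested` and `c_value` and the toy frequency table are assumptions made for illustration. The sketch also simplifies nesting by treating every longer candidate in the table as a container. Because the frequency table is a plain argument, a reweighted frequency could in principle be passed in place of the raw counts.

```python
import math


def is_nested(short, long):
    """True if `short` is a contiguous word subsequence of the longer `long`."""
    n, m = len(short), len(long)
    return n < m and any(long[i:i + n] == short for i in range(m - n + 1))


def c_value(freq):
    """Baseline C-value for each candidate term (illustrative, not EC-value).

    freq maps a candidate term (tuple of words) to its corpus frequency.
    A candidate nested in longer candidates has its frequency reduced by
    the mean frequency of those longer candidates.
    """
    scores = {}
    for a, f_a in freq.items():
        containers = [f_b for b, f_b in freq.items() if is_nested(a, b)]
        if containers:
            f_a = f_a - sum(containers) / len(containers)
        # log2 of the term length in words; single-word terms score 0
        scores[a] = math.log2(len(a)) * f_a
    return scores


# Hypothetical toy counts, not data from the paper.
freq = {
    ("term", "extraction"): 10,
    ("automatic", "term", "extraction"): 4,
    ("term", "extraction", "system"): 2,
}
scores = c_value(freq)
```

In this toy table, "term extraction" is nested in two longer candidates, so its frequency is discounted by their mean frequency before the length weighting is applied.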

Xiong Liyan, Tan Long, Zhong Maosheng. An Automatic Term Extraction System of Improved C-value Based on Effective Word Frequency. New Technology of Library and Information Service, 2013, 29(9): 54-59.

