New Technology of Library and Information Service  2008, Vol. 24 Issue (6): 34-40    DOI: 10.11925/infotech.1003-3513.2008.06.07
Analysis of the Factors Affecting the Performance of CRF-based Keywords Extraction Model
Zhang Chengmin1,2   Xu Xin3   Zhang Chengzhi4,5
1(Department of Information Management, Nanjing University, Nanjing 210093, China)
2(Library of China Pharmaceutical University, Nanjing 210009, China)
3(Department of Informatics, East China Normal University, Shanghai 200241, China)
4(Department of Information Management, Nanjing University of Science & Technology, Nanjing 210094, China)
5(Institute of Scientific & Technical Information of China, Beijing 100038, China)
The conditional random fields (CRF) model can exploit document features more fully and effectively. A CRF-based keyword extraction method is proposed and implemented, and the factors affecting the performance of the resulting model are analyzed. These factors are: the quality of text segmentation, the size of the training corpus, the number of features, and the parameter settings of the CRF model.
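
As a rough illustration of how keyword extraction can be cast as a sequence-labelling task for a CRF, the sketch below turns a segmented document into per-token feature dictionaries and B/I/O labels. The abstract does not list the features or the tagging scheme used, so everything here is an assumption made for illustration: the helper names `token_features` and `label_tokens`, the particular features, and the B/I/O tags are hypothetical, and the resulting `X` and `y` would still need to be passed to a CRF toolkit of the reader's choice.

```python
from typing import Dict, List, Sequence


def token_features(tokens: Sequence[str], i: int) -> Dict[str, object]:
    """Build an illustrative (assumed) feature dict for the token at position i."""
    tok = tokens[i]
    return {
        "word": tok.lower(),
        "length": len(tok),
        # Relative position of the token within the document.
        "rel_pos": round(i / len(tokens), 2),
        # Neighbouring tokens, with sentinels at the document boundaries.
        "prev": tokens[i - 1].lower() if i > 0 else "<BOS>",
        "next": tokens[i + 1].lower() if i < len(tokens) - 1 else "<EOS>",
    }


def label_tokens(tokens: Sequence[str],
                 keywords: Sequence[Sequence[str]]) -> List[str]:
    """Tag each token as B (keyword start), I (keyword continuation) or O."""
    labels = ["O"] * len(tokens)
    lowered = [t.lower() for t in tokens]
    for kw in keywords:
        kw_l = [w.lower() for w in kw]
        n = len(kw_l)
        if n == 0:
            continue
        # Mark every occurrence of the keyword's token sequence.
        for start in range(len(tokens) - n + 1):
            if lowered[start:start + n] == kw_l:
                labels[start] = "B"
                for j in range(start + 1, start + n):
                    labels[j] = "I"
    return labels


# A toy, already-segmented document and its gold keywords.
tokens = ["Keyword", "extraction", "with", "conditional", "random", "fields"]
keywords = [["keyword", "extraction"], ["conditional", "random", "fields"]]

X = [token_features(tokens, i) for i in range(len(tokens))]  # CRF inputs
y = label_tokens(tokens, keywords)                           # CRF targets
```

In this framing, the factors named in the abstract map onto concrete choices: segmentation quality determines `tokens`, corpus size determines how many `(X, y)` pairs are available, and the number of features corresponds to the entries returned by `token_features`.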

Key words: Automatic indexing; Keywords extraction; Conditional random fields; Machine learning
Received: 31 January 2008      Published: 25 June 2008



Corresponding Author: Zhang Chengmin
About authors: Zhang Chengmin, Xu Xin, Zhang Chengzhi

Cite this article:

Zhang Chengmin, Xu Xin, Zhang Chengzhi. Analysis of the Factors Affecting the Performance of CRF-based Keywords Extraction Model. New Technology of Library and Information Service, 2008, 24(6): 34-40.

