%A Wang Hao, Zou Jieli, Deng Sanhong
%T Model Construction and Experiment Analysis of Automatic Indexing for Chinese Books
%0 Journal Article
%D 2013
%J Data Analysis and Knowledge Discovery
%R 10.11925/infotech.1003-3513.2013.07-08.08
%P 55-62
%V 29
%N 7/8
%U {https://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/abstract/article_3763.shtml}
%8 2013-08-25
%X To address automatic keyword indexing for Chinese books, this paper applies the machine learning algorithm of Conditional Random Fields (CRFs). The method trains on a large body of existing, manually indexed keyword data for Chinese books to generate an annotation model that captures semantic relations and rule features among sequence entities, then uses that model for machine prediction to extract book keywords automatically. The paper mainly solves two problems. First, because the choice of CRF parameters affects indexing performance, the authors run comparative tests from several angles to identify the optimal CRF parameter set for the specific problem of keyword indexing for Chinese books. Second, the authors discuss the effect of different observed features on keyword indexing and demonstrate, through experimental analysis, four observed features that effectively improve indexing performance. Finally, the optimal keyword indexing model for Chinese books is constructed.