This paper proposes two models for person-name entity extraction, motivated by a comparison of the mathematical foundations of hidden Markov models (HMMs) and conditional random fields (CRFs): one is an HMM based on word role labels, and the other is a CRF based on character role labels. Both models are validated and compared through open tests and practical application, and the results indicate that, in practice, CRFs are better suited than HMMs to sequence labeling and object classification.
Wang Hao, Deng Sanhong. Comparative Study on HMM and CRFs Applying in Information Extraction[J]. New Technology of Library and Information Service (现代图书情报技术), 2007, 2(12): 57-63.