New Technology of Library and Information Service  2007, Vol. 2 Issue (12): 57-63    DOI: 10.11925/infotech.1003-3513.2007.12.12
Comparative Study on HMM and CRFs Applying in Information Extraction
Wang Hao, Deng Sanhong
(Department of Information Management, Nanjing University, Nanjing 210093, China)
Abstract  

Based on a comparison of the mathematical foundations of hidden Markov models (HMM) and conditional random fields (CRFs), this paper proposes two models for person-name entity extraction: one uses an HMM with word-level role labels, and the other uses CRFs with character-level role labels. Both models are validated and compared through open testing and practical application. The results indicate that, in practice, CRFs are better suited than HMM to sequence labeling and object classification.
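To make the sequence-labeling setting concrete, the following is a minimal sketch of Viterbi decoding for an HMM role tagger. It is not the authors' model: the role labels (S = surname, G = given name, O = other), the romanized tokens, and every probability are invented for illustration only. A linear-chain CRF would be decoded with the same dynamic program, but the per-step scores would come from weighted feature functions normalized over the whole sequence rather than from transition and emission probabilities.

```python
import math

# Hypothetical role labels: S = surname, G = given name, O = other.
STATES = ["S", "G", "O"]

# Toy parameters, invented purely for illustration (not taken from the paper).
START = {"S": 0.2, "G": 0.0, "O": 0.8}
TRANS = {
    "S": {"S": 0.0, "G": 0.9, "O": 0.1},
    "G": {"S": 0.1, "G": 0.3, "O": 0.6},
    "O": {"S": 0.2, "G": 0.0, "O": 0.8},
}
EMIT = {
    "S": {"wang": 0.6, "deng": 0.4},
    "G": {"hao": 0.5, "sanhong": 0.5},
    "O": {"wrote": 0.3, "and": 0.3, "hao": 0.1, "wang": 0.1, "papers": 0.2},
}


def log(p):
    """Log-probability that maps zero to -inf instead of raising."""
    return math.log(p) if p > 0 else float("-inf")


def viterbi(tokens):
    """Return the most probable role-label sequence for the given tokens."""
    if not tokens:
        return []
    # best[t][s]: highest log-probability of any label path ending in s at t.
    best = [{s: log(START[s]) + log(EMIT[s].get(tokens[0], 0.0)) for s in STATES}]
    back = []  # back[t-1][s]: best predecessor of state s at position t
    for tok in tokens[1:]:
        scores, pointers = {}, {}
        for s in STATES:
            prev = max(STATES, key=lambda p: best[-1][p] + log(TRANS[p][s]))
            scores[s] = (best[-1][prev] + log(TRANS[prev][s])
                         + log(EMIT[s].get(tok, 0.0)))
            pointers[s] = prev
        best.append(scores)
        back.append(pointers)
    # Follow the back-pointers from the best final state.
    path = [max(STATES, key=lambda s: best[-1][s])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]
```

With these toy parameters, `viterbi(["wang", "hao", "wrote", "papers"])` returns `["S", "G", "O", "O"]`, so the first two tokens would be extracted together as a person name.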

Key words: HMM; CRFs; Information extraction; Person-name entity extraction; Role label; Feature
Received: 11 October 2007      Published: 25 December 2007
CLC Number: TP311
Corresponding Author: Wang Hao, E-mail: ywhaowang810710@sina.com
About authors: Wang Hao, Deng Sanhong

Cite this article:

Wang Hao, Deng Sanhong. Comparative Study on HMM and CRFs Applying in Information Extraction. New Technology of Library and Information Service, 2007, 2(12): 57-63.

URL:

https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.1003-3513.2007.12.12     OR     https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2007/V2/I12/57

