GPE-entity Recognition Based on Conditional Random Fields
Zong Ping (1,2), Shi Shuicai (1,2), Wang Tao (1,2), Lv Xueqiang (1,2)
1(Chinese Information Processing Research Center, Beijing Information Science & Technology University, Beijing 100101, China) 2(Beijing TRS Information Technology Co. Ltd., Beijing 100101, China)
采用基于条件随机场的方法,对ACE评测的英文语料中的地理行政类型实体(Geographical Political Entities, GPE)及其子类型进行识别。提出一种从ACE语料中选取的特征集,并根据不同的特征组合对GPE识别的贡献与其它特征集进行比较,实验表明该特征集能取得较高的召回率和准确率。
This paper uses Conditional Random Fields (CRFs) to recognize Geographical Political Entities (GPE) and their subtypes in the English corpus of the Automatic Content Extraction (ACE) evaluation. A feature set selected from the ACE corpus is proposed, and the contributions of different feature combinations to GPE recognition are evaluated and compared with those of other feature sets. The experiments show that the proposed feature set achieves relatively high recall and accuracy.
宗萍,施水才,王涛,吕学强. 基于条件随机场的英文地理行政实体识别[J]. 现代图书情报技术, 2009, 3(2): 51-55.
Zong Ping, Shi Shuicai, Wang Tao, Lv Xueqiang. GPE-entity Recognition Based on Conditional Random Fields[J]. New Technology of Library and Information Service, 2009, 3(2): 51-55.