GPE-entity Recognition Based on Conditional Random Fields
Zong Ping1,2, Shi Shuicai1,2, Wang Tao1,2, Lv Xueqiang1,2
1 (Chinese Information Processing Research Center, Beijing Information Science & Technology University, Beijing 100101, China)
2 (Beijing TRS Information Technology Co., Ltd., Beijing 100101, China)
Abstract This paper detects Geographical Political Entities (GPE) and their subtypes in the English corpus of the Automatic Content Extraction (ACE) evaluation, using Conditional Random Fields (CRFs). A feature set is extracted from the ACE corpus, and the contributions of different feature subsets to GPE detection are evaluated experimentally. The results show that the proposed feature set achieves higher recall and accuracy.
Received: 18 November 2008
Published: 25 February 2009
Corresponding Authors:
Zong Ping
E-mail: zong.ping@trs.com.cn
About authors: Zong Ping, Shi Shuicai, Wang Tao, Lv Xueqiang