Recognizing Named Entity from Free-text Customer Reviews——A Maximum Entropy Model-based Approach
Yu Chuanming1,2, Huang Jianqiu2, Guo Fei2
1. School of Information Safety and Engineering, Zhongnan University of Economics and Law, Wuhan 430073, China;
2. Business School, University of Shanghai for Science and Technology, Shanghai 200093, China
Abstract: This paper introduces the concept of Named Entity Recognition (NER), analyzes its two basic approaches, rule-based and statistical, and conducts an empirical study of Chinese dish name recognition based on the Maximum Entropy Model (MEM). Six feature templates are designed according to the characteristics of Chinese named entities. Experimental results show that adding tagging features to the basic simple feature template effectively improves NER performance. In order of their effect on recognition performance, the features are as follows: tagging features, combined POS features, forward POS dependency features, and word form features.
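As a rough illustration of how a maximum entropy tagger turns active features into tag probabilities, the following minimal sketch scores each candidate tag with a log-linear model and normalises the scores. It is not the authors' implementation: the B/I/O tag set, the feature names, the weights, and the reading of "forward POS dependency" as the POS of the preceding token are all assumptions made for illustration only.

```python
import math


def maxent_probs(weights, active_features, labels):
    """Return p(label | context) under a log-linear (maximum entropy) model.

    weights: dict mapping (feature_name, label) -> float
    active_features: names of the features that fire for the current token
    labels: candidate tags
    """
    # Score each label by summing the weights of the features that fire with it.
    scores = {
        y: sum(weights.get((f, y), 0.0) for f in active_features)
        for y in labels
    }
    # Normalise with a numerically stable softmax.
    top = max(scores.values())
    exp_scores = {y: math.exp(s - top) for y, s in scores.items()}
    z = sum(exp_scores.values())
    return {y: v / z for y, v in exp_scores.items()}


# Hypothetical tag set and weights; none of these values come from the paper.
LABELS = ["B", "I", "O"]
WEIGHTS = {
    ("prev_tag=B", "I"): 1.5,           # tagging feature
    ("pos[-1]+pos[0]=n+n", "I"): 0.8,   # combined POS feature
    ("pos[-1]=n", "I"): 0.3,            # forward POS dependency (assumed meaning)
    ("word=jiding", "I"): 0.6,          # word form feature
    ("word=jiding", "O"): 0.1,
}
FEATURES = ["prev_tag=B", "pos[-1]+pos[0]=n+n", "pos[-1]=n", "word=jiding"]

probs = maxent_probs(WEIGHTS, FEATURES, LABELS)
best_tag = max(probs, key=probs.get)
print(best_tag, probs)
```

In a trained model the weights would be estimated from annotated reviews rather than set by hand; the sketch only shows how the different kinds of features can contribute to a single decision.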
余传明, 黄建秋, 郭飞. 从客户评论中识别命名实体——基于最大熵模型的实现[J]. 现代图书情报技术, 2011, 27(5): 77-82.
Yu Chuanming, Huang Jianqiu, Guo Fei. Recognizing Named Entity from Free-text Customer Reviews——A Maximum Entropy Model-based Approach. New Technology of Library and Information Service, 2011, 27(5): 77-82.