New Technology of Library and Information Service (现代图书情报技术)  2011, Vol. 27 Issue (5): 77-82     https://doi.org/10.11925/infotech.1003-3513.2011.05.12
Application Practice
Recognizing Named Entity from Free-text Customer Reviews: A Maximum Entropy Model-based Approach
Yu Chuanming (1,2), Huang Jianqiu (2), Guo Fei (2)
1. School of Information Safety and Engineering, Zhongnan University of Economics and Law, Wuhan 430073, China;
2. Business School, University of Shanghai for Science and Technology, Shanghai 200093, China
Abstract: This paper introduces the basic concepts of named entity recognition (NER) and analyzes its two basic approaches, rule-based and statistical. Taking the maximum entropy model (MEM) as its theoretical basis, it then conducts an empirical study on recognizing Chinese dish names. Six feature templates are designed according to the characteristics of Chinese named entities. Experimental results show that adding tagging features to the simple feature template effectively improves named entity recognition performance. The features useful for improving recognition are, in order: tagging features, POS combination features, backward POS dependency features, and word form features.
Keywords: Named entity recognition; Maximum entropy model; Customer reviews; Text mining
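To make the approach in the abstract concrete, the following is a minimal sketch of a maximum entropy tagger, in which the probability of a label y for a token context x is p(y | x) = exp(sum_i w_i f_i(x, y)) / Z(x). It is an illustration under stated assumptions, not the authors' implementation: the toy English corpus, the B/I/O label set, the five feature templates, the greedy decoder, and all function names are hypothetical, and the templates only loosely mirror the feature types named in the abstract.

```python
import math
from collections import defaultdict

LABELS = ["B", "I", "O"]  # begin / inside / outside a dish name (assumed scheme)

# Hypothetical toy corpus of (word, POS) tokens with gold labels; not from the paper.
TRAIN = [
    ([("the", "DT"), ("kung_pao", "NN"), ("chicken", "NN"), ("was", "VB"), ("great", "JJ")],
     ["O", "B", "I", "O", "O"]),
    ([("we", "PRP"), ("ordered", "VB"), ("mapo", "NN"), ("tofu", "NN")],
     ["O", "O", "B", "I"]),
    ([("the", "DT"), ("service", "NN"), ("was", "VB"), ("slow", "JJ")],
     ["O", "O", "O", "O"]),
    ([("mapo", "NN"), ("tofu", "NN"), ("is", "VB"), ("spicy", "JJ")],
     ["B", "I", "O", "O"]),
]

def extract_features(tokens, i, prev_label):
    """Illustrative feature templates (assumed; not the paper's six templates)."""
    word, pos = tokens[i]
    prev_pos = tokens[i - 1][1] if i > 0 else "<S>"
    next_pos = tokens[i + 1][1] if i + 1 < len(tokens) else "</S>"
    return [
        "word=" + word,                      # word form
        "pos=" + pos,                        # part of speech
        "pos_pair=" + prev_pos + "|" + pos,  # POS combination
        "next_pos=" + next_pos,              # dependency on a neighbouring POS
        "prev_label=" + prev_label,          # tagging feature: previous label
    ]

def predict_proba(weights, feats, labels):
    """Softmax over summed feature weights: exp(sum_i w[f_i, y]) / Z(x)."""
    scores = {y: sum(weights.get((f, y), 0.0) for f in feats) for y in labels}
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {y: math.exp(s - top) for y, s in scores.items()}
    z = sum(exps.values())
    return {y: e / z for y, e in exps.items()}

def to_examples(data):
    """Flatten sentences into (features, gold label) pairs using gold previous labels."""
    examples = []
    for tokens, tags in data:
        prev = "<START>"
        for i, gold in enumerate(tags):
            examples.append((extract_features(tokens, i, prev), gold))
            prev = gold
    return examples

def train(examples, labels, epochs=500, lr=0.3, l2=0.001):
    """Batch gradient ascent on the L2-regularised conditional log-likelihood."""
    weights = defaultdict(float)
    for _ in range(epochs):
        grad = defaultdict(float)
        for feats, gold in examples:
            probs = predict_proba(weights, feats, labels)
            for y in labels:
                delta = (1.0 if y == gold else 0.0) - probs[y]  # observed - expected
                for f in feats:
                    grad[(f, y)] += delta
        for key, g in grad.items():
            weights[key] += lr * (g / len(examples) - l2 * weights[key])
    return weights

def avg_log_likelihood(weights, examples, labels):
    """Mean log p(gold | features) over the examples."""
    total = sum(math.log(predict_proba(weights, f, labels)[y]) for f, y in examples)
    return total / len(examples)

def tag(weights, tokens, labels):
    """Greedy left-to-right decoding; each prediction feeds the next token's features."""
    prev, out = "<START>", []
    for i in range(len(tokens)):
        probs = predict_proba(weights, extract_features(tokens, i, prev), labels)
        prev = max(labels, key=lambda y: probs[y])
        out.append(prev)
    return out

EXAMPLES = to_examples(TRAIN)
weights = train(EXAMPLES, LABELS)
print(tag(weights, TRAIN[0][0], LABELS))
print(round(avg_log_likelihood(weights, EXAMPLES, LABELS), 3))
```

Dropping one line from `extract_features` and retraining gives a rough way to see how much each template contributes, which is the kind of comparison the abstract reports. A real system would run on segmented, POS-tagged Chinese text and would typically use a sequence-level decoder rather than this greedy one.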
Received: 2011-04-28      Published: 2011-07-11
Classification (CLC): TP391
Cite this article:
Yu Chuanming, Huang Jianqiu, Guo Fei. Recognizing Named Entity from Free-text Customer Reviews: A Maximum Entropy Model-based Approach[J]. New Technology of Library and Information Service, 2011, 27(5): 77-82.
Link to this article:
https://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/10.11925/infotech.1003-3513.2011.05.12      or      https://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/Y2011/V27/I5/77