Research on Recognition of Chinese Chemical Substance Names
Zheng Rongting, Li Nan, Ji Jiuming, Teng Qingqing
(Library of East China University of Science and Technology, Shanghai 200237, China)

Abstract: This article uses a conditional random field (CRF) model to compare the recognition performance and recognition efficiency of two labeling schemes for Chinese chemical substance names: character-based labeling and word-based labeling. The experimental results show that character-based labeling achieves better recognition performance than word-based labeling, at the cost of more time. The article also examines how the number of features affects the experimental results.
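
To make the two labeling schemes concrete, here is a minimal sketch of how one sentence could be tagged at the character level and at the word level before being passed to a CRF. The abstract does not give the tag set, the example sentence, or the segmentation, so all three are assumptions: the sketch uses plain B/I/O tags, a made-up sentence, and a hand-written word segmentation, and the helper name `bio_labels` is hypothetical.

```python
def bio_labels(units, entity_start, entity_end):
    """Tag each unit (a character or a word) with B, I, or O.

    entity_start and entity_end are character offsets (end exclusive) of a
    single chemical name in the concatenated text. Simplifying assumption:
    unit boundaries line up with the entity boundaries.
    """
    labels = []
    offset = 0  # character offset where the current unit begins
    for unit in units:
        if offset == entity_start:
            tag = "B"  # unit begins the chemical name
        elif entity_start < offset < entity_end:
            tag = "I"  # unit continues the chemical name
        else:
            tag = "O"  # unit is outside any chemical name
        labels.append((unit, tag))
        offset += len(unit)
    return labels


# Illustrative sentence (assumed, not from the article): "add sodium chloride solution".
sentence = "加入氯化钠溶液"
entity = (2, 5)  # character span of the chemical name 氯化钠

# Character-based labeling: every character is one unit.
char_labels = bio_labels(list(sentence), *entity)

# Word-based labeling: units come from a word segmenter (hand-written here).
word_labels = bio_labels(["加入", "氯化钠", "溶液"], *entity)

print(char_labels)
print(word_labels)
```

In this sketch the character-level sequence has seven units and the word-level sequence has three. Longer sequences are consistent with the higher time cost the abstract reports for character-based labeling, although the abstract itself does not state the cause.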
Received: 12 April 2010
Published: 26 July 2010
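Fund: *This work is one of the research outcomes of the Shanghai Science and Technology Commission Soft Science Research Fund project "Research on the Collaborative Mechanism of the Shanghai R&D Public Service Platform Based on Knowledge Integration" (Project No. 056921012).
*This paper was presented at the 2010 academic symposium "Application, Service and Innovation of Library Information Technology".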
Corresponding Authors:
Ji Jiuming
E-mail: jjm@mail.lib.ecust.edu.cn