New Technology of Library and Information Service  2014, Vol. 30 Issue (1): 24-27    DOI: 10.11925/infotech.1003-3513.2014.01.04
DIGITAL LIBRARY
Research on Plant Growth and Development Stage Named Entity Recognition for Text Mining
Wang Run, He Lin, Wang Dongbo, Huang Shuiqing, Fan Yuanbiao
College of Information Science and Technology, Nanjing Agricultural University, Nanjing 210095, China
Abstract  [Objective] This paper studies how to identify and extract plant growth and development stage entities (PDSE) from text. [Context] A PDSE is essentially a kind of named entity. Named entity recognition has become one of the most valuable basic technologies in natural language processing and is widely used in many natural language processing systems. [Methods] The paper adopts a hybrid strategy that combines conditional random fields (CRF) with rules. It proposes and implements a CRF template, feature functions, and extraction rules tailored to the features of plant growth and development stage entities, and tests the extraction performance on articles from the PubMed database. [Results] The experiment shows that the proposed hybrid strategy achieves high accuracy and recall. [Conclusions] This research provides a useful reference for biological text extraction.
Key words: Plant growth and development stage; Named entity recognition; CRF; Feature selection
Received: 14 February 2014      Published: 14 February 2014
CLC Number: TP391
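
The hybrid strategy summarized in the abstract (feature functions feeding a CRF, plus hand-written extraction rules) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the abstract does not give the paper's actual template, features, or rules, so the names `STAGE_RULE`, `token_features`, `rule_tags`, and `merge_tags`, the label `PDSE` in BIO form, and the merge policy are all hypothetical. No CRF is trained; `merge_tags` only shows one plausible way that rule output could back up a sequence labeller's predictions.

```python
import re

# Hypothetical head nouns that often close a growth-stage phrase.
STAGE_HEADS = ("stage", "phase", "period")

# Hypothetical rule: a word ending in -ing/-tion followed by a head noun,
# e.g. "flowering stage", "seedling stage", "germination phase".
STAGE_RULE = re.compile(
    r"\b\w+(?:ing|tion)\s+(?:stage|phase|period)s?\b", re.IGNORECASE
)


def token_features(tokens, i):
    """Build an illustrative feature dict for token i (not the paper's feature set)."""
    word = tokens[i]
    last = len(tokens) - 1
    return {
        "lower": word.lower(),
        "suffix3": word[-3:].lower(),
        "prev": tokens[i - 1].lower() if i > 0 else "<BOS>",
        "next": tokens[i + 1].lower() if i < last else "<EOS>",
        "next_is_head": i < last and tokens[i + 1].lower() in STAGE_HEADS,
    }


def rule_tags(tokens):
    """Label tokens covered by STAGE_RULE with BIO tags; all others get 'O'."""
    text = " ".join(tokens)
    # Character offset at which each token starts in the joined text.
    starts, pos = [], 0
    for tok in tokens:
        starts.append(pos)
        pos += len(tok) + 1
    tags = ["O"] * len(tokens)
    for match in STAGE_RULE.finditer(text):
        covered = [i for i, s in enumerate(starts) if match.start() <= s < match.end()]
        for k, i in enumerate(covered):
            tags[i] = "B-PDSE" if k == 0 else "I-PDSE"
    return tags


def merge_tags(crf_tags, rules):
    """Keep the labeller's tag where it found an entity; otherwise fall back to the rule tag."""
    return [r if c == "O" else c for c, r in zip(crf_tags, rules)]


tokens = "Leaves were sampled at the flowering stage .".split()
print(rule_tags(tokens))
print(token_features(tokens, 5))
```

In this sketch the rules only fill positions that the labeller left as `O`, which favours recall; a stricter policy could instead use rules to veto low-confidence predictions.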

Cite this article:

Wang Run, He Lin, Wang Dongbo, Huang Shuiqing, Fan Yuanbiao. Research on Plant Growth and Development Stage Named Entity Recognition for Text Mining. New Technology of Library and Information Service, 2014, 30(1): 24-27.

