New Technology of Library and Information Service  2014, Vol. 30 Issue (1): 24-27    DOI: 10.11925/infotech.1003-3513.2014.01.04
DIGITAL LIBRARY
Research on Plant Growth and Development Stage Named Entity Recognition for Text Mining
Wang Run, He Lin, Wang Dongbo, Huang Shuiqing, Fan Yuanbiao
College of Information Science and Technology, Nanjing Agricultural University, Nanjing 210095, China
Abstract  [Objective] This paper studies how to identify and extract plant growth and development stage entities (PDSE) from text. [Context] A PDSE is essentially a kind of named entity. Named entity recognition has become one of the most valuable basic technologies in natural language processing and is widely used in many natural language processing systems. [Methods] The paper adopts a hybrid strategy that combines conditional random fields (CRF) with rules: it proposes and implements CRF templates, feature functions, and extraction rules tailored to the characteristics of plant growth and development stage entities, and it tests extraction performance on articles from the PubMed database. [Results] The experiment shows that the proposed hybrid strategy achieves high accuracy and recall. [Conclusions] This research provides a useful reference for biological text extraction.
Key words: Plant growth and development stage; Named entity recognition; CRF; Feature selection
Received: 14 February 2014      Published: 14 February 2014
CLC Number: TP391
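
The hybrid strategy named in the abstract (CRF features plus extraction rules) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the trigger-word lists, the `PDSE` label name, the window size, and the merge policy do not come from the paper, and the statistical model is replaced by a placeholder list of labels rather than a trained CRF.

```python
# Hypothetical trigger words; the paper's actual lexicon and rules are not
# given in the abstract.
STAGE_HEADS = {"stage", "stages", "phase", "period"}
STAGE_MODIFIERS = {"seedling", "flowering", "germination", "vegetative"}


def token_features(tokens, i):
    """Build features for token i using a +/-1 word window, the kind of
    context a CRF feature template usually describes."""
    word = tokens[i]
    feats = {
        "w0": word.lower(),
        "suffix3": word[-3:].lower(),
        "is_head": word.lower() in STAGE_HEADS,
        "is_modifier": word.lower() in STAGE_MODIFIERS,
        "has_digit": any(ch.isdigit() for ch in word),
    }
    feats["w-1"] = tokens[i - 1].lower() if i > 0 else "<BOS>"
    feats["w+1"] = tokens[i + 1].lower() if i < len(tokens) - 1 else "<EOS>"
    return feats


def rule_labels(tokens):
    """Assumed rule: a known modifier directly followed by a head word
    forms one entity, tagged in BIO notation."""
    labels = ["O"] * len(tokens)
    for i in range(len(tokens) - 1):
        if (tokens[i].lower() in STAGE_MODIFIERS
                and tokens[i + 1].lower() in STAGE_HEADS):
            labels[i] = "B-PDSE"
            labels[i + 1] = "I-PDSE"
    return labels


def merge_labels(model_labels, rules):
    """Keep the model's label; use the rule label only where the model says O."""
    return [r if m == "O" else m for m, r in zip(model_labels, rules)]


tokens = "Gene expression peaks during the flowering stage".split()
model_out = ["O"] * len(tokens)  # placeholder for the output of a trained CRF
print(merge_labels(model_out, rule_labels(tokens)))
```

In a real pipeline the feature dictionaries would be passed to a CRF toolkit for training and decoding, and the rules would act as a correction layer over its output. The merge policy shown (rules fill in only where the model predicts `O`) is one possible choice among several.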

Cite this article:

Wang Run, He Lin, Wang Dongbo, Huang Shuiqing, Fan Yuanbiao. Research on Plant Growth and Development Stage Named Entity Recognition for Text Mining. New Technology of Library and Information Service, 2014, 30(1): 24-27.

