Data Analysis and Knowledge Discovery  2017, Vol. 1 Issue (1): 37-46    DOI: 10.11925/infotech.2096-3467.2017.01.05
Original Article
Extracting Semantic Knowledge from Plant Species Diversity Collections
Jianhua Liu1,2, Ying Wang1, Zhixiong Zhang1, Chuanxi Li3
1National Science Library, Chinese Academy of Sciences, Beijing 100190, China
2University of Chinese Academy of Sciences, Beijing 100049, China
3China Great Wall Asset Management Co., Ltd, Beijing 100045, China
Abstract  

[Objective] This paper aims to extract semantic knowledge from biodiversity studies. [Methods] We propose a species-centered knowledge extraction framework that covers multiple entity types and the relations among them, and we examine it on several specialized databases. [Results] The species-oriented framework successfully retrieved semantic information about the target entities and the relations among them, broadening the scope of knowledge extraction practice in the biodiversity field. [Limitations] The recall and precision of the method were affected by the dictionaries and rules used. More studies are needed on semantic relations among named entities beyond co-occurrence, hierarchical, and simple syntactic relations. [Conclusions] The proposed method expands the content and methods of knowledge extraction in biodiversity research and supports semantic information retrieval and computation.
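
The abstract names dictionaries, rules, and co-occurrence relations but gives no implementation detail. As a rough illustration only, and not the authors' system, the sketch below tags entities with a toy dictionary and links entities of different types that appear in the same sentence. The dictionary entries, entity types, function names, and the `co-occurs_with` label are all hypothetical.

```python
import re
from itertools import combinations

# Hypothetical toy dictionary: surface form -> entity type (not from the paper).
DICTIONARY = {
    "Ginkgo biloba": "Species",
    "Pinus tabuliformis": "Species",
    "Zhejiang": "Location",
    "deciduous forest": "Habitat",
}


def tag_entities(sentence):
    """Return (surface form, type) pairs for dictionary terms found in the sentence."""
    found = []
    for term, etype in DICTIONARY.items():
        pattern = r"\b" + re.escape(term) + r"\b"
        if re.search(pattern, sentence, flags=re.IGNORECASE):
            found.append((term, etype))
    return found


def cooccurrence_relations(text):
    """Link entities of different types that share a sentence."""
    relations = []
    # Naive sentence split on terminal punctuation followed by whitespace.
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        entities = tag_entities(sentence)
        for (a, type_a), (b, type_b) in combinations(entities, 2):
            if type_a != type_b:
                relations.append((a, "co-occurs_with", b))
    return relations


text = ("Ginkgo biloba grows wild in Zhejiang. "
        "Pinus tabuliformis is common in northern China.")
print(cooccurrence_relations(text))
```

Because matching depends entirely on the dictionary, a species or place missing from it is silently skipped, which is one way dictionary coverage can limit recall as the Limitations statement notes.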

Key words: Plant Species Diversity; Plant Species; Knowledge Extraction; Relation Extraction
Received: 14 April 2016      Published: 22 February 2017

Cite this article:

Jianhua Liu, Ying Wang, Zhixiong Zhang, Chuanxi Li. Extracting Semantic Knowledge from Plant Species Diversity Collections. Data Analysis and Knowledge Discovery, 2017, 1(1): 37-46.

URL:

http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.2096-3467.2017.01.05     OR     http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2017/V1/I1/37
