Classification Model for Medical Entity Relations with Convolutional Neural Network

Fan Shaoping1, Zhao Yuxuan2, An Xinying1, Wu Qingqiang3
1 Institute of Medical Information / Medical Library, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing 100020, China
2 School of Finance, Central University of Finance and Economics, Beijing 102206, China
3 School of Informatics, Xiamen University, Xiamen 361005, China

Abstract [Objective] This paper proposes a new entity relation classification model based on a Convolutional Neural Network (CNN) with multi-feature embedding, aiming to improve classification results and simplify feature calculation. [Methods] Building on existing feature-embedding algorithms, our CNN model integrates word position and lexical features, and we describe how these features are represented. The features require no complex computation and improved the model's performance. [Results] We evaluated the proposed model on three biomedical corpora: AIMed, GENIA, and ChemProt. The F1 scores were 0.7342, 0.9764, and 0.8900, respectively. The model achieved the best results on the GENIA and ChemProt datasets. [Limitations] Our model did not incorporate prior domain knowledge from the biomedical field. [Conclusions] The proposed model can effectively classify entity relations, which also supports research on relation extraction and knowledge base construction in the biomedical field.

Received: 07 January 2021
Published: 29 June 2021
Fund: National Natural Science Foundation of China (71704188); National Key Research and Development Program of China (2016YFC0901902-2)
Corresponding Author: Wu Qingqiang, E-mail: wuqq@xmu.edu.cn