Extracting Medical Entity Relationships with Domain-Specific Knowledge and Distant Supervision
Jing Shenqi1,2,3, Zhao Youlin1
1 School of Information Management, Nanjing University, Nanjing 210023, China
2 School of Biomedical Engineering and Informatics, Nanjing Medical University, Nanjing 211166, China
3 Center for Data Management, The First Affiliated Hospital of Nanjing Medical University (Jiangsu Province Hospital), Nanjing 210096, China
Abstract [Objective] This paper proposes a distantly supervised model that extracts medical entity relationships using medical domain-specific knowledge, aiming to reduce the cost of data labeling and the potential errors of existing models. [Methods] First, we used a multi-instance strategy to reduce noise in the distantly supervised labeled data. Second, we used a pre-trained language model (MedicalBERT) to encode the labeled texts. Third, we used the entity descriptions in the medical knowledge base as supervision signals for medical relationship extraction, which improved the accuracy of the semantic encoding. [Results] Compared with existing models, the Precision, Recall, and F1 of the proposed model were up to 5.4%, 2.5%, and 4.1% higher, respectively. In addition, the F1-score on the complicated extraction tasks reached 93.8%. [Limitations] The proposed method still needs to be evaluated on more sentences. [Conclusions] The proposed model can effectively extract medical entity relationships and benefit related research.
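The first step of the Methods can be illustrated with a minimal sketch. It is not the authors' implementation: the knowledge-base triples, the substring matching, the function names (`build_bags`, `select_instance`), and the "keep the best-scoring sentence per bag" rule are all assumptions chosen to show the general idea of distant supervision with multi-instance bags. In practice, the scoring function would come from a trained encoder rather than a keyword check.

```python
from collections import defaultdict

# Hypothetical knowledge-base triples: (head entity, tail entity) -> relation.
KB = {
    ("aspirin", "headache"): "treats",
    ("smoking", "lung cancer"): "causes",
}

def build_bags(sentences, kb):
    """Distant supervision: a sentence that mentions both entities of a
    known pair is labeled with that pair's relation. Sentences sharing
    the same pair are grouped into one bag (multi-instance setting)."""
    bags = defaultdict(list)
    for sentence in sentences:
        text = sentence.lower()
        for (head, tail), relation in kb.items():
            if head in text and tail in text:
                bags[(head, tail, relation)].append(sentence)
    return dict(bags)

def select_instance(bag, score):
    """Assumed noise-reduction rule: keep only the sentence in the bag
    that the scoring function rates highest."""
    return max(bag, key=score)

sentences = [
    "Aspirin is commonly used to relieve headache.",
    "The patient took aspirin but the headache persisted.",
    "Smoking is a major cause of lung cancer.",
]
bags = build_bags(sentences, KB)

# Toy scoring function (an assumption): prefer sentences with a cue word.
cue_score = lambda s: 1.0 if "relieve" in s else 0.0
best = select_instance(bags[("aspirin", "headache", "treats")], cue_score)
```

Here the second aspirin sentence is a noisy label: it mentions both entities without expressing a "treats" relation. Grouping by entity pair lets the model down-weight or discard such sentences instead of treating every match as a correct training example.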
Received: 28 October 2021
Published: 28 July 2022
Fund: National Key R&D Program of China (2018YFC1314900); Key R&D Program of Jiangsu (BE2020721)
Corresponding Author: Zhao Youlin, E-mail: sobzyl@hhu.edu.cn