Identifying Named Entities of Chinese Electronic Medical Records Based on RoBERTa-wwm Dynamic Fusion Model
Zhang Yunqiu, Wang Yang, Li Bocheng
School of Public Health, Jilin University, Changchun 130021, China
Abstract [Objective] This paper proposes an entity recognition model based on RoBERTa-wwm dynamic fusion to improve named entity recognition in Chinese electronic medical records. [Methods] First, we fused the semantic representations generated by each Transformer layer of the pre-trained language model RoBERTa-wwm. Then, we fed the fused representation into a bi-directional long short-term memory network (BiLSTM) and a conditional random field (CRF) module to recognize entities in the electronic medical records. [Results] We evaluated the model on the dataset of the 2017 National Knowledge Graph and Semantic Computing Conference (CCKS 2017) and on self-annotated electronic medical records. The F1 values reached 94.08% and 90.08%, respectively, which were 0.23% and 0.39% higher than those of the RoBERTa-wwm-BiLSTM-CRF model. [Limitations] The RoBERTa-wwm used in this paper was pre-trained on a non-medical corpus. [Conclusions] The proposed method could improve the results of entity recognition tasks.
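The layer-fusion step in [Methods] can be illustrated with a minimal sketch. The abstract does not give the fusion formula, so this example assumes one common choice: a softmax-weighted sum over the outputs of all Transformer layers, with one learnable score per layer. The function names `softmax` and `fuse_layers`, the toy vectors, and the fixed scores are hypothetical and are not taken from the paper.

```python
import math


def softmax(scores):
    """Turn raw per-layer scores into weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_layers(layer_outputs, layer_scores):
    """Weighted sum of per-layer token representations.

    layer_outputs: list of L layers; each layer is a list of T token
                   vectors; each vector is a list of D floats.
    layer_scores:  list of L raw scores (assumed learnable in a real
                   model; fixed here for illustration).
    Returns a list of T fused token vectors of dimension D.
    """
    weights = softmax(layer_scores)
    num_tokens = len(layer_outputs[0])
    dim = len(layer_outputs[0][0])
    fused = [[0.0] * dim for _ in range(num_tokens)]
    for w, layer in zip(weights, layer_outputs):
        for t in range(num_tokens):
            for d in range(dim):
                fused[t][d] += w * layer[t][d]
    return fused


# Toy input: 2 layers, 2 tokens, 2 dimensions (made-up numbers).
layers = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[3.0, 2.0], [2.0, 3.0]],
]
# Equal scores give equal weights, so the result is the layer average.
fused = fuse_layers(layers, [0.0, 0.0])
print(fused)
```

In a full model under this assumption, the fused token vectors would then be passed to the BiLSTM and CRF modules, and the layer scores would be trained jointly with the rest of the network.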
Received: 31 August 2021
Published: 07 January 2022
Fund: Humanities and Social Science Foundation of Ministry of Education (18YJA870017); Jilin Province Social Science Foundation (2019B59); Graduate Innovation Foundation of Jilin University (101832020CX279)
Corresponding Author: Zhang Yunqiu, ORCID: 0000-0002-9790-9581, E-mail: yunqiu@jlu.edu.cn