[Objective] This paper converts medical imaging diagnosis reports into structured data so that information can be extracted effectively from these free-text reports. [Methods] First, we analyzed the text characteristics of medical imaging diagnosis reports and proposed a structuring method based on entity recognition and rule extraction. Then, we annotated 800 reports to construct datasets for model evaluation. [Results] The proposed method achieved a precision of 0.87 across all entities in the medical imaging diagnosis reports, which was 4.03% higher than that of BERT-BiLSTM-CRF; its recall was 2.81% higher than that of BERT-BiLSTM-CRF. Compared with the dependency-analysis method, the proposed method improved the recognition precision of medical exam items and exam results by 5.62% and 2.31%, respectively. [Limitations] We evaluated the proposed method only on diagnostic PET-CT imaging reports from one hospital. [Conclusions] This study converts the free text of medical imaging diagnosis reports into structured data. This not only optimizes the classification, storage, and retrieval of medical reports but also supports future research on medical imaging.
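The precision and recall figures above are entity-level metrics. The abstract does not state how matches were counted, so the following is only a minimal sketch that assumes exact matching on both span and label, a common convention for named entity recognition. The labels `ITEM` and `RESULT`, the offsets, and the function name `precision_recall` are hypothetical and chosen for illustration; they are not taken from the paper.

```python
from typing import Set, Tuple

# An entity is (start_offset, end_offset, label).
# The label names used below are hypothetical, not the paper's tag set.
Entity = Tuple[int, int, str]


def precision_recall(gold: Set[Entity], predicted: Set[Entity]) -> Tuple[float, float]:
    """Entity-level precision and recall under exact span-and-label matching (assumed)."""
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Toy annotations for a single report (illustrative values only).
gold = {(0, 2, "ITEM"), (3, 7, "RESULT"), (10, 12, "ITEM"), (13, 18, "RESULT")}
predicted = {(0, 2, "ITEM"), (3, 7, "RESULT"), (10, 12, "RESULT")}

p, r = precision_recall(gold, predicted)
print(f"precision={p:.3f} recall={r:.3f}")
```

In this toy case two of the three predicted entities match the gold annotations exactly, and two of the four gold entities are recovered. The third prediction has the right span but the wrong label, so it counts as an error under exact matching.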
Sheng Yu, Hu Huirong, Wang Congcong, Yang Shengyi. Analyzing Structures of Medical Imaging Diagnosis Reports[J]. Data Analysis and Knowledge Discovery, 2022, 6(10): 46-56.