A BiLSTM-CRF Model for Chinese Clinical Protected Health Information Recognition
Liu Jingru, Song Yang, Jia Rui, Zhang Yipeng, Luo Yong, Ma Jingdong
(School of Medical and Health Management, Tongji Medical College, Huazhong University of Science and Technology, Wuhan 430030)
(School of Public Health, Chengdu University of Traditional Chinese Medicine, Chengdu 611137)
(Sichuan Province Electronic Medical Record Engineering Technology Research Center, Chengdu 610041)
(Sichuan Jiuzhen Technology Co., Ltd., Chengdu 610041)
Abstract
[Objective] To protect private information in clinical texts and to identify protected health information (PHI) in unstructured text effectively, this study proposes an automated scheme that uses a BiLSTM-CRF model to remove private information from clinical records. [Method] Discharge summaries from the electronic health records of a health information platform were selected as experimental data. Based on the 18 PHI identifier types specified by HIPAA and the characteristics of the experimental data, 7 PHI categories and 15 PHI types were defined. A BiLSTM-CRF model was then used to identify PHI in the unstructured clinical records. [Result] Across all entity categories, precision, recall, and F value were 98.66%, 99.36%, and 99.01%, respectively; the labeling errors were summarized and analyzed. [Limitations] Optimization of the model for the characteristics of the corpus needs improvement, and the quality of the clinical text after automatic PHI recognition was not evaluated. [Conclusion] The BiLSTM-CRF model recognizes named entities automatically without feature engineering, which helps promote the sharing and use of clinical information.
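The reported precision, recall, and F value are standard measures for named entity recognition. As a rough illustration only, the sketch below computes them at the entity level from tag sequences. It assumes BIO-style tags and exact-match scoring, neither of which the abstract specifies, and the labels `NAME` and `DATE` are hypothetical placeholders rather than the paper's PHI types.

```python
from typing import List, Set, Tuple

Span = Tuple[int, int, str]  # (start, end_exclusive, entity_type)


def extract_spans(tags: List[str]) -> Set[Span]:
    """Collect entity spans from a BIO tag sequence (assumed tagging scheme).

    An "I-" tag that does not continue a span of the same type is ignored.
    """
    spans: Set[Span] = set()
    start, etype = None, None
    for i, tag in enumerate(tags + ["O"]):  # trailing "O" flushes the last span
        continues = tag.startswith("I-") and etype == tag[2:]
        if not continues:
            if etype is not None:
                spans.add((start, i, etype))
                start, etype = None, None
            if tag.startswith("B-"):
                start, etype = i, tag[2:]
    return spans


def precision_recall_f1(gold: Set[Span], pred: Set[Span]) -> Tuple[float, float, float]:
    """Exact-match scoring: a predicted span counts only if boundaries and type agree."""
    tp = len(gold & pred)
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Toy example with hypothetical labels: the DATE span is predicted too short.
gold_tags = ["B-NAME", "I-NAME", "O", "B-DATE", "I-DATE", "O"]
pred_tags = ["B-NAME", "I-NAME", "O", "B-DATE", "O", "O"]
scores = precision_recall_f1(extract_spans(gold_tags), extract_spans(pred_tags))

# The F value is the harmonic mean of precision and recall; applied to the
# reported 98.66% and 99.36%, it reproduces the reported 99.01%.
reported_f = round(2 * 98.66 * 99.36 / (98.66 + 99.36), 2)
```

In the toy example, only one of the two predicted spans matches a gold span exactly, so precision, recall, and F1 all come out at 0.5. Exact-match scoring is a strict choice; a de-identification evaluation might instead credit partial overlaps.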
Published: 10 July 2020