[Objective] To meet the needs of real-world data-driven clinical research, this study aimed to develop a semantic alignment-based method for extracting clinical scale information that facilitates cohort identification. [Methods] We took the NIHSS (National Institutes of Health Stroke Scale) as the target scale and analyzed its features in both clinical trials and real-world electronic medical records. We then proposed a semantic alignment-based clinical scale information extraction method and validated it on clinical trial data (ClinicalTrials.gov) and the open electronic medical record dataset MIMIC-III. [Results] The F1 scores for extracting the NIHSS total score and item scores were 0.9535 and 0.9267, respectively. In addition, the method effectively identified patients who met NIHSS criteria in two test tasks. [Limitations] The feasibility of the method for other clinical scales was not validated. The method should also be applied in real-world trial recruitment scenarios for further improvement. [Conclusions] The proposed method proved to be an effective approach to the problem of semantic consistency of clinical scale information between clinical research and electronic medical record data.
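As an illustration only, the cohort identification idea in the abstract (extract an NIHSS total score from free text, then compare it with a trial's eligibility range) might be sketched as follows. This is not the authors' semantic alignment method; it is a simplified regular-expression stand-in, and the pattern, the function names, and the example range of 4 to 25 are all assumptions made for this sketch. The only external fact used is that the NIHSS total score ranges from 0 to 42.

```python
import re
from typing import Optional

# Hypothetical pattern (an assumption, not taken from the paper):
# "NIHSS", optionally followed by "score" / "total score" and a linking
# token such as "of", "was", "is", "=" or ":", then a 1-2 digit integer.
NIHSS_TOTAL = re.compile(
    r"\bNIH\s?SS\b(?:\s+(?:total\s+)?score)?\s*(?:of|was|is|=|:)?\s*(\d{1,2})\b",
    re.IGNORECASE,
)


def extract_nihss_total(text: str) -> Optional[int]:
    """Return the first NIHSS total score mentioned in the text, or None."""
    match = NIHSS_TOTAL.search(text)
    if match is None:
        return None
    score = int(match.group(1))
    # The NIHSS total score lies between 0 and 42; discard anything else.
    return score if 0 <= score <= 42 else None


def meets_criterion(score: Optional[int], low: int, high: int) -> bool:
    """Check a score against an inclusive eligibility range from a trial."""
    return score is not None and low <= score <= high


if __name__ == "__main__":
    note = "Patient admitted with left-sided weakness. NIHSS score of 12 on admission."
    score = extract_nihss_total(note)
    # Hypothetical eligibility criterion: NIHSS total score between 4 and 25.
    print(score, meets_criterion(score, low=4, high=25))
```

A pattern-based extractor of this kind only covers the simplest surface forms; handling the varied item-level expressions found in real records is the problem the semantic alignment approach is meant to address.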
杨林, 黄晓硕, 王嘉阳, 李姣. 基于语义对齐的临床量表信息提取方法及其临床试验队列识别的应用研究[J]. 数据分析与知识发现, 0, (): 1-.
Yang Lin, Huang Xiaoshuo, Wang Jiayang, Li Jiao. Semantic Alignment-based Clinical Scale Information Extraction and its Application in Cohort Identification. Data Analysis and Knowledge Discovery, 0, (): 1-.