%A Yang Lin, Huang Xiaoshuo, Wang Jiayang, Ding Lingling, Li Zixiao, Li Jiao
%T Identifying Subtypes of Clinical Trial Diseases with BERT-TextCNN
%0 Journal Article
%D 2022
%J Data Analysis and Knowledge Discovery
%R 10.11925/infotech.2096-3467.2021.0712
%P 69-81
%V 6
%N 4
%U {https://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/abstract/article_5330.shtml}
%8 2022-04-25
%X

[Objective] This study develops a BERT-TextCNN-based method for identifying disease subtypes, which could facilitate cohort selection for clinical trials. [Methods] We recast disease subtype identification as a single-label classification task and solved it with BERT-TextCNN. We then evaluated the model on stroke clinical trial data from ClinicalTrials.gov. [Results] The BERT-TextCNN model based on the LP method achieved the best weighted macro-average F1 score, 0.9053, and identified the stroke subtypes of clinical trial participants. [Limitations] The model still needs to be evaluated on other diseases and data sets. [Conclusions] The proposed method could be an effective approach to identifying complex disease subtypes.
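
The abstract does not spell out how the task becomes single-label or how the F1 value is weighted. The following is a minimal sketch, assuming "LP" refers to the label powerset transformation (each distinct combination of subtype labels becomes one class) and that the reported score is a support-weighted average of per-class F1. The subtype names, the helper functions `label_powerset` and `weighted_f1`, and the toy data are hypothetical illustrations, not the authors' code or data, and the BERT-TextCNN classifier itself is not shown.

```python
from collections import Counter


def label_powerset(label_sets):
    """Map each distinct set of labels to a single class id (assumed meaning of LP)."""
    class_ids = {}
    y = []
    for labels in label_sets:
        key = frozenset(labels)
        if key not in class_ids:
            class_ids[key] = len(class_ids)  # next unused id, in order of first appearance
        y.append(class_ids[key])
    return y, class_ids


def weighted_f1(y_true, y_pred):
    """Average per-class F1, weighting each class by its share of the true labels."""
    support = Counter(y_true)
    total = len(y_true)
    score = 0.0
    for cls, n in support.items():
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != cls and p == cls)
        fn = n - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        score += (n / total) * f1
    return score


# Hypothetical subtype annotations for four trials (not taken from the paper).
trials = [
    {"ischemic"},
    {"hemorrhagic"},
    {"ischemic", "hemorrhagic"},
    {"ischemic"},
]
y, class_ids = label_powerset(trials)

# A made-up prediction that misses the combined class.
y_pred = [0, 1, 0, 0]
print(y, len(class_ids), weighted_f1(y, y_pred))
```

Under this reading, a trial that enrolls participants with more than one subtype gets its own class, so any single-label classifier can be trained on `y` directly; the cost is that rare label combinations end up with very few training examples.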