Data Analysis and Knowledge Discovery  2020, Vol. 4 Issue (11): 63-73    DOI: 10.11925/infotech.2096-3467.2020.0469
Predicting Hospital Readmissions with Deep Learning: Case Study of Heart Diseases
Da Jingwei(1), Yan Jiaqi(1), Deng Sanhong(1,2), Wang Zhongmin(3)
(1) School of Information Management, Nanjing University, Nanjing 210023, China
(2) Jiangsu Key Laboratory of Data Engineering and Knowledge Service, Nanjing 210023, China
(3) Jiangsu Province Hospital (The First Affiliated Hospital of Nanjing Medical University), Nanjing 210029, China
Abstract  

[Objective] This paper uses deep learning to predict patient readmissions from electronic medical records, with the aim of improving hospital management. [Methods] We propose a model based on a character-level convolutional neural network to process the unstructured text, then combine it with structured data (demographic, clinical, and administrative records) to predict hospital readmission. [Results] The deep learning model combining structured and unstructured data yielded better prediction results, with an F1-score of 0.735. Compared with the models using only structured or only unstructured data, the F1-score increased by 12.9% and 2.1%, respectively. [Limitations] The medical records used in the experiments were collected from a single hospital, which may affect the prediction results. [Conclusions] The proposed model provides a reference for researchers working on hospital readmission prediction and for hospital administrators.

Key words: Hospital Readmission; Deep Learning; Heart Disease; Predictive Analysis
Received: 27 May 2020      Published: 02 September 2020
ZTFLH:  TP391  
Corresponding Authors: Yan Jiaqi     E-mail: jiaqiyan@nju.edu.cn

Cite this article:

Da Jingwei, Yan Jiaqi, Deng Sanhong, Wang Zhongmin. Predicting Hospital Readmissions with Deep Learning: Case Study of Heart Diseases. Data Analysis and Knowledge Discovery, 2020, 4(11): 63-73.

URL:

https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.2096-3467.2020.0469     OR     https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2020/V4/I11/63

Research Framework
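Feature subset       Feature                   Data type    Description
Demographic data     Gender                    Categorical  Male (2,394, 65.4%); Female (1,266, 34.6%)
                     Marital status            Categorical  Married (3,510, 95.9%); Unmarried (150, 4.1%)
Clinical data        Systolic blood pressure   Numerical    Mean = 129.582; variance = 18.512
                     Diastolic blood pressure  Numerical    Mean = 72.314; variance = 12.130
Administrative data  Length of stay (days)     Numerical    Mean = 14.879; variance = 10.311
                     ICD-10 code               Categorical  I25.101 (57.6%); I25.105 (29.3%); other (13.1%)
                     Metoprolol use            Categorical  Yes (1,197, 32.7%); No (2,463, 67.3%)
                     Irbesartan use            Categorical  Yes (215, 5.9%); No (3,445, 94.1%)
                     Surgery                   Categorical  Yes (201, 5.5%); No (3,459, 94.5%)
Descriptive Statistics of Structured Data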
Examples of Unstructured Data
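Variable                  χ2      Spearman coefficient  p
Gender                    1.493   -                     0.224
Surgery                   34.905  -                     0.000
ICD-10 code               81.145  -                     0.000
Marital status            41.540  -                     0.000
Metoprolol use            88.998  -                     0.000
Irbesartan use            5.284   -                     0.024
Length of stay (days)     -       0.391                 0.000
Systolic blood pressure   -       -0.095                0.000
Diastolic blood pressure  -       -0.163                0.000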
Correlation Analysis of Independent Variable and Dependent Variable
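The two statistics reported in the correlation table can be illustrated with a small pure-Python sketch: a Pearson chi-square statistic for a categorical feature against the readmission label, and a Spearman coefficient for a numerical feature. The toy data and the function names below are hypothetical and are not taken from the study; the page does not say which software the authors used, and this sketch omits p-values.

```python
from collections import Counter


def chi_square(xs, ys):
    """Pearson chi-square statistic for two categorical sequences."""
    n = len(xs)
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    observed = Counter(zip(xs, ys))  # missing cells count as 0
    stat = 0.0
    for r, r_count in row_totals.items():
        for c, c_count in col_totals.items():
            expected = r_count * c_count / n
            stat += (observed[(r, c)] - expected) ** 2 / expected
    return stat


def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman coefficient: Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical toy data (8 patients), for illustration only.
surgery = ["yes", "yes", "no", "no", "no", "no", "yes", "no"]
readmitted = [1, 1, 0, 0, 0, 1, 1, 0]
length_of_stay = [20, 18, 7, 9, 5, 15, 25, 6]

chi2_stat = chi_square(surgery, readmitted)
rho = spearman(length_of_stay, readmitted)
print(f"chi2 = {chi2_stat:.3f}, Spearman = {rho:.3f}")
```

Categorical features are paired with the chi-square test and numerical features with the Spearman coefficient, which matches the way the table leaves one of the two columns empty for each variable.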
Architecture of SUCM Model
Parameter                 Value
Word embedding dimension  64
Sentence dimension        300
Number of filters         32
Filter length             5
Learning rate             0.001
Parameter Settings of SUCM Model
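To make the parameter settings more concrete, the sketch below works out what they would imply for a single one-dimensional convolution layer. It rests on assumptions not confirmed by this page: that the "sentence dimension" of 300 is the number of characters per input text, and that the convolution uses stride 1 and no padding. The class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SUCMConfig:
    """Values copied from the parameter table; field names are hypothetical."""
    embedding_dim: int = 64      # word embedding dimension
    sentence_dim: int = 300      # assumed: characters per input text
    num_filters: int = 32        # number of convolution filters
    filter_length: int = 5       # filter (kernel) length
    learning_rate: float = 0.001


def conv1d_output_length(seq_len, kernel, stride=1, padding=0):
    """Standard output-length formula for a 1-D convolution."""
    return (seq_len + 2 * padding - kernel) // stride + 1


def conv1d_parameter_count(in_channels, out_channels, kernel):
    """Weights plus one bias per output filter."""
    return in_channels * out_channels * kernel + out_channels


cfg = SUCMConfig()
out_len = conv1d_output_length(cfg.sentence_dim, cfg.filter_length)
n_params = conv1d_parameter_count(
    cfg.embedding_dim, cfg.num_filters, cfg.filter_length
)
print(out_len, n_params)
```

Under these assumptions each text becomes a 300 x 64 matrix, the convolution yields 296 positions for each of the 32 filters, and the layer holds 10,272 trainable parameters.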
Data                            Model     ACC    F1     P      R      AUC
Structured data                 NB        0.704  0.649  0.705  0.639  0.701
                                SVM       0.717  0.647  0.746  0.619  0.713
                                LR        0.711  0.650  0.730  0.633  0.708
                                MS(DL)    0.728  0.651  0.771  0.604  0.723
Unstructured data               NB        0.718  0.624  0.802  0.578  0.712
                                SVM       0.743  0.704  0.771  0.701  0.742
                                LR        0.731  0.687  0.762  0.685  0.729
                                MU(DL)    0.743  0.720  0.751  0.728  0.743
Structured + unstructured data  NB        0.719  0.654  0.736  0.636  0.715
                                SVM       0.745  0.710  0.771  0.713  0.744
                                LR        0.734  0.687  0.760  0.689  0.732
                                SUCM(DL)  0.754  0.735  0.752  0.749  0.754
Experimental Results
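The F1 improvements quoted in the abstract (12.9% and 2.1%) appear to be relative gains of SUCM(DL) over MS(DL) and MU(DL). The short check below reproduces them from the F1 column of the results table; the function name is hypothetical.

```python
def relative_gain(new, baseline):
    """Relative improvement of `new` over `baseline`, in percent."""
    return (new - baseline) / baseline * 100


# F1 scores of the three deep learning models, from the results table.
f1 = {"MS(DL)": 0.651, "MU(DL)": 0.720, "SUCM(DL)": 0.735}

gain_vs_structured = relative_gain(f1["SUCM(DL)"], f1["MS(DL)"])
gain_vs_unstructured = relative_gain(f1["SUCM(DL)"], f1["MU(DL)"])
print(f"{gain_vs_structured:.1f}% {gain_vs_unstructured:.1f}%")
```

In absolute terms the gains are 0.084 and 0.015 F1 points, so the percentages should be read as relative rather than percentage-point changes.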
Model Performance Improvement after Structured Data Combined with Unstructured Data
Model Performance Improvement after Unstructured Data Combined with Structured Data
Activation function  ACC    F1     P      R      AUC
ReLU                 0.754  0.735  0.752  0.749  0.754
ELU                  0.750  0.728  0.767  0.724  0.749
Swish                0.748  0.727  0.752  0.736  0.747
Experimental Results with Different Activation Functions
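For reference, the three activation functions compared here can be written as scalar functions as follows. This is a minimal sketch using their standard definitions; the defaults `alpha=1.0` and `beta=1.0` are assumptions, since this page does not state the values used in the experiments.

```python
import math


def relu(x):
    """ReLU: max(0, x)."""
    return max(0.0, x)


def elu(x, alpha=1.0):
    """ELU: x for x > 0, otherwise alpha * (exp(x) - 1)."""
    return x if x > 0 else alpha * (math.exp(x) - 1.0)


def swish(x, beta=1.0):
    """Swish: x * sigmoid(beta * x)."""
    return x / (1.0 + math.exp(-beta * x))


for x in (-2.0, 0.0, 2.0):
    print(x, relu(x), round(elu(x), 4), round(swish(x), 4))
```

ReLU zeroes out negative inputs, whereas ELU and Swish let a small negative signal through.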
Fully connected layers  ACC    F1     P      R      AUC    Running time (min)
(3, 1)                  0.750  0.732  0.760  0.738  0.750  1.3
(3, 2)                  0.750  0.734  0.744  0.748  0.750  1.8
(4, 1)                  0.754  0.735  0.752  0.749  0.754  1.2
(4, 2)                  0.738  0.725  0.732  0.754  0.739  1.9
(5, 1)                  0.741  0.725  0.739  0.740  0.741  1.4
(5, 2)                  0.749  0.730  0.750  0.743  0.749  1.7
Experimental Results with Different Number of Fully Connected Layers
Model           ACC    F1     P      R      AUC    Running time (min)
MU(Char-CNN)    0.743  0.720  0.751  0.728  0.743  1.0
MU(Word2Vec)    0.728  0.715  0.720  0.732  0.728  2.3
SUCM(Char-CNN)  0.754  0.735  0.752  0.749  0.754  1.2
SUCM(Word2Vec)  0.729  0.713  0.733  0.718  0.729  4.3
Experimental Results with Different Text Processing Methods
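A character-level model takes each text as a sequence of character indices, so it needs no word segmentation or pretrained word vectors. The sketch below shows one common way to build such an input. The example notes, the reserved indices, and the fixed length of 300 (borrowed from the "sentence dimension" setting) are assumptions for illustration, not the authors' preprocessing pipeline.

```python
PAD, UNK = 0, 1  # reserved indices: padding and unknown character


def build_char_vocab(texts):
    """Assign an integer index to every distinct character, starting at 2."""
    vocab = {}
    for text in texts:
        for ch in text:
            if ch not in vocab:
                vocab[ch] = len(vocab) + 2
    return vocab


def encode(text, vocab, max_len=300):
    """Map characters to indices, truncating or padding to max_len."""
    ids = [vocab.get(ch, UNK) for ch in text[:max_len]]
    return ids + [PAD] * (max_len - len(ids))


# Hypothetical clinical note fragments, not taken from the study's data.
notes = ["胸闷气短三天", "胸痛一周"]
vocab = build_char_vocab(notes)
encoded = encode(notes[0], vocab)
print(len(vocab), encoded[:8])
```

The resulting fixed-length index vectors can then be fed to an embedding layer followed by convolution filters.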