Data Analysis and Knowledge Discovery  2020, Vol. 4 Issue (11): 63-73    DOI: 10.11925/infotech.2096-3467.2020.0469
Predicting Hospital Readmissions with Deep Learning: Case Study of Heart Diseases
Da Jingwei1, Yan Jiaqi1, Deng Sanhong1,2, Wang Zhongmin3
1School of Information Management, Nanjing University, Nanjing 210023, China
2Jiangsu Key Laboratory of Data Engineering and Knowledge Service, Nanjing 210023, China
3Jiangsu Province Hospital (The First Affiliated Hospital of Nanjing Medical University), Nanjing 210029, China
Abstract  

[Objective] This paper uses deep learning to predict patient readmissions from electronic medical records, with the aim of improving hospital management. [Methods] We propose a model that processes unstructured text with a character-level convolutional neural network and combines the result with structured data (demographics, clinical records, and administrative data) to predict hospital readmissions. [Results] The deep learning model combining structured and unstructured data yielded better prediction results, with an F1-score of 0.735. Compared with the models using only structured or only unstructured data, the F1-score increased by 12.9% and 2.1%, respectively. [Limitations] The medical records used in the experiments were collected from a single hospital, which may affect the prediction results. [Conclusions] The proposed model provides a reference for researchers working on hospital readmission prediction and for hospital administrators.

Key words: Hospital Readmission; Deep Learning; Heart Disease; Predictive Analysis
Received: 27 May 2020      Published: 02 September 2020
ZTFLH:  TP391  
Corresponding Authors: Yan Jiaqi     E-mail: jiaqiyan@nju.edu.cn

Cite this article:

Da Jingwei, Yan Jiaqi, Deng Sanhong, Wang Zhongmin. Predicting Hospital Readmissions with Deep Learning: Case Study of Heart Diseases. Data Analysis and Knowledge Discovery, 2020, 4(11): 63-73.

URL:

http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.2096-3467.2020.0469     OR     http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2020/V4/I11/63

Research Framework
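| Feature subset | Feature name | Data type | Description |
|---|---|---|---|
| Demographic data | Gender | Categorical | Male (2,394, 65.4%); Female (1,266, 34.6%) |
| | Marital status | Categorical | Married (3,510, 95.9%); Unmarried (150, 4.1%) |
| Clinical data | Systolic blood pressure | Numerical | Mean = 129.582; Variance = 18.512 |
| | Diastolic blood pressure | Numerical | Mean = 72.314; Variance = 12.130 |
| Administrative data | Length of stay (days) | Numerical | Mean = 14.879; Variance = 10.311 |
| | ICD-10 code | Categorical | I25.101 (57.6%); I25.105 (29.3%); Other (13.1%) |
| | Metoprolol use | Categorical | Yes (1,197, 32.7%); No (2,463, 67.3%) |
| | Irbesartan use | Categorical | Yes (215, 5.9%); No (3,445, 94.1%) |
| | Surgery | Categorical | Yes (201, 5.5%); No (3,459, 94.5%) |

Descriptive Statistics of Structured Data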
Examples of Unstructured Data
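| Variable | χ2 | Spearman correlation coefficient | p |
|---|---|---|---|
| Gender | 1.493 | - | 0.224 |
| Surgery | 34.905 | - | 0.000 |
| ICD-10 code | 81.145 | - | 0.000 |
| Marital status | 41.540 | - | 0.000 |
| Metoprolol use | 88.998 | - | 0.000 |
| Irbesartan use | 5.284 | - | 0.024 |
| Length of stay (days) | - | 0.391 | 0.000 |
| Systolic blood pressure | - | -0.095 | 0.000 |
| Diastolic blood pressure | - | -0.163 | 0.000 |
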
Correlation Analysis of Independent Variable and Dependent Variable
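
The χ2 column above comes from tests of association between each categorical variable and the readmission outcome. As a rough illustration of how such a statistic is computed, the following minimal sketch evaluates a Pearson chi-square test on a 2x2 contingency table. The counts in `toy_counts` are made up for illustration and are not taken from the paper; the function name `chi_square_2x2`, the absence of a continuity correction, and the closed-form p-value (valid only for one degree of freedom) are assumptions of this sketch, not the authors' stated procedure.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square statistic (no continuity correction) and p-value.

    `table` is a 2x2 list of observed counts. The p-value formula below
    is only valid for 1 degree of freedom, i.e. a 2x2 table.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)

    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected

    # Survival function of the chi-square distribution with 1 d.o.f.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Hypothetical counts, NOT from the paper.
# Rows: variable = yes / no; columns: readmitted = yes / no.
toy_counts = [[10, 20], [20, 10]]
stat, p_value = chi_square_2x2(toy_counts)
print(f"chi2 = {stat:.3f}, p = {p_value:.4f}")
```

Variables with more than two categories, such as the ICD-10 code, have more degrees of freedom and would need a general chi-square survival function instead of the closed form used here.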
Architecture of SUCM Model
| Parameter | Value |
|---|---|
| Word embedding dimension | 64 |
| Sentence dimension | 300 |
| Number of filters | 32 |
| Filter length | 5 |
| Learning rate | 0.001 |

Parameter Settings of SUCM Model
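
To make the parameter settings more concrete, the sketch below records them in a dictionary and works out the output length and weight count of a single 1D convolution applied to an embedded character sequence. Several things here are assumptions rather than facts from the paper: that the sentence dimension of 300 is the number of characters kept per record, that the convolution uses stride 1 with no padding, and that there is one such layer. The names `SUCM_PARAMS`, `conv1d_output_length`, and `conv1d_param_count` are hypothetical.

```python
# Values copied from the parameter-settings table; key names are hypothetical.
SUCM_PARAMS = {
    "embedding_dim": 64,     # Word Embedding
    "sentence_dim": 300,     # Sentences Dimension (assumed: characters per record)
    "num_filters": 32,       # Number of Filter
    "filter_length": 5,      # Filter Length
    "learning_rate": 0.001,  # Learning Rate
}


def conv1d_output_length(seq_len, kernel, stride=1, padding=0):
    """Number of positions a 1D convolution produces along the sequence."""
    return (seq_len + 2 * padding - kernel) // stride + 1


def conv1d_param_count(in_channels, num_filters, kernel):
    """Weights plus one bias per filter for a 1D convolution."""
    return num_filters * (in_channels * kernel + 1)


# Assumed setting: stride 1, no padding, a single convolutional layer.
out_len = conv1d_output_length(
    SUCM_PARAMS["sentence_dim"], SUCM_PARAMS["filter_length"]
)
n_params = conv1d_param_count(
    SUCM_PARAMS["embedding_dim"],
    SUCM_PARAMS["num_filters"],
    SUCM_PARAMS["filter_length"],
)
print(f"feature map: {out_len} x {SUCM_PARAMS['num_filters']}, parameters: {n_params}")
```

Under these assumptions a 300 x 64 embedded input yields a 296 x 32 feature map; a different padding or stride choice would change those numbers.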
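| Data | Model | ACC | F1 | P | R | AUC |
|---|---|---|---|---|---|---|
| Structured data | NB | 0.704 | 0.649 | 0.705 | 0.639 | 0.701 |
| | SVM | 0.717 | 0.647 | 0.746 | 0.619 | 0.713 |
| | LR | 0.711 | 0.650 | 0.730 | 0.633 | 0.708 |
| | MS(DL) | 0.728 | 0.651 | 0.771 | 0.604 | 0.723 |
| Unstructured data | NB | 0.718 | 0.624 | 0.802 | 0.578 | 0.712 |
| | SVM | 0.743 | 0.704 | 0.771 | 0.701 | 0.742 |
| | LR | 0.731 | 0.687 | 0.762 | 0.685 | 0.729 |
| | MU(DL) | 0.743 | 0.720 | 0.751 | 0.728 | 0.743 |
| Structured + unstructured data | NB | 0.719 | 0.654 | 0.736 | 0.636 | 0.715 |
| | SVM | 0.745 | 0.710 | 0.771 | 0.713 | 0.744 |
| | LR | 0.734 | 0.687 | 0.760 | 0.689 | 0.732 |
| | SUCM(DL) | 0.754 | 0.735 | 0.752 | 0.749 | 0.754 |
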
Experimental Results
Model Performance Improvement after Structured Data Combined with Unstructured Data
Model Performance Improvement after Unstructured Data Combined with Structured Data
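
The 12.9% and 2.1% figures quoted in the abstract are consistent with relative F1 gains computed from the experimental results table. The short check below reproduces that arithmetic from the MS(DL), MU(DL), and SUCM(DL) rows; the helper name `relative_gain` is hypothetical, and reading the percentages as relative (rather than absolute) improvements is an interpretation that the table values support.

```python
def relative_gain(new, baseline):
    """Relative improvement of `new` over `baseline`, in percent."""
    return (new - baseline) / baseline * 100.0


# F1-scores of the deep learning models in the experimental results table.
f1_structured = 0.651    # MS(DL), structured data only
f1_unstructured = 0.720  # MU(DL), unstructured data only
f1_combined = 0.735      # SUCM(DL), both data types

gain_vs_structured = relative_gain(f1_combined, f1_structured)
gain_vs_unstructured = relative_gain(f1_combined, f1_unstructured)

print(f"{gain_vs_structured:.1f}%")    # 12.9%
print(f"{gain_vs_unstructured:.1f}%")  # 2.1%
```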
| Activation function | ACC | F1 | P | R | AUC |
|---|---|---|---|---|---|
| ReLU | 0.754 | 0.735 | 0.752 | 0.749 | 0.754 |
| ELU | 0.750 | 0.728 | 0.767 | 0.724 | 0.749 |
| Swish | 0.748 | 0.727 | 0.752 | 0.736 | 0.747 |

Experimental Results with Different Activation Functions
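
For readers unfamiliar with the three activation functions compared above, the following sketch gives their standard scalar definitions. The defaults `alpha=1.0` for ELU and `beta=1.0` for Swish are common conventions assumed here; the page does not state which values the authors used.

```python
import math


def relu(x):
    """Rectified linear unit: passes positive inputs, zeroes out the rest."""
    return max(0.0, x)


def elu(x, alpha=1.0):
    """Exponential linear unit: identity for x > 0, smooth saturation to -alpha below."""
    return x if x > 0 else alpha * (math.exp(x) - 1.0)


def swish(x, beta=1.0):
    """Swish: the input scaled by a sigmoid gate, x * sigmoid(beta * x)."""
    return x / (1.0 + math.exp(-beta * x))


for value in (-2.0, 0.0, 2.0):
    print(value, relu(value), round(elu(value), 4), round(swish(value), 4))
```

Unlike ReLU, both ELU and Swish return small negative outputs for negative inputs, which is the main qualitative difference among the three.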
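| Number of fully connected layers | ACC | F1 | P | R | AUC | Running time (min) |
|---|---|---|---|---|---|---|
| (3, 1) | 0.750 | 0.732 | 0.760 | 0.738 | 0.750 | 1.3 |
| (3, 2) | 0.750 | 0.734 | 0.744 | 0.748 | 0.750 | 1.8 |
| (4, 1) | 0.754 | 0.735 | 0.752 | 0.749 | 0.754 | 1.2 |
| (4, 2) | 0.738 | 0.725 | 0.732 | 0.754 | 0.739 | 1.9 |
| (5, 1) | 0.741 | 0.725 | 0.739 | 0.740 | 0.741 | 1.4 |
| (5, 2) | 0.749 | 0.730 | 0.750 | 0.743 | 0.749 | 1.7 |
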
Experimental Results with Different Number of Fully Connected Layers
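| Model | ACC | F1 | P | R | AUC | Running time (min) |
|---|---|---|---|---|---|---|
| MU(Char-CNN) | 0.743 | 0.720 | 0.751 | 0.728 | 0.743 | 1.0 |
| MU(Word2Vec) | 0.728 | 0.715 | 0.720 | 0.732 | 0.728 | 2.3 |
| SUCM(Char-CNN) | 0.754 | 0.735 | 0.752 | 0.749 | 0.754 | 1.2 |
| SUCM(Word2Vec) | 0.729 | 0.713 | 0.733 | 0.718 | 0.729 | 4.3 |
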
Experimental Results with Different Text Processing Methods