Data Analysis and Knowledge Discovery  2019, Vol. 3 Issue (12): 70-75    DOI: 10.11925/infotech.2096-3467.2019.0691
Predicting Stroke Risks with Neural Network
Juhua Wu1, Shuo Zhang1, Lei Tao1, Shunjun Jiang2
1 School of Management, Guangdong University of Technology, Guangzhou 510520, China
2 The First Affiliated Hospital of Guangzhou Medical University, Guangzhou 510120, China
Abstract  

[Objective] This paper aims to predict stroke risk effectively, in order to improve the diagnosis, treatment, and intervention of stroke. [Methods] First, we collected about 6000 inpatient medical records from a top hospital. Second, we identified 12 risk factors for stroke using logistic regression. Third, we constructed a multi-layer neural network model to predict stroke risk. Finally, we implemented the model in Python to evaluate its effectiveness. [Results] I. Total cholesterol and low-density lipoprotein, among others, are the most important risk factors for the onset of stroke. II. With 7 neurons in the hidden layer, the risk prediction model reached an accuracy of 97.10%. [Limitations] More risk factors need to be included, and multiple machine learning models should be compared. [Conclusion] The proposed model can effectively predict patients' stroke risk.
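
The network shape described in the abstract (12 risk factors as inputs, one hidden layer of 7 neurons, one risk output) can be illustrated with a minimal forward-pass sketch. The sigmoid activation, the single hidden layer, the random untrained weights, and every function name below are assumptions made for illustration; the abstract does not give these details, and this is not the authors' implementation.

```python
import math
import random

N_INPUTS = 12  # number of risk factors reported in the abstract
N_HIDDEN = 7   # hidden-layer size reported in the abstract


def sigmoid(z):
    """Logistic activation (an assumed choice, not stated in the abstract)."""
    return 1.0 / (1.0 + math.exp(-z))


def init_layer(n_in, n_out, rng):
    """Return n_out neurons, each a list of n_in weights plus a trailing bias."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
            for _ in range(n_out)]


def layer_forward(layer, inputs):
    """Apply one fully connected sigmoid layer to a list of inputs."""
    return [sigmoid(sum(w * x for w, x in zip(neuron[:-1], inputs)) + neuron[-1])
            for neuron in layer]


def predict_risk(hidden_layer, output_layer, features):
    """Map 12 scaled features to a score in (0, 1)."""
    if len(features) != N_INPUTS:
        raise ValueError(f"expected {N_INPUTS} features, got {len(features)}")
    hidden_activations = layer_forward(hidden_layer, features)
    return layer_forward(output_layer, hidden_activations)[0]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
hidden_layer = init_layer(N_INPUTS, N_HIDDEN, rng)
output_layer = init_layer(N_HIDDEN, 1, rng)

example_features = [0.5] * N_INPUTS  # hypothetical, already scaled to [0, 1]
risk = predict_risk(hidden_layer, output_layer, example_features)
print(f"untrained risk score: {risk:.3f}")
```

With untrained weights the score carries no clinical meaning; the sketch only shows how 12 inputs flow through 7 hidden neurons to a single output.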

Key words: Stroke      Risk Prediction Model      Neural Network      Data Analysis
Received: 17 June 2019      Published: 25 December 2019
ZTFLH:  TP393  
Corresponding Authors: Juhua Wu     E-mail: 25973212@qq.com

Cite this article:

Juhua Wu, Shuo Zhang, Lei Tao, Shunjun Jiang. Predicting Stroke Risks with Neural Network. Data Analysis and Knowledge Discovery, 2019, 3(12): 70-75.

URL:

https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.2096-3467.2019.0691     OR     https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2019/V3/I12/70

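Researcher | Year | Risk factors | Method
Feigin et al.[9] | 2015 | Ambient particulate matter pollution, solid fuel pollution, lead exposure, high-sodium diet, high-sugar diet, low intake of fruit, vegetables, and grains, alcohol use, physical activity, secondhand smoke, high body mass index, high fasting blood glucose, high systolic blood pressure, high total cholesterol, glomerular filtration rate | Bayesian regression equation
Jusuf et al.[10] | 2016 | Systolic blood pressure, diastolic blood pressure, triglycerides, history of stroke, history of hypertension, history of dyslipidemia, vegetable consumption, sleep duration, snoring, exercise, emotional stress | Discriminant analysis
Hijazi et al.[4] | 2016 | Age, cardiac biomarkers (N-terminal fragment B-type natriuretic peptide, high-sensitivity cardiac troponin), and prior stroke | Regression
Aigner et al.[11] | 2017 | Hypertension, hyperlipidemia, diabetes, coronary heart disease, smoking, heavy episodic drinking, low physical activity, and obesity | Logistic regression
Wang et al.[12] | 2017 | Hypertension, diabetes, heart disease, family history of stroke, hyperlipidemia, overweight, smoking, physical exercise | Fixed-effects model
Du Qiuming et al.[13] | 2018 | Morning blood pressure surge, carotid plaque formation, nighttime mean systolic blood pressure, daytime blood pressure load, low-density lipoprotein | Logistic regression
Wang et al.[5] | 2018 | Hypertension, diabetes, hyperlipidemia, heart disease or atrial fibrillation, obesity, smoking, physical exercise, family history of stroke | Health risk assessment model
Shao Zeguo et al.[6] | 2018 | Hypertension, diabetes, cholesterol, excess body mass index, low intake of fruit and vegetables, smoking, alcohol use, excessive intake of meat or milk, little exercise | Decision tree
Navis et al.[14] | 2018 | Younger (<80): diabetes, smoking; older (≥80): hypertension, hyperlipidemia, atrial fibrillation, coronary artery disease | Retrospective review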
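Variable | Type | Unit | Coding / value range
Demographics
Gender | Binary (unordered) | - | Male = 1; Female = 0
Age | Ordinal | - | 18-44; 45-59; 60-74; ≥75
Body mass index (BMI) | Numeric | kg/m² | [12.4, 68.0]
Laboratory indicators
Systolic blood pressure (SBP) | Continuous | mmHg | [52, 256]
Diastolic blood pressure (DBP) | Continuous | mmHg | [36, 180]
White blood cells (WBC) | Continuous | 10⁹/L | [0.15, 29.7]
Total cholesterol (TC) | Continuous | mmol/L | [0.64, 9.6]
Triglycerides (TG) | Continuous | mmol/L | [0.01, 21.3]
High-density lipoprotein (HDL) | Continuous | mmol/L | [0.20, 3.34]
Low-density lipoprotein (LDL) | Continuous | mmol/L | [0.01, 9.54]
Serum creatinine (Scr) | Continuous | μmol/L | [0, 1755]
Clinical history
Hypertension (Hype) | Binary (unordered) | - | Yes = 1; No = 0
Diabetes (Diabetes) | Binary (unordered) | - | Yes = 1; No = 0
History of stroke (HS) | Binary (unordered) | - | Yes = 1; No = 0
Heart disease (HD) | Binary (unordered) | - | Yes = 1; No = 0
Arteriosclerosis/stenosis/occlusion (HSO) | Binary (unordered) | - | Yes = 1; No = 0
Stroke (Stroke) | Binary (unordered) | - | Yes = 1; No = 0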
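
The variable codings above could be applied to a raw record roughly as follows. This is a sketch under stated assumptions: the integer codes 0 to 3 for the age groups, the min-max scaling of continuous indicators, and the raw field names (`gender`, `age`, `sbp`, `hypertension`) are hypothetical choices, since the table gives only the age ranges, the observed value ranges, and the 0/1 codes.

```python
# Upper bounds of the first three age groups in the table; ages of 75 and
# above fall in the last group. The integer codes 0..3 are an assumption.
AGE_GROUP_UPPER_BOUNDS = [44, 59, 74]

# Observed value ranges copied from the table (only three shown here).
VALUE_RANGES = {
    "SBP": (52.0, 256.0),   # mmHg
    "DBP": (36.0, 180.0),   # mmHg
    "TC": (0.64, 9.6),      # mmol/L
}


def age_group(age):
    """Return 0 for 18-44, 1 for 45-59, 2 for 60-74, 3 for 75 and above."""
    if age < 18:
        raise ValueError("the table only defines groups for ages 18 and above")
    for code, upper in enumerate(AGE_GROUP_UPPER_BOUNDS):
        if age <= upper:
            return code
    return len(AGE_GROUP_UPPER_BOUNDS)


def min_max_scale(name, value):
    """Scale a continuous indicator to [0, 1] using the table's range."""
    low, high = VALUE_RANGES[name]
    return (value - low) / (high - low)


def encode_record(record):
    """Encode a raw record (hypothetical field names) into model inputs."""
    return {
        "Gender": 1 if record["gender"] == "male" else 0,  # Male = 1; Female = 0
        "Age": age_group(record["age"]),
        "SBP": min_max_scale("SBP", record["sbp"]),
        "Hype": 1 if record["hypertension"] else 0,        # Yes = 1; No = 0
    }


encoded = encode_record(
    {"gender": "female", "age": 67, "sbp": 154, "hypertension": True}
)
print(encoded)
```

Scaling by the observed range is only one option; standardization to zero mean and unit variance would serve the same purpose of putting inputs on comparable scales.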
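Variable | Category | Non-stroke count | Non-stroke proportion | Stroke count | Stroke proportion
Gender | Male | 1,567 | 50.53% | 1,534 | 49.47%
Gender | Female | 1,489 | 63.33% | 862 | 36.67%
Age | 18-44 years | 1,116 | 95.71% | 50 | 4.29%
Age | 45-59 years | 970 | 71.17% | 393 | 28.83%
Age | 60-74 years | 626 | 38.33% | 1,007 | 61.67%
Age | ≥75 years | 344 | 26.67% | 946 | 73.33%
Hype |  | 480 | 21.79% | 1,723 | 78.21%
Hype |  | 2,576 | 79.29% | 673 | 20.71%
Diabetes |  | 161 | 19.54% | 663 | 80.46%
Diabetes |  | 2,895 | 62.55% | 1,733 | 37.45%
HS |  | 13 | 0.62% | 2,099 | 99.38%
HS |  | 3,046 | 91.20% | 294 | 8.80%
HSO |  | 0 | 0% | 830 | 100.00%
HSO |  | 4,622 | 100% | 0 | 0%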
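
The proportions in this table appear to be row percentages: each count divided by the total for that category. A small sketch, using only the gender and age rows whose category labels survive, reproduces them; the function and variable names are illustrative.

```python
def row_percentages(non_stroke, stroke):
    """Return (non-stroke %, stroke %) within one category, to 2 decimals."""
    total = non_stroke + stroke
    if total == 0:
        raise ValueError("category has no patients")
    return (round(100 * non_stroke / total, 2),
            round(100 * stroke / total, 2))


# (non-stroke count, stroke count) copied from the table.
counts = {
    ("Gender", "Male"): (1567, 1534),
    ("Gender", "Female"): (1489, 862),
    ("Age", "18-44"): (1116, 50),
    ("Age", "45-59"): (970, 393),
}

shares = {key: row_percentages(*pair) for key, pair in counts.items()}
for (variable, category), (non_stroke_pct, stroke_pct) in shares.items():
    print(f"{variable} {category}: {non_stroke_pct}% non-stroke, {stroke_pct}% stroke")
```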
Predicted / Observed | 0 (non-stroke) | 1 (stroke) | Total
0 (non-stroke) | 593 | 28 | 621
1 (stroke) | 4 | 466 | 470
Total | 597 | 494 | 1,091
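
Overall accuracy can be read off this confusion matrix as the diagonal sum divided by the grand total. The sketch below does that arithmetic; it does not depend on which axis holds the predicted values, whereas sensitivity and specificity would, so those are left out. The variable names are illustrative.

```python
# Cell counts copied from the confusion matrix, row by row
# (marginal totals omitted).
confusion_matrix = [
    [593, 28],
    [4, 466],
]


def accuracy(matrix):
    """Share of cases on the diagonal, i.e. where prediction and observation agree."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


overall_accuracy = accuracy(confusion_matrix)
print(f"accuracy = {overall_accuracy:.4f}")  # 1059 / 1091
```

The result, 1059 of 1091 cases or about 97.07%, is close to the 97.10% quoted in the abstract.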
[1] Wang Longde, Liu Jianmin, Yang Yi, et al. The Prevention and Treatment of Stroke Still Face Huge Challenges: Brief Report on Stroke Prevention and Treatment in China, 2018[J]. Chinese Circulation Journal, 2019,34(2):105-119.
[2] Kim A S, Cahill E, Cheng N T. Global Stroke Belt: Geographic Variation in Stroke Burden Worldwide[J]. Stroke, 2015,46(12):3564-3570.
[3] Moran A, Gu D, Zhao D, et al. Future Cardiovascular Disease in China: Markov Model and Risk Factor Scenario Projections from the Coronary Heart Disease Policy Model-China[J]. Circulation: Cardiovascular Quality and Outcomes, 2010,3(3):243-252.
[4] Hijazi Z, Lindbäck J, Alexander J H, et al. The ABC (Age, Biomarkers, Clinical History) Stroke Risk Score: A Biomarker-Based Risk Score for Predicting Stroke in Atrial Fibrillation[J]. European Heart Journal, 2016,37(20):1582-1590.
[5] Wang Y, Wang J, Cheng J, et al. Is the Population Detected by Screening in China Truly at High Risk of Stroke?[J]. Journal of Stroke and Cerebrovascular Diseases, 2018,27(8):2118-2123.
[6] Shao Zeguo, Chen Chen, Chen Wei. Analysis of Risk Factors of Daily Life Habits in Stroke Based on Optimal Decision Tree[J]. Modern Preventive Medicine, 2018,45(15):2689-2693.
[7] Chauhan S, Vig L, De Grazia M D F, et al. A Comparison of Shallow and Deep Learning Methods for Predicting Cognitive Performance of Stroke Patients from MRI Lesion Images[J]. Frontiers in Neuroinformatics. https://doi.org/10.3389/fninf.2019.00053.
[8] Almadani O, Alshammari R. Prediction of Stroke Using Data Mining Classification Techniques[J]. International Journal of Advanced Computer Science and Applications, 2018,9(1):457-460.
[9] Feigin V L, Mensah G A, Norrving B, et al. for the GBD 2013 Stroke Panel Experts Group. Atlas of the Global Burden of Stroke (1990-2013): The GBD 2013 Study[J]. Neuroepidemiology, 2015,45(3):230-236.
[10] Jusuf M I, Machfoed M H, Keman S. Infarction Stroke Risk Prediction Model for Indonesian Population: A Case-Control Study[J]. Bangladesh Journal of Medical Science, 2016,15(2):269-274.
[11] Aigner A, Grittner U, Rolfs A, et al. Contribution of Established Stroke Risk Factors to the Burden of Stroke in Young Adults[J]. Stroke, 2017,48(7):1744-1751.
[12] Wang J, Wen X, Li W, et al. Risk Factors for Stroke in the Chinese Population: A Systematic Review and Meta-analysis[J]. Journal of Stroke and Cerebrovascular Diseases, 2017,26(3):509-517.
[13] Du Qiuming, Cao Shuhua, Wang Shuliang, et al. Analysis of Influencing Factors of Acute Cerebral Infarction in Patients with Hypertension[J]. Chinese Journal of Prevention and Control of Chronic Diseases, 2018,26(2):133-137.
[14] Navis A, Garcia-Santibanez R, Skliut M. Epidemiology and Outcomes of Ischemic Stroke and Transient Ischemic Attack in the Adult and Geriatric Population[J]. Journal of Stroke and Cerebrovascular Diseases, 2018,28(1):84-89.
[15] Li Min, Wang Chunxia, Xia Bing, et al. Risk Prediction Model for Stroke in Health Management Population[J]. Journal of Shandong University: Medical Sciences, 2017,55(6):93-97, 103.
[16] Cai R, Zhu B, Ji L, et al. An CNN-LSTM Attention Approach to Understanding User Query Intent from Online Health Communities[C]// Proceedings of 2017 IEEE International Conference on Data Mining Workshops (ICDMW). IEEE, 2017: 430-437.
[17] Mackay J, Mensah G A, Greenlund K. The Atlas of Heart Disease and Stroke[M]. World Health Organization, 2004.
[18] Huang S C, Huang Y F. Bounds on the Number of Hidden Neurons in Multilayer Perceptrons[J]. IEEE Transactions on Neural Networks, 1991,2(1):47-55.
[19] Piri S, Delen D, Liu T, et al. A Data Analytics Approach to Building a Clinical Decision Support System for Diabetic Retinopathy: Developing and Deploying a Model Ensemble[J]. Decision Support Systems, 2017,101:12-27.
[20] Yang X, Li J, Hu D, et al. Predicting the 10-year Risks of Atherosclerotic Cardiovascular Disease in Chinese Population: The China-PAR Project (Prediction for ASCVD Risk in China)[J]. Circulation, 2016,134(19):1430-1440.
[21] Agarwal R, Dhar V. Big Data, Data Science, and Analytics: The Opportunity and Challenge for IS Research[J]. Information Systems Research, 2014,25(3):443-448.
[22] Lin Y K, Chen H, Brown R A, et al. Health Care Predictive Analytics for Risk Profiling in Chronic Care: A Bayesian Multitask Learning Approach[J]. MIS Quarterly, 2017,41(2):473-495.
[1] Gu Yaowen, Zhang Bowen, Zheng Si, Yang Fengchun, Li Jiao. Predicting Drug ADMET Properties Based on Graph Attention Network[J]. Data Analysis and Knowledge Discovery, 2021, 5(8): 76-85.
[2] Zhang Le, Leng Jidong, Lv Xueqiang, Cui Zhuo, Wang Lei, You Xindong. RLCPAR: A Rewriting Model for Chinese Patent Abstracts Based on Reinforcement Learning[J]. Data Analysis and Knowledge Discovery, 2021, 5(7): 59-69.
[3] Han Pu, Zhang Zhanpeng, Zhang Mingtao, Gu Liang. Normalizing Chinese Disease Names with Multi-feature Fusion[J]. Data Analysis and Knowledge Discovery, 2021, 5(5): 83-94.
[4] Wang Nan, Li Hairong, Tan Shuru. Predicting of Public Opinion Reversal with Improved SMOTE Algorithm and Ensemble Learning[J]. Data Analysis and Knowledge Discovery, 2021, 5(4): 37-48.
[5] Li Danyang, Gan Mingxin. Music Recommendation Method Based on Multi-Source Information Fusion[J]. Data Analysis and Knowledge Discovery, 2021, 5(2): 94-105.
[6] Ding Hao, Ai Wenhua, Hu Guangwei, Li Shuqing, Suo Wei. A Personalized Recommendation Model with Time Series Fluctuation of User Interest[J]. Data Analysis and Knowledge Discovery, 2021, 5(11): 45-58.
[7] Yin Haoran, Cao Jinxuan, Cao Luzhe, Wang Guodong. Identifying Emergency Elements Based on BiGRU-AM Model with Extended Semantic Dimension[J]. Data Analysis and Knowledge Discovery, 2020, 4(9): 91-99.
[8] Qiu Erli, He Hongwei, Yi Chengqi, Li Huiying. Research on Public Policy Support Based on Character-level CNN Technology[J]. Data Analysis and Knowledge Discovery, 2020, 4(7): 28-37.
[9] Liu Weijiang, Wei Hai, Yun Tianhe. Evaluation Model for Customer Credits Based on Convolutional Neural Network[J]. Data Analysis and Knowledge Discovery, 2020, 4(6): 80-90.
[10] Wang Mo, Cui Yunpeng, Chen Li, Li Huan. A Deep Learning-based Method of Argumentative Zoning for Research Articles[J]. Data Analysis and Knowledge Discovery, 2020, 4(6): 60-68.
[11] Yan Chun, Liu Lu. Classifying Non-life Insurance Customers Based on Improved SOM and RFM Models[J]. Data Analysis and Knowledge Discovery, 2020, 4(4): 83-90.
[12] Su Chuandong, Huang Xiaoxi, Wang Rongbo, Chen Zhiqun, Mao Junyu, Zhu Jiaying, Pan Yuhao. Identifying Chinese / English Metaphors with Word Embedding and Recurrent Neural Network[J]. Data Analysis and Knowledge Discovery, 2020, 4(4): 91-99.
[13] Xu Yuemei, Liu Yunwen, Cai Lianqiao. Predicting Retweets of Government Microblogs with Deep-combined Features[J]. Data Analysis and Knowledge Discovery, 2020, 4(2/3): 18-28.
[14] Xiang Fei, Xie Yaotan. Recognition Model of Patient Reviews Based on Mixed Sampling and Transfer Learning[J]. Data Analysis and Knowledge Discovery, 2020, 4(2/3): 39-47.
[15] Ni Weijian, Guo Haoyu, Liu Tong, Zeng Qingtian. Online Product Recommendation Based on Multi-Head Self-Attention Neural Networks[J]. Data Analysis and Knowledge Discovery, 2020, 4(2/3): 68-77.