Prediction and Early Warning Model for Environmental Data and Circulatory System Disease Death with Machine Learning
Wang Yan, Xu Meimei, Tong Yujia, Gou Huan, Cai Rong, Shan Zhiyi, An Xinying
Institute of Medical Information, Chinese Academy of Medical Sciences/Peking Union Medical College, Beijing 100020, China
|
|
Abstract [Objective] This paper builds a prediction and early warning model for deaths from circulatory system diseases, aiming to improve disease prevention. [Methods] We retrieved death data for circulatory system diseases in a region of China from 2014 to 2018 and constructed prediction models with the generalized additive model (GAM), random forest (RF) and XGBoost. We then used the distributed lag nonlinear model to calculate cumulative lag effects and built the early warning model from them. [Results] Sustained low and high temperatures, high sunshine hours and high concentrations of environmental pollutants increased the risk of death from circulatory system diseases. The cumulative weekly relative risks were 1.236, 1.130, 1.560, 1.062, 1.218, 1.153 and 1.796, respectively. The root mean square errors (RMSE) of the RF and XGBoost models were 4.979 and 5.341, respectively, indicating good performance. Age, sex, temperature, sunshine hours, and SO2, NO2, CO, O3, PM10 and PM2.5 concentrations were the feature variables, and the early warning value was determined from the cumulative lag effect data. The early warning performed well: the sensitivity, specificity and area under the curve of the XGBoost predictions were 0.948, 0.939 and 0.941, respectively. [Limitations] Data on concomitant diseases and their progression still need to be added. [Conclusions] The regional number of deaths is associated with older age, male sex, temperature, sunshine hours and pollutant concentration. The new prediction and early warning model could benefit disease prevention and intervention.
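The evaluation measures named in the abstract (RMSE, sensitivity and specificity) can be illustrated with a minimal sketch. It is not the authors' code: the function names, the toy weekly counts and the warning threshold of 20 are all hypothetical, and the threshold here is simply fixed by hand rather than derived from cumulative lag effects as in the paper.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between two equal-length sequences."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def warning_metrics(observed, predicted, threshold):
    """Sensitivity and specificity of a threshold-based warning.

    A period counts as a true event when the observed count reaches the
    threshold, and as a warning when the predicted count does.
    """
    tp = fp = tn = fn = 0
    for o, p in zip(observed, predicted):
        event, alarm = o >= threshold, p >= threshold
        if event and alarm:
            tp += 1
        elif event:
            fn += 1
        elif alarm:
            fp += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn) if tp + fn else float("nan")
    specificity = tn / (tn + fp) if tn + fp else float("nan")
    return sensitivity, specificity


# Toy numbers invented for illustration only; they are not the study's data.
observed = [10, 12, 30, 8, 25, 9]
predicted = [11, 10, 28, 9, 18, 21]
THRESHOLD = 20  # hypothetical warning threshold, chosen by hand

error = rmse(observed, predicted)
sensitivity, specificity = warning_metrics(observed, predicted, THRESHOLD)
```

With these toy inputs, one of the two event periods is flagged and three of the four non-event periods stay silent, which shows how a model can have a modest RMSE and still miss warnings near the threshold.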
|
Received: 05 January 2022
Published: 16 November 2022
|
|
Fund: Medical and Health Science and Technology Innovation Project of Chinese Academy of Medical Sciences (2021-I2M-1-033)
Corresponding Author: An Xinying, ORCID: 0000-0002-9870-7009
E-mail: an.xinying@imicams.ac.cn
|