Predicting Major Infectious Diseases Based on Grey Wolf Optimization and Multi-machine Learning: Case Study of COVID-19*

Data Analysis and Knowledge Discovery

2022, Vol. 6, Issue (8): 122-133 https://doi.org/10.11925/infotech.2096-3467.2021.1269

Research Paper

Qu Zongxi, Sha Yongzhong, Li Yutong

School of Management, Lanzhou University, Lanzhou 730099, China; Research Center for Emergency Management, Lanzhou University, Lanzhou 730099, China


Abstract

[Objective] Anticipating how a major infectious disease will develop allows countermeasures to be formulated in advance. This paper explores an ensemble forecasting approach based on multiple machine learning models to build an accurate and effective epidemic prediction model. [Methods] We established an ensemble prediction model that combines three machine learning models (ANFIS, LSSVM and LSTM), with the optimal weight coefficients of the ensemble searched by the Grey Wolf Optimization (GWO) algorithm. We then designed experiments on COVID-19 epidemic data to evaluate the model's prediction performance. [Results] ANFIS, LSSVM and LSTM were best suited to the confirmed, death and recovered case scenarios, respectively. The average R² of the GWO-based ensemble model reached 0.989, 0.993 and 0.987 in the three scenarios, and its average RMSE was 37.37%, 63.93% and 53.37% lower, respectively, than that of the individual models. [Limitations] The model needs further validation with data on other major infectious diseases. [Conclusions] Different machine learning models each have their own predictive strengths. The ensemble prediction model based on Grey Wolf Optimization can effectively combine the advantages of multiple machine learning models to obtain stable and accurate predictions.

Key words: Major Infectious Disease Outbreak; Ensemble Prediction; Grey Wolf Optimization; Machine Learning
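The weight search described under [Methods] can be illustrated with a minimal sketch. It is not the authors' implementation: the synthetic series, the three noisy stand-in "model forecasts", the RMSE fitness, the constraint that weights are non-negative and sum to one, and the function names `gwo_ensemble_weights` and `to_weights` are all assumptions made for illustration. The position update follows the basic Grey Wolf Optimizer (a control parameter `a` that decreases linearly from 2 to 0, and three leading wolves alpha, beta and delta).

```python
import numpy as np


def rmse(y_true, y_pred):
    """Root mean squared error between two 1-D arrays."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def to_weights(position):
    """Map a wolf position to non-negative weights that sum to 1 (assumed constraint)."""
    w = np.abs(position)
    total = w.sum()
    return w / total if total > 0 else np.full(w.size, 1.0 / w.size)


def gwo_ensemble_weights(preds, y_true, n_wolves=20, n_iter=100, seed=0):
    """Search combination weights with a basic Grey Wolf Optimizer.

    preds: array of shape (n_models, n_samples), one row of forecasts per model.
    """
    rng = np.random.default_rng(seed)
    n_models = preds.shape[0]

    def fitness(position):
        return rmse(y_true, to_weights(position) @ preds)

    wolves = rng.random((n_wolves, n_models))
    leaders = wolves[:3].copy()
    for it in range(n_iter):
        # Keep the three best positions seen so far: alpha, beta, delta.
        pool = np.vstack([leaders, wolves])
        scores = np.array([fitness(p) for p in pool])
        leaders = pool[np.argsort(scores)[:3]].copy()

        a = 2.0 * (1.0 - it / n_iter)  # decreases linearly from 2 towards 0
        new_wolves = np.zeros_like(wolves)
        for leader in leaders:
            r1 = rng.random(wolves.shape)
            r2 = rng.random(wolves.shape)
            A = 2.0 * a * r1 - a
            C = 2.0 * r2
            D = np.abs(C * leader - wolves)
            new_wolves += (leader - A * D) / 3.0  # average of the three pulls
        wolves = new_wolves

    pool = np.vstack([leaders, wolves])
    scores = np.array([fitness(p) for p in pool])
    return to_weights(pool[np.argmin(scores)])


# Hypothetical data: a synthetic case curve and three noisy stand-in forecasts.
data_rng = np.random.default_rng(42)
days = np.arange(60)
y_true = 100.0 + 5.0 * days + 20.0 * np.sin(days / 5.0)
preds = np.stack([
    y_true + data_rng.normal(0.0, 8.0, days.size),
    y_true + data_rng.normal(0.0, 12.0, days.size),
    y_true + data_rng.normal(3.0, 10.0, days.size),
])

weights = gwo_ensemble_weights(preds, y_true)
single_rmse = [rmse(y_true, p) for p in preds]
ensemble_rmse = rmse(y_true, weights @ preds)
print("weights:", np.round(weights, 3))
print("single-model RMSE:", np.round(single_rmse, 3))
print("ensemble RMSE:", round(ensemble_rmse, 3))
```

In this sketch the weights are fitted and scored on the same series for brevity. A realistic setup would presumably fit them on a validation window and report errors on a separate test window.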

Received: 2021-11-07 Published: 2022-09-23

CLC number (ZTFLH): R183; TP181

Funding: *This work is one of the research outcomes of a Young Scientists Fund project of the National Natural Science Foundation of China (Grant No. 72004086).

Corresponding author: Sha Yongzhong, ORCID: 0000-0002-2479-2335, E-mail: shayzh@lzu.edu.cn

Cite this article:

Qu Zongxi, Sha Yongzhong, Li Yutong. Predicting Major Infectious Diseases Based on Grey Wolf Optimization and Multi-machine Learning: Case Study of COVID-19[J]. Data Analysis and Knowledge Discovery, 2022, 6(8): 122-133.

Link to this article:

https://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/10.11925/infotech.2096-3467.2021.1269 or https://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/Y2022/V6/I8/122

Fig.1 Schematic diagram of grey wolf hunting

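Evaluation criterion	Formula
RMSE	$\left[\frac{1}{N}\sum_{n=1}^{N}\left(y_n-\hat{y}_n\right)^2\right]^{1/2}$
MAE	$\frac{1}{N}\sum_{n=1}^{N}\left|y_n-\hat{y}_n\right|$
R²	$1-\frac{\sum_{n=1}^{N}\left(y_n-\hat{y}_n\right)^2}{\sum_{n=1}^{N}\left(y_n-\bar{y}\right)^2}$

Table 1 Model performance evaluation criteria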
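The three criteria in Table 1 can be computed directly with NumPy. The sketch below is an illustration under stated assumptions: the function names and the four-point sample are hypothetical, with y_n taken as the observed value, ŷ_n as the predicted value, ȳ as the mean of the observations and N as the number of samples.

```python
import numpy as np


def rmse(y, y_hat):
    """Root mean squared error: square root of the mean squared residual."""
    y, y_hat = np.asarray(y, dtype=float), np.asarray(y_hat, dtype=float)
    return float(np.sqrt(np.mean((y - y_hat) ** 2)))


def mae(y, y_hat):
    """Mean absolute error: mean of the absolute residuals."""
    y, y_hat = np.asarray(y, dtype=float), np.asarray(y_hat, dtype=float)
    return float(np.mean(np.abs(y - y_hat)))


def r2(y, y_hat):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y, y_hat = np.asarray(y, dtype=float), np.asarray(y_hat, dtype=float)
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)


# Hypothetical observations and predictions, for illustration only.
y_obs = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.1, 1.9, 3.2, 3.8]

print(f"RMSE = {rmse(y_obs, y_pred):.4f}")  # RMSE = 0.1581
print(f"MAE  = {mae(y_obs, y_pred):.4f}")   # MAE  = 0.1500
print(f"R2   = {r2(y_obs, y_pred):.4f}")    # R2   = 0.9800
```

Lower RMSE and MAE indicate smaller errors, while R² closer to 1 indicates that the predictions explain more of the variance in the observations.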

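Table 2 Parameter settings of the machine learning models

Fig.2 Comparison of prediction errors of different machine learning models

Table 3 Ensemble model weights obtained by GWO optimization

Table 4 Comparison of prediction errors of each model for Brazil

Table 5 Comparison of prediction errors of each model for Germany

Table 6 Comparison of prediction errors of each model for India

Fig.3 Predicted versus actual values of different models for Brazil

Fig.4 Predicted versus actual values of different models for Germany

Fig.5 Predicted versus actual values of different models for India

Table 7 Diebold-Mariano (DM) test results of the models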

