Comparing Prediction Models for Prostate Cancer
Che Hongxin, Wang Tong, Wang Wei
School of Public Health, Jilin University, Changchun 130021, China
Abstract [Objective] This paper compares the performance of prostate cancer prediction models built with ensemble and non-ensemble learning algorithms, aiming to identify the best-performing algorithm and the key risk factors for the disease. [Methods] First, we constructed prediction models with the K-Nearest Neighbor, Decision Tree, Support Vector Machine, and BP neural network algorithms. Then, we built prediction models based on AdaBoost, GradientBoost, and XGBoost. Finally, we identified risk factors for prostate cancer using the two groups of models. [Results] Among the non-ensemble models, the Decision Tree model performed best, with an accuracy of 0.9333, an F1 score of 0.9301, and an AUC of 0.9145. Among the ensemble models, the XGBoost model performed best, with an accuracy of 0.9573, an F1 score of 0.9624, and an AUC of 0.9513. We found nine important risk factors for prostate cancer, including total PSA and free PSA. [Limitations] The experimental data set and the range of model-building algorithms need to be expanded. [Conclusions] Ensemble learning algorithms outperform non-ensemble ones in predicting prostate cancer and identifying its risk factors.
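The three metrics reported in the abstract (accuracy, F1 score, and AUC) can be illustrated with a minimal sketch. The labels and scores below are hypothetical toy values, not the study's data, and the function names are chosen here for illustration only; the sketch assumes binary labels (1 = cancer, 0 = no cancer) and a 0.5 decision threshold.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def auc(y_true, scores):
    """Area under the ROC curve, computed as the probability that a randomly
    chosen positive sample scores higher than a randomly chosen negative one
    (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: true labels and model scores (assumed, not from the paper).
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.7, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

print("accuracy:", accuracy(y_true, y_pred))
print("F1 score:", f1_score(y_true, y_pred))
print("AUC:", auc(y_true, scores))
```

Accuracy and F1 depend on the chosen threshold, while AUC depends only on how the scores rank positive cases against negative ones, which is one reason studies of this kind tend to report all three.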
Received: 29 November 2020
Published: 15 October 2021
Fund: Interdisciplinary Research Funding Program for Doctoral Students of Jilin University (101832020DJX081)
Corresponding Author: Wang Wei, E-mail: w_w@jlu.edu.cn