[Objective] This paper identifies potential targets of antineoplastic drugs to provide a reference for future clinical and experimental work. [Methods] First, we retrieved the targets of antineoplastic drugs from the DrugBank database and combined them with protein interaction data from the HPRD database. Second, we built a protein-protein interaction (PPI) network for these targets with Cytoscape and calculated the topological properties of its nodes. Third, we selected the topological attributes to use as variables with univariate (single-factor) analysis in SPSS and the information gain criterion in Weka. Fourth, we applied the SMOTE algorithm to the unbalanced data set and built a prediction model for antineoplastic drug targets with the decision tree method. Finally, we compared the performance of the new model with that of classic models. [Results] The precision of the proposed model reached 73.18%. Sixteen targets received prediction scores higher than 0.9. According to cBioPortal, these targets are mutated and amplified in various tumors; NR5A1 was analyzed as a case study. [Limitations] Target function characteristics, sequence attributes, and other factors should also be included in the model. [Conclusions] The proposed model effectively predicts potential targets of antineoplastic drugs.
Fan Xinyue, Cui Lei. Predicting Antineoplastic Drug Targets Based on Network Properties[J]. Data Analysis and Knowledge Discovery, 2018, 2(12): 98-108.
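The abstract names SMOTE as the technique used to handle the unbalanced data set but gives no implementation details. The following is a minimal sketch of SMOTE-style interpolation, assuming each protein is represented by a numeric vector of topological attributes. The function name `smote_oversample`, its parameters, and the toy feature values are hypothetical and are not taken from the paper.

```python
import math
import random


def smote_oversample(minority, n_new, k=3, seed=0):
    """Create n_new synthetic minority samples by SMOTE-style interpolation.

    Hypothetical helper for illustration; not the authors' implementation.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # The k nearest minority neighbours of the base sample (excluding itself).
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        # A random position on the segment from base to the chosen neighbour.
        gap = rng.random()
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Made-up feature vectors: (degree, clustering coefficient) of known targets.
known_targets = [(12.0, 0.10), (15.0, 0.08), (9.0, 0.15), (20.0, 0.05)]
new_points = smote_oversample(known_targets, n_new=6, k=2)
```

Each synthetic point lies on a line segment between two existing minority samples, so the minority class is enlarged without duplicating records exactly. In practice the synthetic samples would be added to the training set only, before fitting the decision tree.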
| Gene | Protein | Mutation (cancer type, frequency) | Amplification (cancer type, frequency) |
| --- | --- | --- | --- |
|  | Mediator of RNA polymerase II transcription subunit 1 | Cervical Cancer (4.6%) | Breast Cancer, NOS (31.25%) |
| THRB | Thyroid hormone receptor beta | Cutaneous Melanoma (5.23%) | Prostate Cancer, NOS (21.54%) |
| NCS1 | Neuronal calcium sensor 1 | Endometrial Cancer (0.59%) | Prostate Cancer, NOS (13.85%) |
| NR3C2 | Mineralocorticoid receptor | Ovarian/Fallopian Tube Cancer, NOS (14.29%) | Prostate Cancer, NOS (15.38%) |
| TUB | Tubby protein homolog | Endometrial Cancer (4.08%) | Prostate Cancer, NOS (9.23%) |
| IL2 | Interleukin-2 | Cutaneous Melanoma (1.05%) | Prostate Cancer, NOS (7.69%) |

Table: Mutation and amplification of drug targets with Score ≥ 0.9 in cancer tissues
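Figure: First-order neighbor subnetwork of NR5A1
Figure: Expression of NR5A1 in different types of cancer
Figure: Expression of NFKB1 in different types of cancer
Figure: Expression of NCOA1 in different types of cancer
Figure: Expression of MAPK1 in different types of cancer
Figure: Expression of JUN in different types of cancer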
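To make the notion of a first-order neighbor subnetwork concrete, here is a minimal sketch that extracts one from an undirected edge list and computes two common node-level topological properties, degree and clustering coefficient. The node names `P0` to `P4` and the edges are placeholders, not interactions from HPRD, and the function name is hypothetical.

```python
def first_order_subnetwork(edges, center):
    """Return the center's direct neighbours and the edges among them and the center."""
    neighbours = set()
    for u, v in edges:
        if u == center and v != center:
            neighbours.add(v)
        elif v == center and u != center:
            neighbours.add(u)
    nodes = neighbours | {center}
    sub_edges = [(u, v) for u, v in edges if u in nodes and v in nodes]
    return neighbours, sub_edges


# Placeholder undirected PPI edges (illustrative only).
edges = [("P0", "P1"), ("P0", "P2"), ("P1", "P2"), ("P2", "P3"), ("P3", "P4")]
neighbours, sub_edges = first_order_subnetwork(edges, "P0")

# Degree: the number of direct interaction partners of the center node.
degree = len(neighbours)

# Clustering coefficient: the fraction of neighbour pairs that are themselves linked.
links_between_neighbours = sum(
    1 for u, v in sub_edges if u in neighbours and v in neighbours and u != v
)
possible_links = degree * (degree - 1) / 2
clustering = links_between_neighbours / possible_links if possible_links else 0.0
```

Tools such as Cytoscape report these and further properties for every node; the sketch only shows what the two quantities measure.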
| Gene | Melanoma | Adrenocortical Carcinoma | Endometrial Cancer | Esophagogastric Cancer | Colorectal Adenocarcinoma | Cancer of Unknown Primary |
| --- | --- | --- | --- | --- | --- | --- |
| AR | 2.09% (2.79%a,18b) | 1.97% (1.97%,20) | 6.08% (6.68%,4) | 4.09% (4.75%,11) | 4.52% (4.52%,9) | 5.14% (5.24%,7) |
| NCOA1 | 2.79% (3.83%,6) | 1.97% (2.46%,9) | 5.34% (7.42%,2) | 2.47% (3.20%,7) | 3.55% (3.55%,4) | 4.40% (5.99%,3) |
| JUN | 0.35% (1.05%,15) | 0 (0.99,-) | 0.59% (0.96%,8) | 0.76% (1.15%,7) | 1.94% (1.94%,2) | 0.47% (3.27%,11) |
| MAPK1 | 1.39% (3.48%,4) | 0.99% (2.96%,8) | 1.19% (2.23%,6) | 0.49% (1.55%,16) | 0.65% (0.96%,14) | 0.84% (6.08%,9) |
| NFKB1 | 2.09% (2.79%,4) | 0.99% (0.99%,8) | 3.86% (4.15%,2) | 0.73% (0.89%,13) | 2.91% (2.91%,3) | 6.74% (8.23%,1) |

Table: Mutation frequencies of five antineoplastic drug targets in different cancer tissues
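| Gene | Cases | Amplification cases | Ratio |
| --- | --- | --- | --- |
| AR | 65 | 38 | 58.46% |
| NR5A1 | 65 | 11 | 16.92% |
| NCOA1 | 65 | 8 | 12.31% |
| JUN | 65 | 6 | 9.23% |
| MAPK1 | 65 | 5 | 7.69% |
| NFKB1 | 65 | 4 | 6.15% |

Table: Amplification rates of six genes in prostate cancer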
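The ratio column in the table above is the number of amplification cases divided by the total number of cases. The short sketch below recomputes it from the listed counts; the function name `amplification_rate` is a hypothetical helper, and rounding to two decimals is an assumption based on the precision shown in the table.

```python
# (gene, total cases, amplification cases) as listed in the table.
rows = [
    ("AR", 65, 38),
    ("NR5A1", 65, 11),
    ("NCOA1", 65, 8),
    ("JUN", 65, 6),
    ("MAPK1", 65, 5),
    ("NFKB1", 65, 4),
]


def amplification_rate(amplified, total):
    """Return the amplification rate as a percentage rounded to two decimals."""
    return round(100.0 * amplified / total, 2)


rates = {gene: amplification_rate(amp, total) for gene, total, amp in rows}
for gene, rate in rates.items():
    print(f"{gene}: {rate:.2f}%")
```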