Data Analysis and Knowledge Discovery  2021, Vol. 5 Issue (8): 76-85    DOI: 10.11925/infotech.2096-3467.2021.0233
Predicting Drug ADMET Properties Based on Graph Attention Network
Gu Yaowen1, Zhang Bowen2, Zheng Si1, Yang Fengchun1, Li Jiao1
1Institute of Medical Information, Chinese Academy of Medical Sciences, Beijing 100020, China
2XtalPi AI Research Center, Beijing 100089, China
Abstract  

[Objective] This study builds a prediction model for the ADMET properties of drugs (Absorption, Distribution, Metabolism, Excretion, and Toxicity) to support drug evaluation in virtual screening. [Methods] We constructed a drug ADMET prediction model based on the Graph Attention Network (GAT). We collected drug ADMET property data from open-access databases and scientific publications and built molecular graphs from the compounds' structures. Finally, we compared the GAT-based model with three machine learning models and two graph neural network models. [Results] We collected 9 datasets with 149 457 ADMET records. Across the 9 datasets, the proposed model achieved an average accuracy of 0.825 and an average F1-Score of 0.672, up to 6.4% and 26.0% higher than those of the baseline models. [Limitations] The data cleaning process needs refinement, and a pre-training architecture could further improve prediction performance. [Conclusions] The proposed model can effectively predict drug ADMET properties, which could support virtual drug screening and computer-aided drug development.
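To make the attention step behind a graph attention network concrete, the following is a minimal single-head sketch in plain Python. It is not the authors' implementation: the toy graph (the heavy atoms of ethanol, C-C-O), the one-hot atom features, and the weights `W` and `a` are all illustrative assumptions, and the output nonlinearity and multi-head averaging of a full GAT layer are omitted.

```python
import math

def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x

def matvec(W, h):
    """Multiply matrix W (list of rows) by vector h."""
    return [sum(w * x for w, x in zip(row, h)) for row in W]

def gat_layer(features, neighbors, W, a):
    """One single-head attention update over a graph.

    features:  one feature vector per node (atom)
    neighbors: adjacency list, neighbors[i] = nodes bonded to node i
    W:         shared linear transform applied to every node
    a:         attention vector scoring the concatenation [W h_i || W h_j]
    """
    z = [matvec(W, h) for h in features]
    new_features, attention = [], []
    for i, nbrs in enumerate(neighbors):
        nodes = [i] + [j for j in nbrs if j != i]  # include a self-loop
        scores = [
            leaky_relu(sum(c * v for c, v in zip(a, z[i] + z[j])))
            for j in nodes
        ]
        top = max(scores)  # subtract the max for a numerically stable softmax
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        alpha = [e / total for e in exps]
        attention.append(dict(zip(nodes, alpha)))
        new_features.append(
            [sum(w * z[j][k] for w, j in zip(alpha, nodes))
             for k in range(len(z[i]))]
        )
    return new_features, attention

# Hypothetical toy input: ethanol heavy atoms C-C-O, features [is_carbon, is_oxygen].
features = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
neighbors = [[1], [0, 2], [1]]
W = [[0.5, -0.3], [0.1, 0.8]]   # arbitrary illustrative weights
a = [0.4, -0.2, 0.3, 0.6]       # arbitrary illustrative attention vector

updated, attention = gat_layer(features, neighbors, W, a)
for node, weights in enumerate(attention):
    print(node, {j: round(w, 3) for j, w in weights.items()})
```

Each node's weights sum to 1 over itself and its bonded neighbours, so the update is a learned weighted average of neighbouring atom representations rather than a uniform one.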

Key words: Graph Neural Network; Graph Attention Network; Multi-source Heterogeneous Data; ADMET; Virtual Screening
Received: 08 March 2021      Published: 15 September 2021
CLC number: R961
Fund: National Natural Science Foundation of China (81601573); National Key Research and Development Program of China (2016YFC0901901)
Corresponding Authors: Li Jiao ORCID:0000-0001-6391-8343     E-mail: li.jiao@imicams.ac.cn

Cite this article:

Gu Yaowen, Zhang Bowen, Zheng Si, Yang Fengchun, Li Jiao. Predicting Drug ADMET Properties Based on Graph Attention Network. Data Analysis and Knowledge Discovery, 2021, 5(8): 76-85.

URL:

https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.2096-3467.2021.0233     OR     https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2021/V5/I8/76

Building Process of ADMET Prediction Model Based on Graph Attention Network
NAME SMILES VALUE
CHEMBL274189 [Cl-].[H]c1c(OC([H])([H])[H])c(OC([H])([H])[H])c([H])c2c1-c1c([H])c3c([H])c([H])c(OC([H])([H])[H])c(OC([H])([H])[H])c3c([H])[n+]1C([H])([H])C2([H])[H] 0
CHEMBL12089 [Cl-].[H]c1c2c(c([H])c3c1-c1c([H])c4c([H])c([H])c(OC([H])([H])[H])c(OC([H])([H])[H])c4c([H])[n+]1C([H])([H])C3([H])[H])OC([H])([H])O2 0
CHEMBL3353920 [H]/C(=C(C(/[H])=C(\[H])C(=O)[O-])\C([H])([H])[H])[C@]([H])(C(=O)c1c([H])c([H])c(N(C([H])([H])[H])C([H])([H])[H])c([H])c1[H])C([H])([H])[H] 0
CHEMBL67 [H]/C(=C(\[H])c1c([H])c(OC([H])([H])[H])c(OC([H])([H])[H])c(OC([H])([H])[H])c1[H])c1c([H])c([H])c(OC([H])([H])[H])c([O-])c1[H] 1
Example Records of the ADMET Dataset (LO2 Toxicity)
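The records above follow a three-column layout (compound identifier, SMILES string, label). A minimal sketch of reading that layout in plain Python follows; the `Record` type and `parse_records` helper are hypothetical names introduced for illustration, and treating VALUE as a binary class label is an assumption, since the page does not state which class 1 denotes.

```python
from typing import NamedTuple

class Record(NamedTuple):
    name: str     # compound identifier, e.g. a ChEMBL ID
    smiles: str   # SMILES string with explicit hydrogens
    label: int    # assumed binary class label (0 or 1)

# Two rows copied from the example table; a raw string keeps the backslash in the SMILES intact.
RAW = r"""NAME SMILES VALUE
CHEMBL12089 [Cl-].[H]c1c2c(c([H])c3c1-c1c([H])c4c([H])c([H])c(OC([H])([H])[H])c(OC([H])([H])[H])c4c([H])[n+]1C([H])([H])C3([H])[H])OC([H])([H])O2 0
CHEMBL67 [H]/C(=C(\[H])c1c([H])c(OC([H])([H])[H])c(OC([H])([H])[H])c(OC([H])([H])[H])c1[H])c1c([H])c([H])c(OC([H])([H])[H])c([O-])c1[H] 1
"""

def parse_records(text):
    """Parse whitespace-separated NAME/SMILES/VALUE rows into Record tuples."""
    lines = [line for line in text.splitlines() if line.strip()]
    if lines[0].split() != ["NAME", "SMILES", "VALUE"]:
        raise ValueError("unexpected header: " + lines[0])
    records = []
    for line in lines[1:]:
        # SMILES strings contain no whitespace, so a plain split yields 3 fields.
        name, smiles, value = line.split()
        records.append(Record(name, smiles, int(value)))
    return records

records = parse_records(RAW)
for rec in records:
    print(rec.name, rec.label, rec.smiles.count("[H]"), "explicit hydrogens")
```

Converting each SMILES string into a molecular graph would normally be done with a cheminformatics toolkit; that step is left out here to keep the sketch dependency-free.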
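| Data type | ADMET property | Description | Samples | Positive samples | Negative samples |
| --- | --- | --- | --- | --- | --- |
| Metabolism | CYP450 1A2 inhibitor | Inhibition of cytochrome P450 isoform 1A2 | 21 566 | 10 376 | 11 190 |
| Metabolism | CYP450 2C9 inhibitor | Inhibition of cytochrome P450 isoform 2C9 | 21 763 | 5 422 | 16 341 |
| Metabolism | CYP450 2C19 inhibitor | Inhibition of cytochrome P450 isoform 2C19 | 22 255 | 7 809 | 14 446 |
| Metabolism | CYP450 2D6 inhibitor | Inhibition of cytochrome P450 isoform 2D6 | 22 470 | 4 542 | 17 928 |
| Metabolism | CYP450 3A4 inhibitor | Inhibition of cytochrome P450 isoform 3A4 | 24 066 | 8 782 | 15 284 |
| Toxicity | hERG | hERG potassium channel inhibition (cardiotoxicity) | 6 596 | 4 570 | 2 026 |
| Toxicity | Ames | Mutagenicity | 12 970 | 7 242 | 5 728 |
| Toxicity | LO2 | LO2 cytotoxicity | 501 | 94 | 407 |
| Toxicity | HEK293 | HEK293 cytotoxicity | 17 270 | 2 445 | 14 825 |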
ADMET Dataset Description
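The class balance differs considerably between datasets, which is worth keeping in mind when reading accuracy next to F1-Score. The sketch below computes the positive fraction per dataset from the counts in the dataset description; the short dataset keys and the `positive_fraction` helper are naming choices made here, not part of the source.

```python
# (total, positive, negative) sample counts transcribed from the dataset description.
COUNTS = {
    "CYP450 1A2": (21566, 10376, 11190),
    "CYP450 2C9": (21763, 5422, 16341),
    "CYP450 2C19": (22255, 7809, 14446),
    "CYP450 2D6": (22470, 4542, 17928),
    "CYP450 3A4": (24066, 8782, 15284),
    "hERG": (6596, 4570, 2026),
    "Ames": (12970, 7242, 5728),
    "LO2": (501, 94, 407),
    "HEK293": (17270, 2445, 14825),
}

def positive_fraction(total, positive, negative):
    """Return the share of positive samples, checking the counts are consistent."""
    if positive + negative != total:
        raise ValueError("positive + negative does not equal total")
    return positive / total

fractions = {name: positive_fraction(*row) for name, row in COUNTS.items()}
total_records = sum(row[0] for row in COUNTS.values())

print("total records:", total_records)
for name, frac in sorted(fractions.items(), key=lambda item: item[1]):
    print(f"{name:<12} positive fraction = {frac:.3f}")
```

On a dataset where only a small share of samples is positive, a classifier that mostly predicts the negative class can still score a high accuracy, so the F1-Score is the more informative of the two metrics there.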
Chemical Spatial Distribution (Based on t-SNE)
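| Model | P450 1A2 F1-Score | P450 1A2 Accuracy | P450 2C9 F1-Score | P450 2C9 Accuracy | P450 2C19 F1-Score | P450 2C19 Accuracy | P450 2D6 F1-Score | P450 2D6 Accuracy | P450 3A4 F1-Score | P450 3A4 Accuracy |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| RF | 0.771 | 0.792 | 0.535 | 0.829 | 0.685 | 0.779 | 0.437 | 0.853 | 0.676 | 0.826 |
| KNN | 0.686 | 0.737 | 0.451 | 0.790 | 0.570 | 0.749 | 0.441 | 0.842 | 0.567 | 0.782 |
| LR | 0.729 | 0.756 | 0.602 | 0.824 | 0.669 | 0.772 | 0.514 | 0.833 | 0.689 | 0.811 |
| GCN | 0.754 | 0.773 | 0.658 | 0.820 | 0.723 | 0.776 | 0.580 | 0.806 | 0.741 | 0.845 |
| MPNN | 0.755 | 0.781 | 0.648 | 0.832 | 0.712 | 0.794 | 0.584 | 0.853 | 0.726 | 0.822 |
| Proposed model (GAT) | 0.778 | 0.799 | 0.670 | 0.840 | 0.725 | 0.787 | 0.585 | 0.855 | 0.748 | 0.844 |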
The Performance of Drug Metabolism Prediction Model
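| Model | hERG F1-Score | hERG Accuracy | Ames F1-Score | Ames Accuracy | LO2 F1-Score | LO2 Accuracy | HEK293 F1-Score | HEK293 Accuracy |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| RF | 0.868 | 0.803 | 0.555 | 0.687 | 0.545 | 0.851 | 0.277 | 0.908 |
| KNN | 0.843 | 0.773 | 0.342 | 0.603 | 0.556 | 0.842 | 0.348 | 0.908 |
| LR | 0.838 | 0.773 | 0.599 | 0.639 | 0.579 | 0.842 | 0.258 | 0.888 |
| GCN | 0.864 | 0.808 | 0.674 | 0.689 | 0.585 | 0.832 | 0.262 | 0.902 |
| MPNN | 0.841 | 0.766 | 0.726 | 0.752 | 0.370 | 0.495 | 0.301 | 0.884 |
| Proposed model (GAT) | 0.872 | 0.829 | 0.676 | 0.709 | 0.588 | 0.861 | 0.409 | 0.901 |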
The Performance of Drug Toxicity Prediction Model
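The averages quoted in the abstract can be checked against the two performance tables. The sketch below transcribes the nine (F1-Score, Accuracy) pairs for each model, averages them, and computes the relative gain of the GAT model over each baseline. Computing the gain as a ratio of unrounded nine-dataset means is an assumption about how the percentages were derived.

```python
from statistics import mean

# Dataset order: 1A2, 2C9, 2C19, 2D6, 3A4, hERG, Ames, LO2, HEK293.
# Each entry is (F1-Score, Accuracy), transcribed from the two performance tables.
SCORES = {
    "RF":   [(0.771, 0.792), (0.535, 0.829), (0.685, 0.779), (0.437, 0.853), (0.676, 0.826),
             (0.868, 0.803), (0.555, 0.687), (0.545, 0.851), (0.277, 0.908)],
    "KNN":  [(0.686, 0.737), (0.451, 0.790), (0.570, 0.749), (0.441, 0.842), (0.567, 0.782),
             (0.843, 0.773), (0.342, 0.603), (0.556, 0.842), (0.348, 0.908)],
    "LR":   [(0.729, 0.756), (0.602, 0.824), (0.669, 0.772), (0.514, 0.833), (0.689, 0.811),
             (0.838, 0.773), (0.599, 0.639), (0.579, 0.842), (0.258, 0.888)],
    "GCN":  [(0.754, 0.773), (0.658, 0.820), (0.723, 0.776), (0.580, 0.806), (0.741, 0.845),
             (0.864, 0.808), (0.674, 0.689), (0.585, 0.832), (0.262, 0.902)],
    "MPNN": [(0.755, 0.781), (0.648, 0.832), (0.712, 0.794), (0.584, 0.853), (0.726, 0.822),
             (0.841, 0.766), (0.726, 0.752), (0.370, 0.495), (0.301, 0.884)],
    "GAT":  [(0.778, 0.799), (0.670, 0.840), (0.725, 0.787), (0.585, 0.855), (0.748, 0.844),
             (0.872, 0.829), (0.676, 0.709), (0.588, 0.861), (0.409, 0.901)],
}

def averages(rows):
    """Return (mean F1-Score, mean Accuracy) over the nine datasets."""
    return mean(f1 for f1, _ in rows), mean(acc for _, acc in rows)

avg = {model: averages(rows) for model, rows in SCORES.items()}
gat_f1, gat_acc = avg["GAT"]

# Relative gain of GAT over each baseline, in percent.
gains = {
    model: (100 * (gat_f1 / f1 - 1), 100 * (gat_acc / acc - 1))
    for model, (f1, acc) in avg.items() if model != "GAT"
}

print(f"GAT mean F1-Score = {gat_f1:.3f}, mean Accuracy = {gat_acc:.3f}")
for model, (f1_gain, acc_gain) in gains.items():
    print(f"vs {model:<4} F1-Score +{f1_gain:.1f}%  Accuracy +{acc_gain:.1f}%")
```

Under this calculation the GAT averages come out at 0.672 (F1-Score) and 0.825 (Accuracy), and the largest gains are about 26.0% in F1-Score (over KNN) and 6.4% in Accuracy (over MPNN); the gains over the other baselines are smaller.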