Data Analysis and Knowledge Discovery, 2021, Vol. 5, Issue 8: 76-85. DOI: 10.11925/infotech.2096-3467.2021.0233
Predicting Drug ADMET Properties Based on Graph Attention Network
Gu Yaowen1, Zhang Bowen2, Zheng Si1, Yang Fengchun1, Li Jiao1
1Institute of Medical Information, Chinese Academy of Medical Sciences, Beijing 100020, China
2XtalPi AI Research Center, Beijing 100089, China
[Objective] This study builds a prediction model for the ADMET properties of drugs (absorption, distribution, metabolism, excretion, and toxicity) to support drug evaluation in virtual screening. [Methods] We constructed a drug ADMET prediction model based on the Graph Attention Network (GAT). We then used drug ADMET data from open-access databases and scientific publications to build molecular graphs from the compounds' structures. Finally, we compared the GAT-based model with three machine learning models and two graph neural network models. [Results] We collected 9 datasets containing 149 457 ADMET records. Across the 9 datasets, the proposed model achieved an average accuracy of 0.825 and an average F1-Score of 0.672, which were 6.4% and 26.0% higher than those of the baseline models. [Limitations] The data cleansing process needs to be refined, and prediction performance could be further improved with a pre-training architecture. [Conclusions] The proposed model can effectively predict the ADMET properties of drugs, which could support virtual drug screening and computer-aided drug development.
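The abstract does not spell out the network architecture, so the following is only a minimal sketch of a standard single-head graph attention layer applied to a toy molecular graph, not the authors' implementation. The helper names (`gat_layer`, `matvec`), the one-hot atom features, the weight matrix `W`, and the attention vector `a` are all illustrative assumptions; the final nonlinearity and multi-head attention are omitted for brevity.

```python
import math

def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x

def matvec(W, h):
    """Multiply matrix W (a list of rows) by vector h."""
    return [sum(w * x for w, x in zip(row, h)) for row in W]

def gat_layer(features, neighbors, W, a):
    """One single-head graph attention layer (illustrative sketch).

    features:  {atom index: feature vector}
    neighbors: {atom index: bonded atom indices, including the atom itself}
    W:         shared linear transform (out_dim x in_dim)
    a:         attention vector of length 2 * out_dim
    """
    z = {i: matvec(W, h) for i, h in features.items()}
    out, attn = {}, {}
    for i, nbrs in neighbors.items():
        # Unnormalized score e_ij = LeakyReLU(a . [z_i || z_j])
        scores = [leaky_relu(sum(p * q for p, q in zip(a, z[i] + z[j])))
                  for j in nbrs]
        # Softmax over the neighborhood (shifted by the max for stability)
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        alphas = [e / total for e in exps]
        attn[i] = dict(zip(nbrs, alphas))
        # Attention-weighted sum of the transformed neighbor features
        out[i] = [sum(al * z[j][k] for al, j in zip(alphas, nbrs))
                  for k in range(len(z[i]))]
    return out, attn

# Toy molecular graph: heavy atoms of ethanol, C(0)-C(1)-O(2).
# Hypothetical one-hot atom features: [is_carbon, is_oxygen].
features = {0: [1.0, 0.0], 1: [1.0, 0.0], 2: [0.0, 1.0]}
neighbors = {0: [0, 1], 1: [0, 1, 2], 2: [1, 2]}  # bonds plus self-loops
W = [[1.0, 0.0], [0.0, 1.0]]   # arbitrary example weights
a = [0.5, -0.5, 1.0, 0.25]     # arbitrary example attention vector

out, attn = gat_layer(features, neighbors, W, a)
for atom, weights in attn.items():
    print(atom, weights)
```

Each atom's attention weights sum to 1 over its neighborhood, so the updated atom representation is a learned weighted average of its bonded neighbors rather than the fixed average used by a plain graph convolution.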

Key words: Graph Neural Network; Graph Attention Network; Multi-source Heterogeneous Data; ADMET; Virtual Screening
Received: 08 March 2021      Published: 15 September 2021
ZTFLH:  R961  
Fund: National Natural Science Foundation of China (81601573); National Key Research and Development Program of China (2016YFC0901901)
Cite this article:

Gu Yaowen, Zhang Bowen, Zheng Si, Yang Fengchun, Li Jiao. Predicting Drug ADMET Properties Based on Graph Attention Network. Data Analysis and Knowledge Discovery, 2021, 5(8): 76-85.

Building Process of the ADMET Prediction Model Based on Graph Attention Network
Diagram of ADMET Dataset (LO2 Toxicity)
ADMET Dataset Description
Chemical Space Distribution (Based on t-SNE)
Prediction model      P450 1A2            P450 2C9            P450 2C19           P450 2D6            P450 3A4
                      F1-Score  Accuracy  F1-Score  Accuracy  F1-Score  Accuracy  F1-Score  Accuracy  F1-Score  Accuracy
RF                    0.771     0.792     0.535     0.829     0.685     0.779     0.437     0.853     0.676     0.826
KNN                   0.686     0.737     0.451     0.790     0.570     0.749     0.441     0.842     0.567     0.782
LR                    0.729     0.756     0.602     0.824     0.669     0.772     0.514     0.833     0.689     0.811
GCN                   0.754     0.773     0.658     0.820     0.723     0.776     0.580     0.806     0.741     0.845
MPNN                  0.755     0.781     0.648     0.832     0.712     0.794     0.584     0.853     0.726     0.822
Proposed model (GAT)  0.778     0.799     0.670     0.840     0.725     0.787     0.585     0.855     0.748     0.844
Performance of the Drug Metabolism Prediction Models
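The table above reports both F1-Score and accuracy, and the two can differ considerably (for example, 0.585 versus 0.855 in the GAT row for P450 2D6). The sketch below shows how both metrics derive from the same confusion matrix. The counts are hypothetical and chosen only for illustration; they are not taken from the paper, and whether class imbalance explains the gap in these datasets is an assumption.

```python
def accuracy(tp, fp, fn, tn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + fp + fn + tn)

def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall for the positive class."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

# Hypothetical imbalanced test set: 55 positives, 245 negatives.
tp, fp, fn, tn = 30, 20, 25, 225

acc = accuracy(tp, fp, fn, tn)
f1 = f1_score(tp, fp, fn)
print(f"accuracy = {acc:.3f}, F1 = {f1:.3f}")
```

Because true negatives count toward accuracy but not toward F1, a model can score well on accuracy while still missing many positive compounds, which is why the two columns are worth reading together.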
Prediction model      hERG                Ames                LO2                 HEK293
                      F1-Score  Accuracy  F1-Score  Accuracy  F1-Score  Accuracy  F1-Score  Accuracy
RF                    0.868     0.803     0.555     0.687     0.545     0.851     0.277     0.908
KNN                   0.843     0.773     0.342     0.603     0.556     0.842     0.348     0.908
LR                    0.838     0.773     0.599     0.639     0.579     0.842     0.258     0.888
GCN                   0.864     0.808     0.674     0.689     0.585     0.832     0.262     0.902
MPNN                  0.841     0.766     0.726     0.752     0.370     0.495     0.301     0.884
Proposed model (GAT)  0.872     0.829     0.676     0.709     0.588     0.861     0.409     0.901
Performance of the Drug Toxicity Prediction Models
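The averages quoted in the abstract can be recovered from the GAT rows of the two tables. The short check below copies those nine (F1-Score, accuracy) pairs and takes an unweighted mean over the datasets; the variable names are illustrative, and unweighted averaging is an assumption about how the reported figures were computed.

```python
# (F1-Score, accuracy) for the GAT model, copied from the two tables.
gat_scores = {
    "P450 1A2":  (0.778, 0.799),
    "P450 2C9":  (0.670, 0.840),
    "P450 2C19": (0.725, 0.787),
    "P450 2D6":  (0.585, 0.855),
    "P450 3A4":  (0.748, 0.844),
    "hERG":      (0.872, 0.829),
    "Ames":      (0.676, 0.709),
    "LO2":       (0.588, 0.861),
    "HEK293":    (0.409, 0.901),
}

n = len(gat_scores)
mean_f1 = sum(f1 for f1, _ in gat_scores.values()) / n
mean_acc = sum(acc for _, acc in gat_scores.values()) / n

print(f"mean F1-Score = {mean_f1:.3f}")   # 0.672
print(f"mean accuracy = {mean_acc:.3f}")  # 0.825
```

Both values match the abstract (average accuracy 0.825, average F1-Score 0.672), which suggests the headline numbers are simple means over the nine datasets.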
