[Objective] This study constructs a virtual screening model for anti-tuberculosis drugs to support the research and development of new drugs. [Methods] We propose GNN-MTB, a graph neural network model optimized with curriculum learning for the virtual screening of anti-tuberculosis inhibitors. We then built a benchmark dataset of 10,789 records for anti-tuberculosis drugs from open-access databases. Finally, we compared the performance of GNN-MTB with four classic machine learning models and two graph neural network models on this benchmark dataset. [Results] The proposed GNN-MTB model reached an AUC of 0.912 and an AUPR of 0.679, both higher than those of the classic models. The maximum improvements of our method in AUC and AUPR were 3.872% and 13.167%, respectively. GNN-MTB is open source and available at https://github.com/gu-yaowen/GNN-MTB. [Limitations] The proposed model does not yet incorporate analysis data on drug sensitivity and bacterial resistance. [Conclusions] The proposed GNN-MTB model benefits the development of anti-tuberculosis drug screening. The method could also be used to build virtual screening models for drugs targeting other diseases.
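The AUC and AUPR scores reported above can be illustrated with a minimal sketch. It is not taken from the GNN-MTB repository: the function names `roc_auc` and `average_precision`, and the toy labels and scores, are hypothetical. AUPR is approximated here by average precision, which is one common estimator and may differ from the one used in the study.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random active compound is scored
    above a random inactive one (ties count as half a win)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one active and one inactive compound")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Average precision: mean of the precision values measured at the
    rank of each active compound. Assumes no tied scores."""
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits = 0
    total = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank
    if hits == 0:
        raise ValueError("need at least one active compound")
    return total / hits


# Hypothetical toy screen: 1 = active against M. tuberculosis, 0 = inactive.
toy_labels = [1, 0, 1, 0, 0, 0, 1, 0]
toy_scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2, 0.6, 0.1]

print(f"AUC  = {roc_auc(toy_labels, toy_scores):.3f}")
print(f"AUPR = {average_precision(toy_labels, toy_scores):.3f}")
```

AUPR is often reported alongside AUC in virtual screening because active compounds are usually a small minority, and AUPR is more sensitive to how well the few actives are ranked near the top.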
Gu Yaowen, Zheng Si, Yang Fengchun, Li Jiao. GNN-MTB: An Anti-Mycobacterium Drug Virtual Screening Model Based on Graph Neural Network[J]. Data Analysis and Knowledge Discovery, 2022, 6(11): 93-102.