Stacked ensemble-based mutagenicity prediction model using multiple modalities with graph attention network.

Authors

Liyaqat Tanya, Ahmad Tanvir, Kashif Mohammad, Saxena Chandni

Affiliations

Department of Computer Engineering, Jamia Millia Islamia, New Delhi, 110025, India.

The Chinese University of Hong Kong, Hong Kong, China.

Publication information

Med Biol Eng Comput. 2025 Jun 5. doi: 10.1007/s11517-025-03392-0.

DOI: 10.1007/s11517-025-03392-0

PMID: 40471492

Abstract

Mutagenicity is concerning due to its link to genetic mutations, which can lead to cancer and other adverse effects. Early identification of mutagenic compounds in drug development is crucial to prevent unsafe candidates and reduce costs. While computational techniques, especially machine learning (ML) models, have become prevalent for mutagenicity prediction, they typically rely on a single modality. Our work introduces a novel stacked ensemble mutagenicity prediction model that integrates multiple modalities, including SMILES and molecular graphs. These modalities capture diverse molecular information such as substructural, physicochemical, geometrical, and topological features. We use SMILES for deriving substructural, geometrical, and physicochemical data, while a graph attention network (GAT) extracts topological information from molecular graphs. Our model employs a stacked ensemble of ML classifiers and SHAP (Shapley Additive Explanations) to identify the significance of classifiers and key features. Our method outperforms state-of-the-art techniques on two standard datasets, achieving an area under the curve of 95.21% on the Hansen benchmark dataset. This research is expected to interest clinicians and computational biologists in translational research.
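
The stacking idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the data are synthetic stand-ins for the two modalities (a descriptor-like view and a graph-embedding-like view), the base learner is a simple nearest-centroid scorer rather than the classifiers used in the paper, and the SMILES featurization, GAT, and SHAP steps are omitted. All function names (`fit_centroid_scorer`, `fit_logistic`, `stacked_fit`, `make_data`, `auc`) are hypothetical. The sketch shows only the general pattern: one base learner per view, a meta-learner trained on out-of-fold base scores, and evaluation by area under the ROC curve.

```python
import math
import random


def sigmoid(t):
    """Numerically safe logistic function."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def fit_centroid_scorer(X, y):
    """Base learner (assumed for illustration): score by closeness to class centroids."""
    dim = len(X[0])
    centroids = {}
    for cls in (0, 1):
        members = [x for x, label in zip(X, y) if label == cls]
        centroids[cls] = [sum(x[j] for x in members) / len(members) for j in range(dim)]

    def score(x):
        d_neg = sum((a - b) ** 2 for a, b in zip(x, centroids[0]))
        d_pos = sum((a - b) ** 2 for a, b in zip(x, centroids[1]))
        return sigmoid(d_neg - d_pos)  # near 1 when closer to the positive centroid

    return score


def fit_logistic(Z, y, lr=0.5, epochs=300):
    """Meta-learner: logistic regression fitted by batch gradient descent."""
    w, b = [0.0] * len(Z[0]), 0.0
    n = len(Z)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for z, label in zip(Z, y):
            err = sigmoid(b + sum(wi * zi for wi, zi in zip(w, z))) - label
            grad_w = [g + err * zi for g, zi in zip(grad_w, z)]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return lambda z: sigmoid(b + sum(wi * zi for wi, zi in zip(w, z)))


def stacked_fit(views, y, k=3):
    """Train one base learner per view and a meta-learner on out-of-fold base scores."""
    n = len(y)
    oof = [[0.0] * len(views) for _ in range(n)]
    for fold in range(k):
        train = [i for i in range(n) if i % k != fold]
        held = [i for i in range(n) if i % k == fold]
        for v, X in enumerate(views):
            scorer = fit_centroid_scorer([X[i] for i in train], [y[i] for i in train])
            for i in held:
                oof[i][v] = scorer(X[i])
    meta = fit_logistic(oof, y)
    bases = [fit_centroid_scorer(X, y) for X in views]  # refit on all training data
    return lambda sample_views: meta([s(x) for s, x in zip(bases, sample_views)])


def auc(scores, labels):
    """Area under the ROC curve as the fraction of correctly ranked (pos, neg) pairs."""
    pos = [s for s, label in zip(scores, labels) if label == 1]
    neg = [s for s, label in zip(scores, labels) if label == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def make_data(n, rng):
    """Synthetic data only: two 2-D feature views per sample plus a binary label."""
    rows = []
    for i in range(n):
        label = i % 2
        descriptors = [rng.gauss(1.5 * label, 1.0) for _ in range(2)]
        embedding = [rng.gauss(1.0 * label, 1.0) for _ in range(2)]
        rows.append((descriptors, embedding, label))
    return rows


rng = random.Random(0)
train, test = make_data(240, rng), make_data(200, rng)
y_train = [label for _, _, label in train]
model = stacked_fit([[d for d, _, _ in train], [e for _, e, _ in train]], y_train)
test_scores = [model([d, e]) for d, e, _ in test]
stacked_auc = auc(test_scores, [label for _, _, label in test])
print(f"stacked AUC on synthetic test data: {stacked_auc:.3f}")
```

Training the meta-learner on out-of-fold scores, rather than on scores from base learners that have already seen the same samples, keeps the meta-learner from rewarding base learners that merely memorize the training set. The AUC printed here describes the synthetic data only and says nothing about the results reported in the abstract.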
