ToxiM: A Toxicity Prediction Tool for Small Molecules Developed Using Machine Learning and Chemoinformatics Approaches.

Author information

Sharma Ashok K, Srivastava Gopal N, Roy Ankita, Sharma Vineet K

Affiliation

Metagenomics and Systems Biology Laboratory, Department of Biological Sciences, Indian Institute of Science Education and Research, Bhopal, India.

Publication information

Front Pharmacol. 2017 Nov 30;8:880. doi: 10.3389/fphar.2017.00880. eCollection 2017.

DOI:10.3389/fphar.2017.00880

PMID:29249969

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5714866/

Abstract

Experimental methods for determining molecular toxicity are tedious and time-consuming, so computational approaches offer an alternative route to toxicity prediction. We have developed a tool that predicts the toxicity of any molecule or metabolite along with its aqueous solubility and permeability. Using a comprehensive, curated set of toxin molecules as the training set, chemical and structural features such as descriptors and fingerprints were exploited for feature selection, optimization, and the development of machine learning based classification and regression models. Compositional differences in the distribution of atoms were apparent between toxins and non-toxins, and hence molecular features were used for classification and regression. On 10-fold cross-validation, the descriptor-based, fingerprint-based, and hybrid-based classification models showed similar accuracy (93%) and Matthews correlation coefficient (0.84). The performance of all three models was comparable on the blind dataset (Matthews correlation coefficient = 0.84-0.87). In addition, regression models using descriptors as input features were compared and evaluated on the blind dataset. The random forest based regression model for solubility performed better (R² = 0.84) than the multiple linear regression (MLR) and partial least squares regression (PLSR) models, whereas the partial least squares based regression model for permeability (Caco-2) performed better (R² = 0.68) than the random forest and MLR based regression models. The final classification and regression models were further evaluated on two validation datasets, comprising known toxins and commonly used constituents of health products, which supports their accuracy. The ToxiM web server would be a highly useful and reliable tool for predicting the toxicity, solubility, and permeability of small molecules.
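
The abstract reports accuracy and the Matthews correlation coefficient for the classifiers and a goodness-of-fit score for the regression models. The following is a minimal Python sketch of how such metrics are commonly computed. It is not the authors' code: the confusion-matrix counts and the solubility-like values are made-up illustrative numbers, the function names are hypothetical, and the regression score is assumed to be the coefficient of determination (R²).

```python
import math


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient from binary confusion-matrix counts.

    Returns 0.0 when any marginal total is zero (the usual convention).
    """
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return numerator / denominator if denominator else 0.0


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Hypothetical counts for a toxin / non-toxin classifier (not from the paper).
tp, tn, fp, fn = 90, 95, 5, 10
print(f"accuracy = {accuracy(tp, tn, fp, fn):.3f}")       # accuracy = 0.925
print(f"MCC      = {matthews_corrcoef(tp, tn, fp, fn):.3f}")  # MCC      = 0.851

# Hypothetical measured vs. predicted values for a regression model.
measured = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
print(f"R^2      = {r_squared(measured, predicted):.3f}")  # R^2      = 0.980
```

Unlike accuracy, the Matthews correlation coefficient uses all four cells of the confusion matrix, so it is the more informative figure when the toxin and non-toxin classes are unevenly sized.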

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ca71/5714866/7a44ce038cb6/fphar-08-00880-g0001.jpg
