Random forest classification for predicting lifespan-extending chemical compounds.

Affiliation

Department of Chemistry, FEPS, University of Surrey, Guildford, Surrey, GU2 7XH, UK.

Publication information

Sci Rep. 2021 Jul 5;11(1):13812. doi: 10.1038/s41598-021-93070-6.

Abstract

Ageing is a major risk factor for many conditions, including cancer, cardiovascular disease and neurodegenerative disease. Pharmaceutical interventions that slow ageing and delay the onset of age-related diseases are a growing research area. The aim of this study was to build a machine learning model, based on data from the DrugAge database, that predicts whether a chemical compound will extend the lifespan of Caenorhabditis elegans. Five predictive models were built using the random forest algorithm with molecular fingerprints and/or molecular descriptors as features. The best-performing classifier, built using molecular descriptors, achieved an area under the curve (AUC) score of 0.815 when classifying the compounds in the test set. The features of the model were ranked using the Gini importance measure of the random forest algorithm. The top 30 features included descriptors related to atom and bond counts, topological properties and partial charge properties. The model was then applied to predict the class of compounds in an external database consisting of 1738 small molecules. The compounds in this screening database with a predicted probability of ≥ 0.80 for increasing the lifespan of Caenorhabditis elegans fell broadly into three groups: (1) flavonoids, (2) fatty acids and conjugates, and (3) organooxygen compounds.
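The two numerical steps mentioned in the abstract, scoring a classifier by AUC and shortlisting screened compounds at a predicted probability of ≥ 0.80, can be illustrated with a minimal Python sketch. It is not the authors' code. The function names `roc_auc` and `screen`, the compound names and all values are hypothetical toy data chosen for illustration. The AUC is computed by pairwise comparison (the probability that a randomly chosen positive is scored above a randomly chosen negative), which may differ from the routine used in the study.

```python
from typing import Dict, List, Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    A tie between a positive and a negative score counts as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def screen(probabilities: Dict[str, float], threshold: float = 0.80) -> List[str]:
    """Return compounds whose predicted probability meets the threshold, highest first."""
    hits = [name for name, p in probabilities.items() if p >= threshold]
    return sorted(hits, key=lambda name: probabilities[name], reverse=True)


# Hypothetical toy data, not taken from the paper.
# Label 1 = extends lifespan, label 0 = does not.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.91, 0.74, 0.40, 0.55, 0.30, 0.12]
toy_auc = roc_auc(toy_labels, toy_scores)

# Hypothetical predicted probabilities for four screened compounds.
toy_screening = {
    "compound_A": 0.86,
    "compound_B": 0.62,
    "compound_C": 0.80,
    "compound_D": 0.93,
}
toy_hits = screen(toy_screening)

print(f"toy AUC = {toy_auc:.3f}")
print("compounds at or above 0.80:", toy_hits)
```

In the toy data, eight of the nine positive/negative pairs are ordered correctly, so the AUC is 8/9. The threshold comparison is inclusive, matching the "≥ 0.80" wording of the abstract.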

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/646b/8257600/777e774ee25e/41598_2021_93070_Fig1_HTML.jpg
