MetaTREE, a Novel Database Focused on Metabolic Trees, Predicts an Important Detoxification Mechanism: The Glutathione Conjugation.

Affiliation

Dipartimento di Scienze Farmaceutiche, Università degli Studi di Milano, Via Mangiagalli 25, I-20133 Milano, Italy.

Publication information

Molecules. 2021 Apr 6;26(7):2098. doi: 10.3390/molecules26072098.

Abstract

(1) Background: Data accuracy plays a key role in determining model performance, and the field of metabolism prediction suffers from a lack of truly reliable data. To enhance the accuracy of metabolic data, we recently proposed a manually curated database (MetaQSAR) collected by a meta-analysis of the specialized literature. Here we aim to further increase data accuracy by focusing on publications that report exhaustive metabolic trees. This selection should reduce the number of false negative data. (2) Methods: A new metabolic database (MetaTREE) was thus collected and used to extract a dataset of metabolic data concerning glutathione conjugation (MT-dataset). After proper pre-processing, this dataset, along with the corresponding dataset extracted from MetaQSAR (MQ-dataset), was used to develop binary classification models with a random forest algorithm. (3) Results: Comparison of the models generated from the two datasets reveals the better performance reached by the MT-dataset (MCC rose from 0.63 to 0.67, sensitivity from 0.56 to 0.58). The analysis of the applicability domain also confirms that the model based on the MT-dataset shows more robust predictive power with a larger applicability domain. (4) Conclusions: These results confirm that focusing on metabolic trees is a convenient approach to increasing data accuracy by reducing false negative cases. The encouraging performance of the models developed from the MT-dataset invites the use of MetaTREE for predictive studies in the field of xenobiotic metabolism.
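The two metrics quoted in the Results, MCC (Matthews correlation coefficient) and sensitivity, can both be computed from the four cells of a binary confusion matrix. The following is a minimal sketch of those standard definitions, not the authors' evaluation code; the function names and the example counts are hypothetical and are not taken from the paper.

```python
import math


def sensitivity(tp: int, fn: int) -> float:
    """Fraction of true substrates that the model recovers: TP / (TP + FN)."""
    positives = tp + fn
    return tp / positives if positives else 0.0


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient for a binary confusion matrix.

    Returns 0.0 when any marginal total is zero, a common convention
    for the otherwise undefined case.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


if __name__ == "__main__":
    # Hypothetical counts, chosen only to illustrate the calculation.
    tp, tn, fp, fn = 58, 930, 10, 42
    print(f"sensitivity = {sensitivity(tp, fn):.2f}")
    print(f"MCC         = {mcc(tp, tn, fp, fn):.2f}")
```

Both metrics depend on the false negative count, and MCC is often preferred over plain accuracy when substrates are much rarer than non-substrates.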

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cc65/8038802/b0b8cc76c500/molecules-26-02098-g001.jpg
