FAME 2: Simple and Effective Machine Learning Model of Cytochrome P450 Regioselectivity.

Authors

Šícho Martin, de Bruyn Kops Christina, Stork Conrad, Svozil Daniel, Kirchmair Johannes

Affiliations

Faculty of Mathematics, Informatics and Natural Sciences, Department of Computer Science, Center for Bioinformatics, Universität Hamburg, Hamburg, 20146, Germany.

CZ-OPENSCREEN: National Infrastructure for Chemical Biology, Laboratory of Informatics and Chemistry, Faculty of Chemical Technology, University of Chemistry and Technology Prague, 166 28 Prague 6, Czech Republic.

Publication information

J Chem Inf Model. 2017 Aug 28;57(8):1832-1846. doi: 10.1021/acs.jcim.7b00250. Epub 2017 Aug 7.

Abstract

We report on the further development of FAst MEtabolizer (FAME; J. Chem. Inf. Model. 2013, 53, 2896-2907), a collection of random forest models for the prediction of sites of metabolism (SoMs) of xenobiotics. A broad set of descriptors was explored, from simple 2D descriptors such as those used in FAME, to quantum chemical descriptors employed in some of the most accurate models for SoM prediction currently available. In line with the original FAME approach, our objective was to keep things simple and to come up with accurate and robust models that are based on a small number of 2D descriptors. We found that circular descriptions of atoms and their environments with such descriptors in combination with an extremely randomized trees algorithm can yield models that perform equally well compared to more complex approaches. Thorough evaluation experiments on an independent test set showed that the best of these models obtained a Matthews correlation coefficient, area under the receiver operating characteristic curve, and Top-2 accuracy of 0.57, 0.91 and 94.1%, respectively. Models for the prediction of isoform-specific regioselectivity of CYP 3A4, 2D6, and 2C9 were also developed and showed competitive performance. The best models have been integrated into a newly developed software package (FAME 2), which is available free of charge from the authors.
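To make two of the reported metrics concrete, the following is a minimal sketch of how a Matthews correlation coefficient and a Top-k accuracy might be computed for per-atom SoM predictions. It is not taken from FAME 2. The function names, the toy scores, and the Top-k definition used here (a molecule counts as a hit if at least one known SoM is among its k highest-scored atoms) are assumptions for illustration only.

```python
from math import sqrt


def matthews_corrcoef(y_true, y_pred):
    """MCC for binary atom labels (1 = site of metabolism, 0 = not)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0.0 when any marginal count is zero.
    return (tp * tn - fp * fn) / denom if denom else 0.0


def top_k_accuracy(molecules, k=2):
    """Fraction of molecules with a known SoM among the k top-scored atoms.

    `molecules` is a list of (scores, labels) pairs, one per molecule,
    where both lists are indexed by atom. This definition is an assumption.
    """
    hits = 0
    for scores, labels in molecules:
        ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
        if any(labels[i] == 1 for i in ranked[:k]):
            hits += 1
    return hits / len(molecules)


# Hypothetical toy data: per-atom model scores and experimental SoM labels.
molecules = [
    ([0.9, 0.2, 0.1, 0.6], [1, 0, 0, 0]),  # known SoM ranked 1st
    ([0.3, 0.8, 0.7, 0.1], [1, 0, 0, 0]),  # known SoM ranked 3rd
    ([0.1, 0.4, 0.5, 0.2], [0, 1, 0, 0]),  # known SoM ranked 2nd
]

# Flatten to atom level and threshold the scores at 0.5 for the MCC.
y_true = [label for _, labels in molecules for label in labels]
y_pred = [1 if s >= 0.5 else 0 for scores, _ in molecules for s in scores]

print("MCC:", round(matthews_corrcoef(y_true, y_pred), 3))
print("Top-2 accuracy:", top_k_accuracy(molecules, k=2))
```

The two metrics answer different questions: MCC is computed over all atoms and penalizes every false positive, whereas Top-k accuracy is computed per molecule and only asks whether a true site appears near the top of the ranking.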
