Building Quantitative Structure-Activity Relationship Models Using Bayesian Additive Regression Trees.

Author information

Biometrics Research, Merck & Co., Inc., Kenilworth, New Jersey 07033, United States.

Department of Statistics, The Ohio State University, Cockins Hall, 1958 Neil Avenue, Columbus, Ohio 43210, United States.

Publication information

J Chem Inf Model. 2019 Jun 24;59(6):2642-2655. doi: 10.1021/acs.jcim.9b00094. Epub 2019 May 6.

DOI: 10.1021/acs.jcim.9b00094
PMID: 30998343

Abstract

Quantitative structure-activity relationship (QSAR) is a very commonly used technique for predicting the biological activity of a molecule using information contained in the molecular descriptors. The large number of compounds and descriptors and the sparseness of descriptors pose important challenges to traditional statistical methods and machine learning (ML) algorithms (such as random forest (RF)) used in this field. Recently, Bayesian Additive Regression Trees (BART), a flexible Bayesian nonparametric regression approach, has been demonstrated to be competitive with widely used ML approaches. Instead of only focusing on accurate point estimation, BART is formulated entirely in a hierarchical Bayesian modeling framework, allowing one to also quantify uncertainties and hence to provide both point and interval estimation for a variety of quantities of interest. We studied BART as a model builder for QSAR and demonstrated that the approach tends to have predictive performance comparable to RF. More importantly, we investigated BART's natural capability to analyze truncated (or qualified) data, generate interval estimates for molecular activities as well as descriptor importance, and conduct model diagnosis, which could not be easily handled through other approaches.
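
The abstract's point that a fully Bayesian model yields both point and interval estimates can be illustrated with a minimal sketch. It assumes that a fitted model has already produced posterior draws of the predicted activity for each compound; the draws below are synthetic stand-ins generated from a normal distribution, not BART output, and the compound names, activity scale, and the helper `summarize_draws` are hypothetical. The sketch only shows how such draws reduce to a posterior mean and an equal-tailed credible interval.

```python
import random
import statistics


def summarize_draws(draws, level=0.90):
    """Return (posterior mean, lower, upper) for an equal-tailed interval."""
    if not 0.0 < level < 1.0:
        raise ValueError("level must be strictly between 0 and 1")
    ordered = sorted(draws)
    n = len(ordered)
    tail = (1.0 - level) / 2.0
    # Simple order-statistic indices; adequate for a sketch, not a
    # carefully interpolated quantile estimator.
    lower_idx = int(tail * (n - 1))
    upper_idx = int(round((1.0 - tail) * (n - 1)))
    return statistics.fmean(ordered), ordered[lower_idx], ordered[upper_idx]


# ASSUMPTION: synthetic draws standing in for MCMC output of a Bayesian
# model. Names and values are illustrative only.
rng = random.Random(0)
posterior_draws = {
    "compound_A": [rng.gauss(6.2, 0.3) for _ in range(2000)],
    "compound_B": [rng.gauss(5.1, 0.8) for _ in range(2000)],
}

summaries = {name: summarize_draws(d) for name, d in posterior_draws.items()}

for name, (mean, lower, upper) in summaries.items():
    print(f"{name}: mean={mean:.2f}, 90% interval=({lower:.2f}, {upper:.2f})")
```

A wider interval, as for the hypothetical `compound_B` here, signals a less certain prediction, which is the kind of information a point estimate alone does not convey.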


Similar articles

1
Building Quantitative Structure-Activity Relationship Models Using Bayesian Additive Regression Trees.
J Chem Inf Model. 2019 Jun 24;59(6):2642-2655. doi: 10.1021/acs.jcim.9b00094. Epub 2019 May 6.
2
General Approach to Estimate Error Bars for Quantitative Structure-Activity Relationship Predictions of Molecular Activity.
J Chem Inf Model. 2018 Aug 27;58(8):1561-1575. doi: 10.1021/acs.jcim.8b00114. Epub 2018 Jul 17.
3
Estimating the domain of applicability for machine learning QSAR models: a study on aqueous solubility of drug discovery molecules.
J Comput Aided Mol Des. 2007 Sep;21(9):485-98. doi: 10.1007/s10822-007-9125-z. Epub 2007 Jul 14.
4
Genome-wide prediction using Bayesian additive regression trees.
Genet Sel Evol. 2016 Jun 10;48(1):42. doi: 10.1186/s12711-016-0219-8.
5
Improving virtual screening predictive accuracy of Human kallikrein 5 inhibitors using machine learning models.
Comput Biol Chem. 2017 Aug;69:110-119. doi: 10.1016/j.compbiolchem.2017.05.007. Epub 2017 May 29.
6
Persistent spectral hypergraph based machine learning (PSH-ML) for protein-ligand binding affinity prediction.
Brief Bioinform. 2021 Sep 2;22(5). doi: 10.1093/bib/bbab127.
7
Profile-QSAR: a novel meta-QSAR method that combines activities across the kinase family to accurately predict affinity, selectivity, and cellular activity.
J Chem Inf Model. 2011 Aug 22;51(8):1942-56. doi: 10.1021/ci1005004. Epub 2011 Jul 19.
8
Bayesian additive regression trees and the General BART model.
Stat Med. 2019 Nov 10;38(25):5048-5069. doi: 10.1002/sim.8347. Epub 2019 Aug 28.
9
Development of Predictive QSAR Models of 4-Thiazolidinones Antitrypanosomal Activity Using Modern Machine Learning Algorithms.
Mol Inform. 2018 May;37(5):e1700078. doi: 10.1002/minf.201700078. Epub 2017 Nov 14.
10
Critically Assessing the Predictive Power of QSAR Models for Human Liver Microsomal Stability.
J Chem Inf Model. 2015 Aug 24;55(8):1566-75. doi: 10.1021/acs.jcim.5b00255. Epub 2015 Jul 29.

Cited by

1
Development and Evaluation of Conformal Prediction Methods for Quantitative Structure-Activity Relationship.
ACS Omega. 2024 Jun 27;9(27):29478-29490. doi: 10.1021/acsomega.4c02017. eCollection 2024 Jul 9.
2
Computational Models Using Multiple Machine Learning Algorithms for Predicting Drug Hepatotoxicity with the DILIrank Dataset.
Int J Mol Sci. 2020 Mar 19;21(6):2114. doi: 10.3390/ijms21062114.
3
Use of QSAR Global Models and Molecular Docking for Developing New Inhibitors of c-src Tyrosine Kinase.
Int J Mol Sci. 2019 Dec 18;21(1):19. doi: 10.3390/ijms21010019.