

A hybrid framework for improving uncertainty quantification in deep learning-based QSAR regression modeling.

Authors

Wang Dingyan, Yu Jie, Chen Lifan, Li Xutong, Jiang Hualiang, Chen Kaixian, Zheng Mingyue, Luo Xiaomin

Affiliations

Shanghai Key Laboratory of Forensic Medicine, Academy of Forensic Science, Shanghai, 200063, China.

University of Chinese Academy of Sciences, No.19A Yuquan Road, Beijing, 100049, China.

Publication information

J Cheminform. 2021 Sep 20;13(1):69. doi: 10.1186/s13321-021-00551-x.

Abstract

Reliable uncertainty quantification for statistical models is crucial in many downstream applications, especially in drug design and discovery, where mistakes can be very costly. The topic has therefore attracted much attention, and many methods have been proposed in recent years. The approaches reported so far fall mainly into two classes: distance-based approaches and Bayesian approaches. Although these methods are widely used and each shows promising performance with its own strengths, overconfidence on out-of-distribution examples still hinders their deployment in real-world applications. In this study we investigated a number of consensus strategies that combine distance-based and Bayesian approaches with post-hoc calibration to improve uncertainty quantification in QSAR (Quantitative Structure-Activity Relationship) regression modeling. We employed a set of criteria to quantitatively assess the ranking and calibration ability of these models. Experiments on 24 bioactivity datasets were designed to critically compare the proposed model with other well-studied baseline models. Our findings indicate that the proposed hybrid framework robustly enhances the model's ability to rank absolute errors. Together with post-hoc calibration on the validation set, we show that well-calibrated uncertainty estimates can be obtained in domain shift settings. The complementarity between the different methods is also analyzed conceptually.
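
The abstract does not say which consensus strategies, ranking criteria, or calibration procedure were used, so the following is only an illustrative sketch of the general idea, not the authors' implementation. It assumes two per-compound uncertainty scores are already available (one from a Bayesian model, one from a distance-to-training-set measure), combines them by averaging after normalizing each by its validation-set mean, scores ranking ability with a Spearman correlation between uncertainty and absolute error, and calibrates with a single scaling factor fitted on the validation set. All function names here (`consensus`, `spearman`, `fit_scale`) are hypothetical.

```python
import numpy as np


def consensus(u_bayes, u_dist, ref_bayes, ref_dist):
    """Average two uncertainty scores after putting them on a common scale.

    Each score is divided by the mean of the same score on a reference
    (validation) set. This is one simple choice, assumed for illustration.
    """
    u_bayes = np.asarray(u_bayes, dtype=float)
    u_dist = np.asarray(u_dist, dtype=float)
    scaled_bayes = u_bayes / np.mean(ref_bayes)
    scaled_dist = u_dist / np.mean(ref_dist)
    return 0.5 * (scaled_bayes + scaled_dist)


def spearman(x, y):
    """Spearman rank correlation (no tie correction, for brevity)."""
    rank_x = np.argsort(np.argsort(x)).astype(float)
    rank_y = np.argsort(np.argsort(y)).astype(float)
    return float(np.corrcoef(rank_x, rank_y)[0, 1])


def fit_scale(sigma_val, abs_err_val):
    """Post-hoc calibration with one scalar s fitted on the validation set.

    s is chosen so that mean((error / (s * sigma))**2) equals 1 on the
    validation data; s * sigma is then used as the calibrated uncertainty.
    """
    sigma_val = np.asarray(sigma_val, dtype=float)
    abs_err_val = np.asarray(abs_err_val, dtype=float)
    return float(np.sqrt(np.mean((abs_err_val / sigma_val) ** 2)))
```

In this sketch, a higher `spearman(sigma, abs_err)` on held-out data means the uncertainty ranks the absolute errors better, and a scale `s` far from 1 means the uncalibrated uncertainties were systematically too small (`s > 1`) or too large (`s < 1`) on the validation set.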


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/defc/8454160/d39e4ec11839/13321_2021_551_Fig1_HTML.jpg
