
Maximizing gain in high-throughput screening using conformal prediction.

Author information

Svensson Fredrik, Afzal Avid M, Norinder Ulf, Bender Andreas

Affiliations

Department of Chemistry, Centre for Molecular Informatics, University of Cambridge, Lensfield Road, Cambridge, CB2 1EW, UK.

IOTA Pharmaceuticals, St Johns Innovation Centre, Cowley Road, Cambridge, CB4 0WS, UK.

Publication information

J Cheminform. 2018 Feb 21;10(1):7. doi: 10.1186/s13321-018-0260-4.

Abstract

Iterative screening has emerged as a promising way to make screening campaigns more efficient than traditional high-throughput approaches. By learning from a subset of the compound library, predictive models can infer which compounds to screen next, resulting in more efficient screening. One way to evaluate screening is to weigh the cost of screening against the gain associated with finding an active compound. In this work, we introduce a conformal predictor coupled with a gain-cost function with the aim of maximizing gain in iterative screening. Using this setup, we show that evaluating the predictions on the training data yields very accurate predictions of which settings will produce the highest gain on the test data. We evaluate the approach on 12 bioactivity datasets from PubChem, training the models on 20% of the data. Depending on the settings of the gain-cost function, the settings generating the maximum gain were accurately identified in 8-10 of the 12 datasets. Broadly, our approach can predict which strategy generates the highest gain based on the results of the gain-cost evaluation: screening the compounds predicted to be active, screening all the remaining data, or not screening any additional compounds. When the algorithm indicates that the predicted active compounds should be screened, our approach also indicates which confidence level to apply in order to maximize gain. Hence, our approach facilitates decision-making and the allocation of resources to where they deliver the most value by indicating in advance the likely outcome of a screening campaign.
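
The decision step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the form of the gain-cost function, so the linear form used here (a fixed gain per active found minus a fixed cost per compound screened) is an assumption, as are all names (`gain`, `predicted_active`, `best_strategy`) and the rule that a compound counts as predicted active when only the active label remains in its conformal prediction set. The conformal p-values are taken as given inputs; how they are computed is outside the sketch.

```python
from typing import List, Optional, Sequence, Tuple


def gain(n_screened: int, n_hits: int,
         gain_per_hit: float, cost_per_compound: float) -> float:
    """Assumed linear gain-cost function: reward for hits minus screening cost."""
    return n_hits * gain_per_hit - n_screened * cost_per_compound


def predicted_active(p_active: float, p_inactive: float,
                     significance: float) -> bool:
    """A label is kept in the prediction set when its p-value exceeds the
    significance level; here 'predicted active' means only 'active' is kept."""
    return p_active > significance and p_inactive <= significance


def best_strategy(
    p_active: Sequence[float],
    p_inactive: Sequence[float],
    is_active: Sequence[bool],
    levels: Sequence[float],
    gain_per_hit: float,
    cost_per_compound: float,
) -> Tuple[str, Optional[float], float]:
    """Return (strategy, significance level or None, gain) with the highest gain.

    Intended to be run on data with known labels (e.g. the training set),
    so the winning setting can then be applied to the unscreened compounds.
    """
    n_total = len(is_active)
    candidates: List[Tuple[str, Optional[float], float]] = [
        ("screen none", None, 0.0),
        ("screen all", None,
         gain(n_total, sum(is_active), gain_per_hit, cost_per_compound)),
    ]
    for eps in levels:
        # Labels of the compounds that would be screened at this level.
        picked = [
            y for pa, pi, y in zip(p_active, p_inactive, is_active)
            if predicted_active(pa, pi, eps)
        ]
        candidates.append((
            "screen predicted actives", eps,
            gain(len(picked), sum(picked), gain_per_hit, cost_per_compound),
        ))
    # max() returns the first candidate with the highest gain on ties.
    return max(candidates, key=lambda c: c[2])
```

Calling `best_strategy` with p-values and known labels from the training data returns one of the three strategies named in the abstract. When the result is "screen predicted actives", the returned significance level corresponds to a confidence level of one minus that value.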


