Machine Learning-Driven Data Valuation for Optimizing High-Throughput Screening Pipelines.

Affiliation

Technical University of Munich, TUM School of Natural Sciences, Department of Bioscience, Center for Functional Protein Assemblies (CPA), 85748 Garching bei München, Germany.

Publication information

J Chem Inf Model. 2024 Nov 11;64(21):8142-8152. doi: 10.1021/acs.jcim.4c01547. Epub 2024 Oct 23.

DOI:10.1021/acs.jcim.4c01547
PMID:39440790
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11558681/
Abstract

In the rapidly evolving field of drug discovery, high-throughput screening (HTS) is essential for identifying bioactive compounds. This study introduces a novel application of data valuation, a concept for evaluating the importance of data points based on their impact, to enhance drug discovery pipelines. Our approach improves active learning for compound library screening, robustly identifies true and false positives in HTS data, and identifies important inactive samples in an imbalanced HTS training, all while accounting for computational efficiency. We demonstrate that importance-based methods enable more effective batch screening, reducing the need for extensive HTS. Machine learning models accurately differentiate true biological activity from assay artifacts, streamlining the drug discovery process. Additionally, importance undersampling aids in HTS data set balancing, improving machine learning performance without omitting crucial inactive samples. These advancements could significantly enhance the efficiency and accuracy of drug development.
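The abstract does not say which data valuation method the authors use, so the following is only a minimal sketch of the general idea under stated assumptions. It scores each training sample by leave-one-out valuation (the change in validation accuracy when that sample is removed) and then performs importance undersampling by keeping all actives plus the highest-valued inactives. The k-nearest-neighbor classifier, the synthetic two-dimensional data, and every function name here are hypothetical choices made for illustration; none of them come from the paper.

```python
import random

def knn_predict(train, x, k=3):
    """Majority-vote label among the k nearest training samples."""
    nearest = sorted(
        train, key=lambda s: sum((a - b) ** 2 for a, b in zip(s[0], x))
    )[:k]
    votes = sum(label for _, label in nearest)
    return 1 if 2 * votes > len(nearest) else 0

def accuracy(train, valid, k=3):
    """Fraction of validation samples classified correctly."""
    hits = sum(knn_predict(train, x, k) == y for x, y in valid)
    return hits / len(valid)

def leave_one_out_values(train, valid, k=3):
    """Value of sample i: accuracy with all samples minus accuracy without i."""
    base = accuracy(train, valid, k)
    return [
        base - accuracy(train[:i] + train[i + 1:], valid, k)
        for i in range(len(train))
    ]

def importance_undersample(train, values, n_inactive):
    """Keep every active (label 1) and the n_inactive highest-valued inactives."""
    inactive_idx = [i for i, (_, y) in enumerate(train) if y == 0]
    inactive_idx.sort(key=lambda i: values[i], reverse=True)
    kept = set(inactive_idx[:n_inactive])
    return [s for i, s in enumerate(train) if s[1] == 1 or i in kept]

def make_samples(rng, n, center, label):
    """Synthetic stand-in for compound descriptors: 2-D Gaussian points."""
    return [
        ((rng.gauss(center, 1.0), rng.gauss(center, 1.0)), label)
        for _ in range(n)
    ]

# Imbalanced toy training set (5 actives, 40 inactives), balanced validation set.
rng = random.Random(0)
train = make_samples(rng, 5, 2.0, 1) + make_samples(rng, 40, 0.0, 0)
valid = make_samples(rng, 20, 2.0, 1) + make_samples(rng, 20, 0.0, 0)

values = leave_one_out_values(train, valid)
balanced = importance_undersample(train, values, n_inactive=5)
print(len(train), "->", len(balanced), "training samples")
```

Leave-one-out retrains the model once per sample, so its cost grows quickly with data set size. That is one reason computational efficiency, which the abstract highlights, matters when choosing a valuation method for HTS-scale data.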

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2ddf/11558681/f9f6f5331cba/ci4c01547_0001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2ddf/11558681/9c29d7900fcd/ci4c01547_0002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2ddf/11558681/bf6123c51bb9/ci4c01547_0003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2ddf/11558681/8f62a198f661/ci4c01547_0004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2ddf/11558681/6f0dce2583b0/ci4c01547_0005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2ddf/11558681/1b60e9a18ad3/ci4c01547_0006.jpg

