Applicability Domain of Active Learning in Chemical Probe Identification: Convergence in Learning from Non-Specific Compounds and Decision Rule Clarification.

Affiliations

Kyoto University Graduate School of Medicine, Department of Molecular Biosciences, Life Science Informatics Research Unit, Kyoto, Sakyo, Yoshida, Konoemachi, Kyoto 606-8501, Japan.

Kyoto University Graduate School of Medicine, Department of Radiation Genetics, Kyoto, Sakyo, Yoshida, Konoemachi, Kyoto 606-8501, Japan.

Publication information

Molecules. 2019 Jul 26;24(15):2716. doi: 10.3390/molecules24152716.

DOI: 10.3390/molecules24152716

PMID: 31357419

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6696588/

Abstract

Efficient identification of chemical probes for the manipulation and understanding of biological systems demands specificity for target proteins. Computational means to optimize candidate compound selection for experimental selectivity evaluation are being sought. The active learning virtual screening method has demonstrated the ability to efficiently converge on predictive models with reduced datasets, though its applicability domain to probe identification has yet to be determined. In this article, we challenge active learning's ability to predict inhibitory bioactivity profiles of selective compounds when learning from chemogenomic features found in non-selective ligand-target pairs. Comparison of controls versus multiple molecule representations de-convolutes factors contributing to predictive capability. Experiments using the matrix metalloproteinase family demonstrate maximum probe bioactivity prediction achieved from only approximately 20% of non-probe bioactivity; this data volume is consistent with prior chemogenomic active learning studies despite the increased difficulty from chemical biology experimental settings used here. Feature weight analyses are combined with a custom visualization to unambiguously detail how active learning arrives at classification decisions, yielding clarified expectations for chemogenomic modeling. The results influence tactical decisions for computational probe design and discovery.
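The iterative selection idea summarized in the abstract can be illustrated with a small pool-based active learning loop. This is a minimal sketch under stated assumptions, not the authors' implementation: the data are synthetic bit-set "fingerprints" standing in for compound-target pair descriptors, the hidden labeling rule is invented, the model is a k-nearest-neighbor vote on Tanimoto similarity, and the picking strategy is plain uncertainty sampling. The names `make_example`, `knn_active_fraction`, and `active_learning` are hypothetical. The 20% labeling budget only mirrors the data fraction mentioned in the abstract; the sketch does not reproduce that result.

```python
import math
import random


def tanimoto(a, b):
    """Tanimoto similarity between two sets of 'on' feature indices."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def knn_active_fraction(query, train, k=5):
    """Fraction of the k most similar labeled examples that are active."""
    ranked = sorted(train, key=lambda ex: tanimoto(query, ex[0]), reverse=True)
    top = ranked[:k]
    return sum(label for _, label in top) / len(top)


def mcc(y_true, y_pred):
    """Matthews correlation coefficient for binary labels (0.0 if undefined)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def make_example(rng, n_features=64, n_on=12):
    """Synthetic (feature set, label) pair; the labeling rule is invented."""
    feats = frozenset(rng.sample(range(n_features), n_on))
    label = 1 if len(feats & set(range(8))) >= 2 else 0
    return feats, label


def active_learning(pool, test, budget_fraction=0.2, n_seed=4, k=5, rng=None):
    """Label pool examples one at a time, always taking the most uncertain."""
    rng = rng or random.Random(0)
    unlabeled = list(range(len(pool)))
    rng.shuffle(unlabeled)
    labeled = [unlabeled.pop() for _ in range(n_seed)]  # random starting set
    budget = int(budget_fraction * len(pool))
    while len(labeled) < budget:
        train = [pool[i] for i in labeled]
        # Uncertainty sampling: predicted active fraction closest to 0.5.
        pick = min(
            unlabeled,
            key=lambda i: abs(knn_active_fraction(pool[i][0], train, k) - 0.5),
        )
        unlabeled.remove(pick)
        labeled.append(pick)  # stands in for measuring the pair experimentally
    train = [pool[i] for i in labeled]
    preds = [1 if knn_active_fraction(x, train, k) >= 0.5 else 0 for x, _ in test]
    return labeled, mcc([y for _, y in test], preds)


rng = random.Random(42)
pool = [make_example(rng) for _ in range(200)]   # "non-probe" training pool
test = [make_example(rng) for _ in range(100)]   # held-out evaluation set
labeled, score = active_learning(pool, test, rng=rng)
print(f"labeled {len(labeled)} of {len(pool)} pool examples, held-out MCC = {score:.2f}")
```

In a study like the one described, the held-out set would consist of selective (probe) compounds and the pool of non-selective ligand-target pairs; here both are drawn from the same synthetic generator purely to keep the example self-contained.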

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2175/6696588/503996af43fc/molecules-24-02716-g001.jpg

Similar articles

Active learning for computational chemogenomics.

Future Med Chem. 2017 Mar;9(4):381-402. doi: 10.4155/fmc-2016-0197. Epub 2017 Mar 6.

Chemogenomic Active Learning's Domain of Applicability on Small, Sparse qHTS Matrices: A Study Using Cytochrome P450 and Nuclear Hormone Receptor Families.

ChemMedChem. 2018 Mar 20;13(6):511-521. doi: 10.1002/cmdc.201700677. Epub 2018 Feb 5.

Automated Inference of Chemical Discriminants of Biological Activity.

Methods Mol Biol. 2018;1762:307-338. doi: 10.1007/978-1-4939-7756-7_16.

Linear and Kernel Model Construction Methods for Predicting Drug-Target Interactions in a Chemogenomic Framework.

Methods Mol Biol. 2018;1825:355-368. doi: 10.1007/978-1-4939-8639-2_12.

TargetHunter: an in silico target identification tool for predicting therapeutic potential of small organic molecules based on chemogenomic database.

AAPS J. 2013 Apr;15(2):395-406. doi: 10.1208/s12248-012-9449-z. Epub 2013 Jan 5.

Prediction of matrix metal proteinases-12 inhibitors by machine learning approaches.

J Biomol Struct Dyn. 2019 Jul;37(10):2627-2640. doi: 10.1080/07391102.2018.1492460. Epub 2018 Dec 24.

Selection of Informative Examples in Chemogenomic Datasets.

Methods Mol Biol. 2018;1825:369-410. doi: 10.1007/978-1-4939-8639-2_13.

Persistent spectral hypergraph based machine learning (PSH-ML) for protein-ligand binding affinity prediction.

Brief Bioinform. 2021 Sep 2;22(5). doi: 10.1093/bib/bbab127.

Targeting HIV/HCV Coinfection Using a Machine Learning-Based Multiple Quantitative Structure-Activity Relationships (Multiple QSAR) Method.

Int J Mol Sci. 2019 Jul 22;20(14):3572. doi: 10.3390/ijms20143572.

Cited by

Navigating the Expansive Landscapes of Soft Materials: A User Guide for High-Throughput Workflows.

ACS Polym Au. 2023 Dec 5;3(6):406-427. doi: 10.1021/acspolymersau.3c00025. eCollection 2023 Dec 13.

Active learning effectively identifies a minimal set of maximally informative and asymptotically performant cytotoxic structure-activity patterns in NCI-60 cell lines.

RSC Med Chem. 2020 Jul 20;11(9):1075-1087. doi: 10.1039/d0md00110d. eCollection 2020 Sep 1.

Exception That Proves the Rule: Investigation of Privileged Stereochemistry in Designing Dopamine D3R Bitopic Agonists.

ACS Med Chem Lett. 2020 Feb 28;11(10):1956-1964. doi: 10.1021/acsmedchemlett.9b00660. eCollection 2020 Oct 8.
