Active learning for computational chemogenomics.

Author information

Computer-Assisted Drug Design, Institute of Pharmaceutical Sciences, Department of Chemistry & Applied Biosciences, Swiss Federal Institute of Technology (ETH Zurich), Vladimir-Prelog-Weg 1-5/10, 8093 Zurich, Switzerland.

Koch Institute for Integrative Cancer Research, Massachusetts Institute of Technology, 500 Main St, Cambridge, MA 02139, USA.

Publication information

Future Med Chem. 2017 Mar;9(4):381-402. doi: 10.4155/fmc-2016-0197. Epub 2017 Mar 6.

DOI:10.4155/fmc-2016-0197

PMID:28263088

Abstract

AIM

Computational chemogenomics models the compound-protein interaction space, typically for drug discovery. Existing methods predominantly either incorporate increasing numbers of bioactivity samples or focus on specific subfamilies of proteins and ligands. As an alternative to modeling entire large datasets at once, active learning adaptively incorporates a minimum of informative examples for modeling, yielding compact but high-quality models.

RESULTS/METHODOLOGY

We assessed active learning for protein/target family-wide chemogenomic modeling in replicate experiments. The results demonstrate that small yet highly predictive models can be extracted from only 10-25% of large bioactivity datasets, irrespective of the molecule descriptors used.

CONCLUSION

Chemogenomic active learning identifies small subsets of ligand-target interactions in a large screening database that lead to knowledge discovery and highly predictive models.
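
The adaptive selection idea described in the abstract can be illustrated with a generic pool-based active learning loop. The following is a minimal sketch under stated assumptions, not the authors' method: the data are synthetic stand-ins for compound-protein pairs (random ligand and protein descriptor vectors concatenated per pair), the model is a plain NumPy logistic regression, and the selection rule is simple uncertainty sampling. All function names and parameters (`make_synthetic_pairs`, `fit_logistic`, `active_learning`, `budget_fraction`) are hypothetical and introduced only for illustration.

```python
import numpy as np

def make_synthetic_pairs(n=500, n_ligand=4, n_protein=4, seed=0):
    """Synthetic stand-in for compound-protein pairs (NOT real bioactivity data).

    Each row concatenates a random 'ligand' vector and a random 'protein' vector;
    the active/inactive label comes from a hidden linear rule.
    """
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n, n_ligand + n_protein))
    w_true = rng.normal(size=n_ligand + n_protein)
    y = (X @ w_true > 0).astype(float)
    return X, y

def predict_proba(X, w, b):
    """Predicted probability of the 'active' class."""
    z = np.clip(X @ w + b, -30.0, 30.0)  # clip to avoid overflow in exp
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(X, y, steps=300, lr=0.5, l2=1e-2):
    """L2-regularized logistic regression fitted by plain gradient descent."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(steps):
        p = predict_proba(X, w, b)
        w -= lr * (X.T @ (p - y) / len(y) + l2 * w)
        b -= lr * np.mean(p - y)
    return w, b

def active_learning(X, y, n_init=10, budget_fraction=0.2, seed=0):
    """Pool-based loop: repeatedly label the example the model is least sure about."""
    rng = np.random.default_rng(seed)
    n = len(y)
    labeled = [int(i) for i in rng.choice(n, size=n_init, replace=False)]
    budget = int(budget_fraction * n)  # stop once this many pairs are labeled
    while len(labeled) < budget:
        w, b = fit_logistic(X[labeled], y[labeled])
        pool = np.setdiff1d(np.arange(n), labeled)
        # Uncertainty sampling: probability closest to 0.5 is most informative.
        distance_from_half = np.abs(predict_proba(X[pool], w, b) - 0.5)
        labeled.append(int(pool[np.argmin(distance_from_half)]))
    w, b = fit_logistic(X[labeled], y[labeled])
    return labeled, (w, b)

if __name__ == "__main__":
    X, y = make_synthetic_pairs()
    labeled, (w, b) = active_learning(X, y, budget_fraction=0.2)
    accuracy = np.mean((predict_proba(X, w, b) > 0.5) == (y == 1.0))
    print(f"labeled {len(labeled)} of {len(y)} pairs, accuracy on all pairs: {accuracy:.2f}")
```

Here `budget_fraction=0.2` was chosen only because it falls inside the 10-25% range quoted in the abstract; results on this toy data say nothing about real bioactivity datasets.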

Similar articles

Applicability Domain of Active Learning in Chemical Probe Identification: Convergence in Learning from Non-Specific Compounds and Decision Rule Clarification.

Molecules. 2019 Jul 26;24(15):2716. doi: 10.3390/molecules24152716.

A chemogenomics view on protein-ligand spaces.

BMC Bioinformatics. 2009 Jun 16;10 Suppl 6(Suppl 6):S13. doi: 10.1186/1471-2105-10-S6-S13.

The Future of Computational Chemogenomics.

Methods Mol Biol. 2018;1825:425-450. doi: 10.1007/978-1-4939-8639-2_15.

Computational chemogenomics: is it more than inductive transfer?

J Comput Aided Mol Des. 2014 Jun;28(6):597-618. doi: 10.1007/s10822-014-9743-1. Epub 2014 Apr 27.

Selection of Informative Examples in Chemogenomic Datasets.

Methods Mol Biol. 2018;1825:369-410. doi: 10.1007/978-1-4939-8639-2_13.

Linear and Kernel Model Construction Methods for Predicting Drug-Target Interactions in a Chemogenomic Framework.

Methods Mol Biol. 2018;1825:355-368. doi: 10.1007/978-1-4939-8639-2_12.

Quantitative chemogenomics: machine-learning models of protein-ligand interaction.

Curr Top Med Chem. 2011;11(15):1978-93. doi: 10.2174/156802611796391249.

Machine learning in computational docking.

Artif Intell Med. 2015 Mar;63(3):135-52. doi: 10.1016/j.artmed.2015.02.002. Epub 2015 Feb 16.

A Structural Framework for GPCR Chemogenomics: What's In a Residue Number?

Methods Mol Biol. 2018;1705:73-113. doi: 10.1007/978-1-4939-7465-8_4.

Cited by

Advancing genetic engineering with active learning: theory, implementations and potential opportunities.

Brief Bioinform. 2025 Jul 2;26(4). doi: 10.1093/bib/bbaf286.

Molecular property prediction using pretrained-BERT and Bayesian active learning: a data-efficient approach to drug design.

J Cheminform. 2025 Apr 23;17(1):58. doi: 10.1186/s13321-025-00986-6.

PairMap: An Intermediate Insertion Approach for Improving the Accuracy of Relative Free Energy Perturbation Calculations for Distant Compound Transformations.

J Chem Inf Model. 2025 Jan 27;65(2):705-721. doi: 10.1021/acs.jcim.4c01634. Epub 2025 Jan 12.

Automation and machine learning augmented by large language models in a catalysis study.

Chem Sci. 2024 Jun 26;15(31):12200-12233. doi: 10.1039/d3sc07012c. eCollection 2024 Aug 7.

Screening oral drugs for their interactions with the intestinal transportome via porcine tissue explants and machine learning.

Nat Biomed Eng. 2024 Mar;8(3):278-290. doi: 10.1038/s41551-023-01128-9. Epub 2024 Feb 20.

Combatting over-specialization bias in growing chemical databases.

J Cheminform. 2023 May 19;15(1):53. doi: 10.1186/s13321-023-00716-w.

Defining Levels of Automated Chemical Design.

J Med Chem. 2022 May 26;65(10):7073-7087. doi: 10.1021/acs.jmedchem.2c00334. Epub 2022 May 5.

Identification of neoantigens for individualized therapeutic cancer vaccines.

Nat Rev Drug Discov. 2022 Apr;21(4):261-282. doi: 10.1038/s41573-021-00387-y. Epub 2022 Feb 1.

Assigning confidence to molecular property prediction.

Expert Opin Drug Discov. 2021 Sep;16(9):1009-1023. doi: 10.1080/17460441.2021.1925247. Epub 2021 Jun 15.

CATMoS: Collaborative Acute Toxicity Modeling Suite.

Environ Health Perspect. 2021 Apr;129(4):47013. doi: 10.1289/EHP8495. Epub 2021 Apr 30.
