Active learning effectively identifies a minimal set of maximally informative and asymptotically performant cytotoxic structure-activity patterns in NCI-60 cell lines.

Author information

Nakano Takumi, Takeda Shunichi, Brown J B

Affiliations

Kyoto University Graduate School of Medicine, Department of Molecular Biosciences, Life Science Informatics Research Unit, Konoemachi Yoshida Sakyo, Kyoto 606-8501, Japan.

Kyoto University Graduate School of Medicine, Department of Radiation Genetics, Konoemachi Yoshida Sakyo, Kyoto 606-8501, Japan.

Publication information

RSC Med Chem. 2020 Jul 20;11(9):1075-1087. doi: 10.1039/d0md00110d. eCollection 2020 Sep 1.

Abstract

The NCI-60 cancer cell line screening panel has provided insights for development of subtype-specific chemical therapies and repurposing. By extracting chemical structure and cytotoxicity patterns, virtual screening potentially complements the availability of high-throughput assay platforms and improves bioactive compound discovery rates by computational prefiltering of candidate compound libraries. Many groups report high prediction performances in computational models of NCI-60 data when using cross-validation or similar techniques, yet prospective therapy development in novel cancers may have little to no such data and further may not have the resources to perform hit identification using large compound libraries. In contrast to bulk screening and analysis, the active learning methodology has demonstrated how to identify compounds for screening in small batches and update computational models iteratively, leading to predictive models with a minimum number of compounds, and importantly clarifying data volumes at which limits in predictive ability are achieved. Here, in replicate per-cell line experiments using 50% of data (∼20 000 compounds) as the external prediction target, predictive limits are reproducibly demonstrated at the stage of systematic selection of 10-30% of the incorporable half. The pattern was consistent across all 60 cell lines. Limits of predictability are found to be correlated to the doubling times of cell lines and the number of cellular response discontinuities (activity cliffs) present per cell line. Organization into chemical scaffolds delineated degrees of predictive challenge. These results provide key insights for strategies in developing new inhibitors in existing cell lines or for future automated therapy selection in personalized oncotherapy.
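
The batch-wise selection and model updating described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. Everything here is an assumption made for illustration: the synthetic fingerprints, the toy activity rule, the similarity-weighted scoring model, the use of uncertainty sampling as the selection rule, and all function names. Only the 50/50 split into an incorporable half and an external prediction target, and the cap of 30% on the incorporable half, mirror figures stated in the abstract.

```python
import random


def tanimoto(a, b):
    """Tanimoto similarity between two fingerprints stored as sets of on-bits."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def predict_score(fp, labeled):
    """Similarity-weighted fraction of active neighbours (toy stand-in model)."""
    num = den = 0.0
    for other_fp, label in labeled:
        weight = tanimoto(fp, other_fp)
        num += weight * label
        den += weight
    return num / den if den else 0.5


def accuracy(labeled, external):
    """Fraction of external compounds whose class is predicted correctly."""
    hits = sum((predict_score(fp, labeled) >= 0.5) == bool(y) for fp, y in external)
    return hits / len(external)


def active_learning(pool, external, batch_size=5, seed_size=2, max_fraction=0.3):
    """Pool-based loop: select a small batch, reveal its labels, re-evaluate."""
    pool = list(pool)
    budget = int(max_fraction * len(pool))  # cap on compounds taken from the pool
    labeled = [pool.pop() for _ in range(seed_size)]
    history = [(len(labeled), accuracy(labeled, external))]
    while pool and len(labeled) < budget:
        # Uncertainty sampling (assumed rule): scores nearest 0.5 are least certain.
        pool.sort(key=lambda item: abs(predict_score(item[0], labeled) - 0.5))
        take = min(batch_size, budget - len(labeled))
        batch, pool = pool[:take], pool[take:]
        labeled.extend(batch)  # revealing the label stands in for running the assay
        history.append((len(labeled), accuracy(labeled, external)))
    return history


def make_compounds(n, rng, n_bits=32):
    """Synthetic data: random fingerprints, 'active' if bit 0 or bit 1 is set."""
    compounds = []
    for _ in range(n):
        fp = frozenset(bit for bit in range(n_bits) if rng.random() < 0.3)
        compounds.append((fp, int(0 in fp or 1 in fp)))
    return compounds


rng = random.Random(0)
compounds = make_compounds(200, rng)
incorporable, external = compounds[:100], compounds[100:]  # 50% held out
for n_labeled, acc in active_learning(incorporable, external):
    print(f"{n_labeled:3d} compounds labeled, external accuracy {acc:.2f}")
```

Plotting the returned history (labeled-set size against external accuracy) is one way to look for the plateau that the abstract calls a limit in predictive ability. With real screening data, the fingerprints, model, and selection rule would all need to be replaced by the ones actually under study.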
