Protein-ligand data at scale to support machine learning.

Authors

Edwards Aled M, Owen Dafydd R

Affiliations

Structural Genomics Consortium, University of Toronto and University Health Network, Toronto, Ontario, Canada.

Pfizer Research and Development, Cambridge, MA, USA.

Publication information

Nat Rev Chem. 2025 Jul 23. doi: 10.1038/s41570-025-00737-z.

Abstract

Target 2035 is a global initiative that aims to develop a potent and selective pharmacological modulator, such as a chemical probe, for every human protein by 2035. Here, we describe the Target 2035 roadmap to develop computational methods to improve small-molecule hit discovery, which is a key bottleneck in the discovery of chemical probes. Large, publicly available datasets of high-quality protein-small-molecule binding data will be created using affinity-selection mass spectrometry and DNA-encoded chemical library screening. Positive and negative data will be made openly available, and the machine learning community will be challenged to use these data to build models and predict new, diverse small-molecule binders. Iterative cycles of prediction and testing will lead to improved models and more successful predictions. By 2030, Target 2035 will have identified experimentally verified hits for thousands of human proteins and advanced the development of open-access algorithms capable of predicting hits for proteins for which there are not yet any experimental data.
