Suppr 超能文献



Can We Quickly Learn to "Translate" Bioactive Molecules with Transformer Models?

Authors

Tysinger Emma P, Rai Brajesh K, Sinitskiy Anton V

Affiliation

Machine Learning and Computational Sciences, Pfizer Worldwide Research, Development, and Medical, 610 Main Street, Cambridge, Massachusetts 02139, United States.

Publication

J Chem Inf Model. 2023 Mar 27;63(6):1734-1744. doi: 10.1021/acs.jcim.2c01618. Epub 2023 Mar 13.

DOI: 10.1021/acs.jcim.2c01618
PMID: 36914216
Abstract

Meaningful exploration of the chemical space of druglike molecules in drug design is a highly challenging task due to a combinatorial explosion of possible modifications of molecules. In this work, we address this problem with transformer models, a type of machine learning (ML) model originally developed for machine translation. By training transformer models on pairs of similar bioactive molecules from the public ChEMBL data set, we enable them to learn medicinal-chemistry-meaningful, context-dependent transformations of molecules, including those absent from the training set. By retrospective analysis on the performance of transformer models on ChEMBL subsets of ligands binding to COX2, DRD2, or HERG protein targets, we demonstrate that the models can generate structures identical or highly similar to most active ligands, despite the models having not seen any ligands active against the corresponding protein target during training. Our work demonstrates that human experts working on hit expansion in drug design can easily and quickly employ transformer models, originally developed to translate texts from one natural language to another, to "translate" from known molecules active against a given protein target to novel molecules active against the same target.
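The approach described in the abstract treats molecules as text: ligands are written as SMILES strings, tokenized, and fed to a standard sequence-to-sequence transformer trained on pairs of similar bioactive molecules. A minimal sketch of the tokenization step is shown below (character-level, with multi-character tokens such as `Cl`, `Br`, and bracketed atoms kept whole); the regex and function names are illustrative assumptions, not taken from the paper.

```python
import re

# Regex covering common SMILES tokens: bracketed atoms, two-letter
# elements, aromatic/aliphatic atoms, bonds, branches, and ring digits.
# Illustrative only -- not the paper's actual tokenizer.
SMILES_TOKEN = re.compile(
    r"(\[[^\]]+\]|Br|Cl|Si|Se|@@|[BCNOPSFIbcnops]|[=#\-\+\\/%()\.]|\d)"
)

def tokenize_smiles(smiles: str) -> list[str]:
    """Split a SMILES string into model-ready tokens."""
    tokens = SMILES_TOKEN.findall(smiles)
    # Sanity check: the tokens must reassemble into the original string.
    assert "".join(tokens) == smiles, f"untokenizable SMILES: {smiles}"
    return tokens

# A training example would be a (source, target) pair of similar
# molecules, e.g. aspirin -> salicylic acid.
src = tokenize_smiles("CC(=O)Oc1ccccc1C(=O)O")  # aspirin
tgt = tokenize_smiles("Oc1ccccc1C(=O)O")        # salicylic acid
```

With pairs prepared this way, any off-the-shelf machine-translation transformer can be trained unchanged, which is the point the abstract makes about reusing natural-language tooling.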


Similar Articles

1. MolGPT: Molecular Generation Using a Transformer-Decoder Model.
   J Chem Inf Model. 2022 May 9;62(9):2064-2076. doi: 10.1021/acs.jcim.1c00600. Epub 2021 Oct 25.
2. Generative Pre-trained Transformer (GPT) based model with relative attention for de novo drug design.
   Comput Biol Chem. 2023 Oct;106:107911. doi: 10.1016/j.compbiolchem.2023.107911. Epub 2023 Jun 29.
3. Deep scaffold hopping with multimodal transformer neural networks.
   J Cheminform. 2021 Nov 13;13(1):87. doi: 10.1186/s13321-021-00565-5.
4. Difficulty in chirality recognition for Transformer architectures learning chemical structures from string representations.
   Nat Commun. 2024 Feb 16;15(1):1197. doi: 10.1038/s41467-024-45102-8.
5. Comprehensive Prediction of Molecular Recognition in a Combinatorial Chemical Space Using Machine Learning.
   ACS Comb Sci. 2020 Oct 12;22(10):500-508. doi: 10.1021/acscombsci.0c00003. Epub 2020 Aug 17.
6. The Development of Target-Specific Machine Learning Models as Scoring Functions for Docking-Based Target Prediction.
   J Chem Inf Model. 2019 Mar 25;59(3):1238-1252. doi: 10.1021/acs.jcim.8b00773. Epub 2019 Mar 18.
7. Transformer-based molecular optimization beyond matched molecular pairs.
   J Cheminform. 2022 Mar 28;14(1):18. doi: 10.1186/s13321-022-00599-3.
8. Heavyweight Statistical Alignment to Guide Neural Translation.
   Comput Intell Neurosci. 2022 Jun 3;2022:6856567. doi: 10.1155/2022/6856567. eCollection 2022.
9. PTML Combinatorial Model of ChEMBL Compounds Assays for Multiple Types of Cancer.
   ACS Comb Sci. 2018 Nov 12;20(11):621-632. doi: 10.1021/acscombsci.8b00090. Epub 2018 Oct 3.

Cited By

1. SELFprot: Effective and Efficient Multitask Finetuning Methods for Protein Parameter Prediction.
   J Chem Inf Model. 2025 Apr 14;65(7):3226-3238. doi: 10.1021/acs.jcim.4c02230. Epub 2025 Mar 17.
2. The physics-AI dialogue in drug design.
   RSC Med Chem. 2025 Jan 23;16(4):1499-1515. doi: 10.1039/d4md00869c. eCollection 2025 Apr 16.
3. GraphGPT: A Graph Enhanced Generative Pretrained Transformer for Conditioned Molecular Generation.
   Int J Mol Sci. 2023 Nov 25;24(23):16761. doi: 10.3390/ijms242316761.
4. ADME/tox comes of age: twenty years later.
   Xenobiotica. 2024 Jul;54(7):352-358. doi: 10.1080/00498254.2023.2245049. Epub 2023 Aug 8.
5. Open-Source Machine Learning in Computational Chemistry.
   J Chem Inf Model. 2023 Aug 14;63(15):4505-4532. doi: 10.1021/acs.jcim.3c00643. Epub 2023 Jul 19.