
Multiscale topology-enabled structure-to-sequence transformer for protein-ligand interaction predictions.

Authors

Chen Dong, Liu Jian, Wei Guo-Wei

Affiliations

Department of Mathematics, Michigan State University, East Lansing, MI, USA.

Mathematical Science Research Center, Chongqing University of Technology, Chongqing, China.

Publication information

Nat Mach Intell. 2024 Jul;6(7):799-810. doi: 10.1038/s42256-024-00855-1. Epub 2024 Jun 21.

DOI: 10.1038/s42256-024-00855-1
PMID: 40718138
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12290916/
Abstract

Despite the success of pretrained natural language processing (NLP) models in various fields, their application in computational biology has been hindered by their reliance on biological sequences, which ignores vital three-dimensional (3D) structural information incompatible with the sequential architecture of NLP models. Here we present a topological transformer (TopoFormer), which is built by integrating NLP models and a multiscale topology technique, the persistent topological hyperdigraph Laplacian (PTHL). PTHL systematically converts intricate 3D protein-ligand complexes at various spatial scales into an NLP-admissible sequence of topological invariants and homotopic shapes, capturing essential interactions across those scales. TopoFormer delivers exemplary scoring accuracy and excellent performance in ranking, docking and screening tasks on several benchmark datasets. This approach can be used to convert general high-dimensional structured data into NLP-compatible sequences, paving the way for broader NLP-based research.
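The structure-to-sequence idea in the abstract can be illustrated with a deliberately simplified sketch. It is not PTHL and not the authors' code: it replaces the hyperdigraph Laplacian with an ordinary protein-ligand contact graph and records only two quantities per distance cutoff (the number of contacts and the number of connected components, i.e. the zeroth Betti number). The function names, the cutoff values and the toy coordinates are all assumptions made for illustration.

```python
import math


def count_components(n_nodes, edges):
    """Count connected components (Betti-0) with a union-find structure."""
    parent = list(range(n_nodes))

    def find(i):
        # Walk to the root, halving the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    components = n_nodes
    for a, b in edges:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b
            components -= 1
    return components


def structure_to_sequence(protein_xyz, ligand_xyz, cutoffs):
    """Map a 3D complex to one (n_contacts, n_components) token per cutoff.

    Hypothetical stand-in for a multiscale topological embedding: only
    protein-ligand pairs are connected, and each cutoff is one "scale".
    """
    n_protein = len(protein_xyz)
    n_total = n_protein + len(ligand_xyz)
    # Precompute every protein-ligand distance once.
    pairs = [
        (i, n_protein + j, math.dist(p, q))
        for i, p in enumerate(protein_xyz)
        for j, q in enumerate(ligand_xyz)
    ]
    sequence = []
    for cutoff in cutoffs:
        edges = [(a, b) for a, b, d in pairs if d <= cutoff]
        sequence.append((len(edges), count_components(n_total, edges)))
    return sequence


# Toy coordinates (assumed, not from the paper).
protein = [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0)]
ligand = [(1.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
tokens = structure_to_sequence(protein, ligand, cutoffs=[0.5, 2.0, 5.0, 12.0])
print(tokens)  # [(0, 4), (1, 3), (2, 2), (4, 1)]
```

Each tuple plays the role of one token in an ordered sequence, so a sequence model can consume the complex scale by scale. The method described in the abstract uses far richer per-scale descriptors (topological invariants and homotopic shapes) than these two counts.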

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1229/12290916/8a4382eae0a6/nihms-2098110-f0005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1229/12290916/4d9f32a348fc/nihms-2098110-f0002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1229/12290916/460c473dee29/nihms-2098110-f0003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1229/12290916/fc0e3eeab3b2/nihms-2098110-f0004.jpg

Similar articles

1
Multiscale topology-enabled structure-to-sequence transformer for protein-ligand interaction predictions.
Nat Mach Intell. 2024 Jul;6(7):799-810. doi: 10.1038/s42256-024-00855-1. Epub 2024 Jun 21.
2
TopoFormer: Multiscale Topology-enabled Structure-to-Sequence Transformer for Protein-Ligand Interaction Predictions.
Res Sq. 2024 Feb 9:rs.3.rs-3640878. doi: 10.21203/rs.3.rs-3640878/v1.
3
Short-Term Memory Impairment
4
Multiscale Probabilistic Modeling: A Bayesian Approach to Augment Mechanistic Models of Cell Signaling with Machine-Learning Predictions of Binding Affinity.
bioRxiv. 2025 Jul 9:2025.05.23.655795. doi: 10.1101/2025.05.23.655795.
5
High-throughput library transgenesis in Caenorhabditis elegans via Transgenic Arrays Resulting in Diversity of Integrated Sequences (TARDIS).
Elife. 2023 Jul 4;12:RP84831. doi: 10.7554/eLife.84831.
6
An overview and evaluation of first-trimester physiological fetal human anatomy using 3-dimensional ultrasound combined with virtual reality techniques.
Hum Reprod. 2025 Jun 27. doi: 10.1093/humrep/deaf112.
7
Systemic Inflammatory Response Syndrome
8
Enhancing Clinical Relevance of Pretrained Language Models Through Integration of External Knowledge: Case Study on Cardiovascular Diagnosis From Electronic Health Records.
JMIR AI. 2024 Aug 6;3:e56932. doi: 10.2196/56932.
9
Predicting Affinity Through Homology (PATH): Interpretable Binding Affinity Prediction with Persistent Homology.
bioRxiv. 2024 Oct 21:2023.11.16.567384. doi: 10.1101/2023.11.16.567384.
10
Assessing the comparative effects of interventions in COPD: a tutorial on network meta-analysis for clinicians.
Respir Res. 2024 Dec 21;25(1):438. doi: 10.1186/s12931-024-03056-x.
