
Sliding Window INteraction Grammar (SWING): a generalized interaction language model for peptide and protein interactions.

Authors

Omelchenko Alisa A, Siwek Jane C, Chhibbar Prabal, Arshad Sanya, Nazarali Iliyan, Nazarali Kiran, Rosengart AnnaElaine, Rahimikollu Javad, Tilstra Jeremy, Shlomchik Mark J, Koes David R, Joglekar Alok V, Das Jishnu

Affiliations

Center for Systems Immunology, School of Medicine, University of Pittsburgh, Pittsburgh, PA, USA.

Department of Immunology, School of Medicine, University of Pittsburgh, Pittsburgh, PA, USA.

Publication information

bioRxiv. 2024 May 4:2024.05.01.592062. doi: 10.1101/2024.05.01.592062.

DOI:10.1101/2024.05.01.592062
PMID:38746274
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11092674/
Abstract

The explosion of sequence data has allowed the rapid growth of protein language models (pLMs). pLMs have now been employed in many frameworks including variant-effect and peptide-specificity prediction. Traditionally, for protein-protein or peptide-protein interactions (PPIs), corresponding sequences are either co-embedded followed by post-hoc integration or the sequences are concatenated prior to embedding. Interestingly, no method utilizes a language representation of the interaction itself. We developed an interaction LM (iLM), which uses a novel language to represent interactions between protein/peptide sequences. Sliding Window Interaction Grammar (SWING) leverages differences in amino acid properties to generate an interaction vocabulary. This vocabulary is the input into an LM, followed by a supervised prediction step where the LM's representations are used as features. SWING was first applied to predicting peptide:MHC (pMHC) interactions. SWING was not only successful at generating Class I and Class II models that have comparable prediction to state-of-the-art approaches, but the unique Mixed Class model was also successful at jointly predicting both classes. Further, the SWING model trained only on Class I alleles was predictive for Class II, a complex prediction task not attempted by any existing approach. For de novo data, using only Class I or Class II data, SWING also accurately predicted Class II pMHC interactions in murine models of SLE (MRL/lpr model) and T1D (NOD model), which were validated experimentally. To further evaluate SWING's generalizability, we tested its ability to predict the disruption of specific protein-protein interactions by missense mutations. Although modern methods like AlphaMissense and ESM1b can predict interfaces and variant effects/pathogenicity per mutation, they are unable to predict interaction-specific disruptions. SWING was successful at accurately predicting the impact of both Mendelian mutations and population variants on PPIs. This is the first generalizable approach that can accurately predict interaction-specific disruptions by missense mutations with only sequence information. Overall, SWING is a first-in-class generalizable zero-shot iLM that learns the language of PPIs.
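The abstract describes the encoding only at a high level: a window is slid across the sequences, differences in amino acid properties become symbols, and those symbols form an interaction vocabulary. The following is a minimal sketch of that idea, not the authors' implementation. The choice of the Kyte-Doolittle hydropathy scale, the rounding of differences to single digits, the k-mer length, and the function names `interaction_string` and `kmer_words` are all assumptions made for illustration.

```python
# Kyte-Doolittle hydropathy index per residue.
# ASSUMPTION: the abstract does not say which amino acid property is used;
# this scale is picked only to make the sketch concrete.
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def interaction_string(peptide: str, protein: str) -> str:
    """Slide the peptide along the protein and encode each aligned pair.

    For every offset where the peptide fits inside the protein, each
    peptide residue is paired with the protein residue beneath it, and the
    absolute hydropathy difference is rounded to one digit (0-9).
    Hypothetical encoding, assumed for illustration.
    """
    symbols = []
    for start in range(len(protein) - len(peptide) + 1):
        window = protein[start:start + len(peptide)]
        for pep_res, prot_res in zip(peptide, window):
            diff = abs(HYDROPATHY[pep_res] - HYDROPATHY[prot_res])
            # The largest possible difference on this scale is 9.0,
            # so every symbol is a single digit.
            symbols.append(str(min(9, round(diff))))
    return "".join(symbols)


def kmer_words(encoded: str, k: int = 3) -> list[str]:
    """Split the encoded string into overlapping k-mers ("words")."""
    return [encoded[i:i + k] for i in range(len(encoded) - k + 1)]


if __name__ == "__main__":
    encoded = interaction_string("AI", "AIR")
    print(encoded)              # 0039
    print(kmer_words(encoded))  # ['003', '039']
```

In a pipeline of the kind the abstract outlines, the resulting words would be passed to a language model to obtain an embedding per interaction, and those embeddings would serve as features for a supervised classifier. Both downstream steps are omitted here because the abstract does not specify them.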


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/217d/11092674/e680de5e5566/nihpp-2024.05.01.592062v1-f0001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/217d/11092674/44aeb5fa671c/nihpp-2024.05.01.592062v1-f0002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/217d/11092674/cf4d326f7f88/nihpp-2024.05.01.592062v1-f0003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/217d/11092674/f514e6a51062/nihpp-2024.05.01.592062v1-f0004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/217d/11092674/7dde1317948a/nihpp-2024.05.01.592062v1-f0005.jpg

Similar articles

1
Sliding Window Interaction Grammar (SWING): a generalized interaction language model for peptide and protein interactions.
Nat Methods. 2025 Jul 28. doi: 10.1038/s41592-025-02723-1.

References

1
De novo identification of CD4 T cell epitopes.
Nat Methods. 2024 May;21(5):846-856. doi: 10.1038/s41592-024-02255-0. Epub 2024 Apr 24.
2
xCAPT5: protein-protein interaction prediction using deep and wide multi-kernel pooling convolutional neural networks with protein language model.
BMC Bioinformatics. 2024 Mar 10;25(1):106. doi: 10.1186/s12859-024-05725-6.
3
Recent advances in generative biology for biotherapeutic discovery.
Trends Pharmacol Sci. 2024 Mar;45(3):255-267. doi: 10.1016/j.tips.2024.01.003. Epub 2024 Feb 19.
4
Designing proteins with language models.
Nat Biotechnol. 2024 Feb;42(2):200-202. doi: 10.1038/s41587-024-02123-4.
5
Accurate proteome-wide missense variant effect prediction with AlphaMissense.
Science. 2023 Sep 22;381(6664):eadg7492. doi: 10.1126/science.adg7492.
6
Genome-wide prediction of disease variant effects with a deep protein language model.
Nat Genet. 2023 Sep;55(9):1512-1522. doi: 10.1038/s41588-023-01465-0. Epub 2023 Aug 10.
7
Transfer learning enables predictions in network biology.
Nature. 2023 Jun;618(7965):616-624. doi: 10.1038/s41586-023-06139-9. Epub 2023 May 31.
8
Efficient evolution of human antibodies from general protein language models.
Nat Biotechnol. 2024 Feb;42(2):275-283. doi: 10.1038/s41587-023-01763-2. Epub 2023 Apr 24.
9
Graph-BERT and language model-based framework for protein-protein interaction identification.
Sci Rep. 2023 Apr 6;13(1):5663. doi: 10.1038/s41598-023-31612-w.
10
Machine learning predictions of MHC-II specificities reveal alternative binding mode of class II epitopes.
Immunity. 2023 Jun 13;56(6):1359-1375.e13. doi: 10.1016/j.immuni.2023.03.009. Epub 2023 Apr 5.