Single-sequence protein structure prediction by integrating protein language models.

Affiliations

MoleculeMind Ltd., Beijing 100084, China.

Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China.

Publication information

Proc Natl Acad Sci U S A. 2024 Mar 26;121(13):e2308788121. doi: 10.1073/pnas.2308788121. Epub 2024 Mar 20.

DOI: 10.1073/pnas.2308788121
PMID: 38507445
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10990103/
Abstract

Protein structure prediction has been greatly improved by deep learning in the past few years. However, the most successful methods rely on multiple sequence alignment (MSA) of the sequence homologs of the protein under prediction. In nature, a protein folds in the absence of its sequence homologs and thus, a MSA-free structure prediction method is desired. Here, we develop a single-sequence-based protein structure prediction method RaptorX-Single by integrating several protein language models and a structure generation module and then study its advantage over MSA-based methods. Our experimental results indicate that in addition to running much faster than MSA-based methods such as AlphaFold2, RaptorX-Single outperforms AlphaFold2 and other MSA-free methods in predicting the structure of antibodies (after fine-tuning on antibody data), proteins of very few sequence homologs, and single mutation effects. By comparing different protein language models, our results show that not only the scale but also the training data of protein language models will impact the performance. RaptorX-Single also compares favorably to MSA-based AlphaFold2 when the protein under prediction has a large number of sequence homologs.
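
The abstract does not say how the language models are integrated, so the following is only a minimal sketch of one plausible reading: per-residue embeddings from several models are concatenated with a one-hot encoding of the sequence to form the input to a downstream structure module. The embedders here are hypothetical stand-ins (fixed random projections), not real pretrained models, and the concatenation scheme, function names, and dimensions are all assumptions rather than the authors' implementation.

```python
import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def one_hot(sequence):
    """Encode a protein sequence as an (L, 20) one-hot matrix."""
    index = {aa: i for i, aa in enumerate(AMINO_ACIDS)}
    encoded = np.zeros((len(sequence), len(AMINO_ACIDS)))
    for pos, aa in enumerate(sequence):
        encoded[pos, index[aa]] = 1.0
    return encoded


def make_toy_embedder(dim, seed):
    """Hypothetical stand-in for a pretrained protein language model.

    Returns a function mapping a sequence to an (L, dim) matrix via a
    fixed random projection of the one-hot encoding.
    """
    rng = np.random.default_rng(seed)
    weights = rng.normal(size=(len(AMINO_ACIDS), dim))
    return lambda sequence: one_hot(sequence) @ weights


def integrate_embeddings(sequence, embedders):
    """Concatenate per-residue features from every embedder.

    Concatenation is an assumed integration scheme, chosen for simplicity.
    """
    features = [one_hot(sequence)] + [embed(sequence) for embed in embedders]
    return np.concatenate(features, axis=-1)


# Two toy "language models" with different embedding widths.
embedders = [make_toy_embedder(dim=32, seed=0), make_toy_embedder(dim=64, seed=1)]
single_repr = integrate_embeddings("MKTAYIAKQR", embedders)
print(single_repr.shape)  # (10, 116): 20 one-hot + 32 + 64 features per residue
```

In this sketch no MSA is built at any point; the only input is the single sequence, which is the property the abstract emphasizes.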
