The Historical Evolution and Significance of Multiple Sequence Alignment in Molecular Structure and Function Prediction.

Authors

Zhang Chenyue, Wang Qinxin, Li Yiyang, Teng Anqi, Hu Gang, Wuyun Qiqige, Zheng Wei

Affiliations

NITFID, School of Statistics and Data Science, LPMC and KLMDASR, Nankai University, Tianjin 300071, China.

Suzhou New & High-Tech Innovation Service Center, Suzhou 215011, China.

Publication

Biomolecules. 2024 Nov 29;14(12):1531. doi: 10.3390/biom14121531.

DOI:10.3390/biom14121531
PMID:39766238
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11673352/
Abstract

Multiple sequence alignment (MSA) has evolved into a fundamental tool in the biological sciences, playing a pivotal role in predicting molecular structures and functions. With broad applications in protein and nucleic acid modeling, MSAs continue to underpin advancements across a range of disciplines. MSAs are not only foundational for traditional sequence comparison techniques but also increasingly important in the context of artificial intelligence (AI)-driven advancements. Recent breakthroughs in AI, particularly in protein and nucleic acid structure prediction, rely heavily on the accuracy and efficiency of MSAs to enhance remote homology detection and guide spatial restraints. This review traces the historical evolution of MSA, highlighting its significance in molecular structure and function prediction. We cover the methodologies used for protein monomers, protein complexes, and RNA, while also exploring emerging AI-based alternatives, such as protein language models, as complementary or replacement approaches to traditional MSAs in application tasks. By discussing the strengths, limitations, and applications of these methods, this review aims to provide researchers with valuable insights into MSA's evolving role, equipping them to make informed decisions in structural prediction research.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cd97/11673352/5fd6fe28335d/biomolecules-14-01531-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cd97/11673352/3c9a5ad1d2b3/biomolecules-14-01531-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cd97/11673352/92d97a99909b/biomolecules-14-01531-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cd97/11673352/f50d5fdccf90/biomolecules-14-01531-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cd97/11673352/266fc721bd67/biomolecules-14-01531-g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cd97/11673352/72ca4e175e24/biomolecules-14-01531-g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cd97/11673352/ad9af0eff656/biomolecules-14-01531-g007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cd97/11673352/31cc66a78104/biomolecules-14-01531-g008.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cd97/11673352/a818349ddfe5/biomolecules-14-01531-g009.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cd97/11673352/ae7917367639/biomolecules-14-01531-g010.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cd97/11673352/a4f6f20b07e3/biomolecules-14-01531-g011.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cd97/11673352/69544de9eab0/biomolecules-14-01531-g012.jpg
