
MergeAlign: improving multiple sequence alignment performance by dynamic reconstruction of consensus multiple sequence alignments.

Affiliation

Marine Biological Association of the United Kingdom, The Laboratory, Citadel Hill, Plymouth PL1 2PB, Devon, UK.

Publication information

BMC Bioinformatics. 2012 May 30;13:117. doi: 10.1186/1471-2105-13-117.

DOI: 10.1186/1471-2105-13-117
PMID: 22646090
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3413523/
Abstract

BACKGROUND

The generation of multiple sequence alignments (MSAs) is a crucial step for many bioinformatic analyses. Thus improving MSA accuracy and identifying potential errors in MSAs is important for a wide range of post-genomic research. We present a novel method called MergeAlign which constructs consensus MSAs from multiple independent MSAs and assigns an alignment precision score to each column.

RESULTS

Using conventional benchmark tests we demonstrate that on average MergeAlign MSAs are more accurate than MSAs generated using any single matrix of sequence substitution. We show that MergeAlign column scores are related to alignment precision and hence provide an ab initio method of estimating alignment precision in the absence of curated reference MSAs. Using two novel and independent alignment performance tests that utilise a large set of orthologous gene families we demonstrate that increasing MSA performance leads to an increase in the performance of downstream phylogenetic analyses.

CONCLUSION

Using multiple tests of alignment performance we demonstrate that this novel method has broad general application in biological research.
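
The idea of scoring each column by its agreement across several independent MSAs can be illustrated with a short sketch. This is not the published MergeAlign implementation. It assumes that a column is identified by the tuple of residue indices it aligns (with `None` for a gap) and that a column's score is the fraction of input alignments containing that exact column. The function names `columns`, `column_support`, and `score_alignment` are hypothetical, and the toy alignments are made up for illustration.

```python
from collections import Counter


def columns(alignment):
    """Turn an alignment (list of equal-length gapped strings) into columns.

    Each column is a tuple holding, per sequence, the index of the residue
    placed in that column, or None where the sequence has a gap ('-').
    """
    next_index = [0] * len(alignment)
    cols = []
    for pos in range(len(alignment[0])):
        col = []
        for s, row in enumerate(alignment):
            if row[pos] == "-":
                col.append(None)
            else:
                col.append(next_index[s])
                next_index[s] += 1
        cols.append(tuple(col))
    return cols


def column_support(alignments):
    """Map each distinct column to the fraction of alignments containing it."""
    counts = Counter()
    for aln in alignments:
        counts.update(set(columns(aln)))
    total = len(alignments)
    return {col: n / total for col, n in counts.items()}


def score_alignment(target, alignments):
    """Score every column of `target` by its support among `alignments`."""
    support = column_support(alignments)
    return [support.get(col, 0.0) for col in columns(target)]


if __name__ == "__main__":
    # Three toy alignments of the same two sequences (assumed input).
    a1 = ["AC-GT", "ACAGT"]
    a2 = ["ACG-T", "ACAGT"]
    a3 = ["AC-GT", "ACAGT"]
    for col, score in zip(columns(a1), score_alignment(a1, [a1, a2, a3])):
        print(col, round(score, 2))
```

Columns found in every input alignment score 1.0, while columns on which the inputs disagree score lower, so they can be flagged without a curated reference alignment. Building a consensus alignment from these scored columns would need an additional step, which this sketch does not attempt.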

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3577/3413523/88a139f6325c/1471-2105-13-117-1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3577/3413523/a4609b3fcc81/1471-2105-13-117-2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3577/3413523/c1365b1fa9de/1471-2105-13-117-3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3577/3413523/a175d3ebb917/1471-2105-13-117-4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3577/3413523/14c4c5cd68cb/1471-2105-13-117-5.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3577/3413523/a3d06337f4c1/1471-2105-13-117-6.jpg

Similar articles

1
MergeAlign: improving multiple sequence alignment performance by dynamic reconstruction of consensus multiple sequence alignments.
BMC Bioinformatics. 2012 May 30;13:117. doi: 10.1186/1471-2105-13-117.
2
SuiteMSA: visual tools for multiple sequence alignment comparison and molecular sequence simulation.
BMC Bioinformatics. 2011 May 21;12:184. doi: 10.1186/1471-2105-12-184.
3
Characterization of multiple sequence alignment errors using complete-likelihood score and position-shift map.
BMC Bioinformatics. 2016 Mar 18;17:133. doi: 10.1186/s12859-016-0945-5.
4
An alignment confidence score capturing robustness to guide tree uncertainty.
Mol Biol Evol. 2010 Aug;27(8):1759-67. doi: 10.1093/molbev/msq066. Epub 2010 Mar 5.
5
Protein multiple sequence alignment benchmarking through secondary structure prediction.
Bioinformatics. 2017 May 1;33(9):1331-1337. doi: 10.1093/bioinformatics/btw840.
6
Obtaining extremely large and accurate protein multiple sequence alignments from curated hierarchical alignments.
Database (Oxford). 2020 Jan 1;2020. doi: 10.1093/database/baaa042.
7
TCS: a new multiple sequence alignment reliability measure to estimate alignment accuracy and improve phylogenetic tree reconstruction.
Mol Biol Evol. 2014 Jun;31(6):1625-37. doi: 10.1093/molbev/msu117. Epub 2014 Apr 1.
8
Evaluating the usefulness of alignment filtering methods to reduce the impact of errors on evolutionary inferences.
BMC Evol Biol. 2019 Jan 11;19(1):21. doi: 10.1186/s12862-019-1350-2.
9
Current Methods for Automated Filtering of Multiple Sequence Alignments Frequently Worsen Single-Gene Phylogenetic Inference.
Syst Biol. 2015 Sep;64(5):778-91. doi: 10.1093/sysbio/syv033. Epub 2015 Jun 1.
10
The effect of the guide tree on multiple sequence alignments and subsequent phylogenetic analyses.
Pac Symp Biocomput. 2008:25-36. doi: 10.1142/9789812776136_0004.

Cited by

1
Increased chloroplast area in the rice bundle sheath through cell-specific perturbation of brassinosteroid signaling.
Plant Physiol. 2025 Mar 28;197(4). doi: 10.1093/plphys/kiaf108.
2
BetaAlign: a deep learning approach for multiple sequence alignment.
Bioinformatics. 2024 Dec 26;41(1). doi: 10.1093/bioinformatics/btaf009.
3
AAindexNC: Estimating the Physicochemical Properties of Non-Canonical Amino Acids, Including Those Derived from the PDB and PDBeChem Databank.
Int J Mol Sci. 2024 Nov 22;25(23):12555. doi: 10.3390/ijms252312555.
4
TPMA: A two pointers meta-alignment tool to ensemble different multiple nucleic acid sequence alignments.
PLoS Comput Biol. 2024 Apr 1;20(4):e1011988. doi: 10.1371/journal.pcbi.1011988. eCollection 2024 Apr.
5
Multiple ubiquitin E3 ligase genes antagonistically regulate chloroplast-associated protein degradation.
Curr Biol. 2023 Mar 27;33(6):1138-1146.e5. doi: 10.1016/j.cub.2023.01.060. Epub 2023 Feb 22.
6
The Chromosome Number and rDNA Loci Evolution in (Fabaceae).
Int J Mol Sci. 2022 Sep 20;23(19):11033. doi: 10.3390/ijms231911033.
7
Molecular and Cytogenetic Analysis of rDNA Evolution in Crepis Sensu Lato.
Int J Mol Sci. 2022 Mar 26;23(7):3643. doi: 10.3390/ijms23073643.
8
Descending Dysploidy and Bidirectional Changes in Genome Size Accompanied (Asteraceae) Evolution.
Genes (Basel). 2021 Sep 17;12(9):1436. doi: 10.3390/genes12091436.
9
Ligand-Receptor Interaction: AMA-1 Contains Small Regions Governing Bovine Erythrocyte Binding.
Int J Mol Sci. 2021 Jan 13;22(2):714. doi: 10.3390/ijms22020714.
10
LMAP_S: Lightweight Multigene Alignment and Phylogeny eStimation.
BMC Bioinformatics. 2019 Dec 30;20(1):739. doi: 10.1186/s12859-019-3292-5.