Incorporating alignment uncertainty into Felsenstein's phylogenetic bootstrap to improve its reliability.

Authors

Chang Jia-Ming, Floden Evan W, Herrero Javier, Gascuel Olivier, Di Tommaso Paolo, Notredame Cedric

Affiliations

European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Cambridge CB10 1SD, UK.

Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona 08003, Spain.

Publication information

Bioinformatics. 2021 Jul 12;37(11):1506-1514. doi: 10.1093/bioinformatics/btz082.

DOI:10.1093/bioinformatics/btz082
PMID:30726875
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8275982/
Abstract

MOTIVATION

Most evolutionary analyses are based on a pre-estimated multiple sequence alignment (MSA). Wong et al. established that the choice of multiple sequence alignment introduces uncertainty when reconstructing phylogenies. They showed that in many cases different aligners produce different phylogenies, with no simple objective criterion sufficient to distinguish among these alternatives.

RESULTS

We demonstrate that incorporating MSA-induced uncertainty into bootstrap sampling can significantly increase the correlation between clade correctness and the corresponding bootstrap value. Our procedure involves concatenating several alternative multiple sequence alignments of the same sequences, produced using different commonly used aligners. We then draw bootstrap replicates while favoring columns from the more unique aligners among those concatenated. We named this concatenation and bootstrapping method Weighted Partial Super Bootstrap (wpSBOOT). We show on three simulated datasets of 16, 32 and 64 tips that our method improves the predictive power of bootstrap values. We also used as a benchmark an empirical collection of 853 one-to-one orthologous genes from seven yeast species and found that wpSBOOT significantly improves the capacity to discriminate between topologically correct and incorrect trees. Bootstrap values of wpSBOOT are comparable to similar readouts estimated using a single method. However, for trees reduced at 50% and 95% bootstrap thresholds, wpSBOOT yields the lowest Type I error (fewer false positives).
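
The concatenate-and-resample idea described above can be illustrated with a minimal Python sketch. This is not the T-Coffee implementation, and the abstract does not specify the weighting scheme. The following are all assumptions made for illustration: the uniqueness weight (the fraction of an aligner's columns found in no other alignment), the `floor` parameter, the choice to draw as many columns as an average single alignment contains, and every function name.

```python
import random


def column_signatures(alignment):
    """Describe each column by the residue index it holds per sequence (None for a gap)."""
    counters = [0] * len(alignment)
    signatures = []
    for col in range(len(alignment[0])):
        signature = []
        for row, seq in enumerate(alignment):
            if seq[col] == "-":
                signature.append(None)
            else:
                signature.append(counters[row])
                counters[row] += 1
        signatures.append(tuple(signature))
    return signatures


def aligner_weights(alignments, floor=0.05):
    """Hypothetical uniqueness weight: share of columns absent from all other alignments.

    The floor keeps every aligner's columns drawable even when none are unique.
    """
    signature_sets = [set(column_signatures(a)) for a in alignments]
    weights = []
    for i, own in enumerate(signature_sets):
        others = set().union(*(s for j, s in enumerate(signature_sets) if j != i))
        weights.append(len(own - others) / len(own) + floor)
    return weights


def weighted_bootstrap_replicate(alignments, rng, floor=0.05):
    """Draw one replicate from the concatenated columns, weighted by aligner uniqueness."""
    weights = aligner_weights(alignments, floor)
    columns, column_weights = [], []
    for alignment, weight in zip(alignments, weights):
        for col in range(len(alignment[0])):
            columns.append("".join(seq[col] for seq in alignment))
            column_weights.append(weight)
    # Assumption: a "partial" replicate is as long as an average single alignment.
    n_draws = round(len(columns) / len(alignments))
    picked = rng.choices(columns, weights=column_weights, k=n_draws)
    n_sequences = len(alignments[0])
    return ["".join(column[row] for column in picked) for row in range(n_sequences)]
```

Calling `weighted_bootstrap_replicate([aln_a, aln_b, aln_c], random.Random(0))` with three alignments of the same sequences (each a list of equal-length gapped strings in the same sequence order) returns one replicate alignment. Repeating the call and building a tree from each replicate would give the bootstrap support values.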

AVAILABILITY AND IMPLEMENTATION

The automated generation of replicates has been implemented in the T-Coffee package, which is available as open-source freeware from www.tcoffee.org.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/751f/8275982/3d818415b8a9/btz082f1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/751f/8275982/900c16848593/btz082f2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/751f/8275982/c8f8c7df4383/btz082f3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/751f/8275982/19af9988991f/btz082f4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/751f/8275982/f585d5aa708b/btz082f5.jpg

Similar articles

1. Incorporating alignment uncertainty into Felsenstein's phylogenetic bootstrap to improve its reliability.
   Bioinformatics. 2021 Jul 12;37(11):1506-1514. doi: 10.1093/bioinformatics/btz082.
2. Robustness of Felsenstein's Versus Transfer Bootstrap Supports With Respect to Taxon Sampling.
   Syst Biol. 2023 Dec 30;72(6):1280-1295. doi: 10.1093/sysbio/syad052.
3. Renewing Felsenstein's phylogenetic bootstrap in the era of big data.
   Nature. 2018 Apr;556(7702):452-456. doi: 10.1038/s41586-018-0043-0. Epub 2018 Apr 18.
4. A machine-learning-based alternative to phylogenetic bootstrap.
   Bioinformatics. 2024 Jun 28;40(Suppl 1):i208-i217. doi: 10.1093/bioinformatics/btae255.
5. TCS: a new multiple sequence alignment reliability measure to estimate alignment accuracy and improve phylogenetic tree reconstruction.
   Mol Biol Evol. 2014 Jun;31(6):1625-37. doi: 10.1093/molbev/msu117. Epub 2014 Apr 1.
6. Fast and accurate bootstrap confidence limits on genome-scale phylogenies using little bootstraps.
   Nat Comput Sci. 2021 Sep;1(9):573-577. doi: 10.1038/s43588-021-00129-5. Epub 2021 Sep 22.
7. Using the T-Coffee package to build multiple sequence alignments of protein, RNA, DNA sequences and 3D structures.
   Nat Protoc. 2011 Nov;6(11):1669-82. doi: 10.1038/nprot.2011.393.
8. R-Coffee: a method for multiple alignment of non-coding RNA.
   Nucleic Acids Res. 2008 May;36(9):e52. doi: 10.1093/nar/gkn174. Epub 2008 Apr 17.
9. Using tertiary structure for the computation of highly accurate multiple RNA alignments with the SARA-Coffee package.
   Bioinformatics. 2013 May 1;29(9):1112-9. doi: 10.1093/bioinformatics/btt096. Epub 2013 Feb 28.
10. Inferring species phylogenies from multiple genes: concatenated sequence tree versus consensus gene tree.
    J Exp Zool B Mol Dev Evol. 2005 Jan 15;304(1):64-74. doi: 10.1002/jez.b.21026.

Cited by

1. A machine-learning-based alternative to phylogenetic bootstrap.
   Bioinformatics. 2024 Jun 28;40(Suppl 1):i208-i217. doi: 10.1093/bioinformatics/btae255.
2. Muscle5: High-accuracy alignment ensembles enable unbiased assessments of sequence homology and phylogeny.
   Nat Commun. 2022 Nov 15;13(1):6968. doi: 10.1038/s41467-022-34630-w.
3. Build a better bootstrap and the RAWR shall beat a random path to your door: phylogenetic support estimation revisited.
   Bioinformatics. 2021 Jul 12;37(Suppl_1):i111-i119. doi: 10.1093/bioinformatics/btab263.