

A modification of the PHYLIP program: A solution for the redundant cluster problem, and an implementation of an automatic bootstrapping on trees inferred from original data.

Authors

Shimada Makoto K, Nishida Tsunetoshi

Author affiliations

Institute for Comprehensive Medical Science, Fujita Health University, 1-98 Dengakugakubo, Kutsukake-cho, Toyoake, Aichi 470-1192, Japan.


Publication information

Mol Phylogenet Evol. 2017 Apr;109:409-414. doi: 10.1016/j.ympev.2017.02.012. Epub 2017 Feb 20.

Abstract

Felsenstein's PHYLIP package of molecular phylogeny tools has been used globally since 1980. The programs are receiving renewed attention because their character-based user interface is scriptable, which suits large-scale data studies on supercomputers or massively parallel computing clusters. However, we found that the text file output by the PHYLIP Consense program occasionally lists two or more divided bootstrap values for the same cluster in its result table. When this happens, the output Newick tree file incorrectly assigns only the last of these values to that cluster, which disturbs correct estimation of the consensus tree. We ascertained the cause of this aberrant behavior in the bootstrapping calculation. Our rewrite of the Consense source code outputs bootstrap values without redundancy in its result table, together with a Newick tree file carrying the corresponding bootstrap values. Furthermore, we developed an add-on program, add_bootstrap.pl, and a shell script, fasta2tre_bs.bsh. The add-on program generates a Newick tree containing the topology and branch lengths inferred from the original data along with valid bootstrap values. The shell script automates the inference, from multiple unaligned sequences, of a phylogenetic tree that contains the originally inferred topology and branch lengths together with bootstrap values. These programs can be downloaded at: https://github.com/ShimadaMK/PHYLIP_enhance/.
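The redundant cluster problem described in the abstract can be illustrated with a small Python sketch. It is not the authors' code and does not reproduce the Consense internals, whose fault the abstract does not specify. It assumes that the divided values for one cluster are partial counts that should be summed, which the wording suggests but does not state outright. The function name `merge_redundant_clusters` and the example table are hypothetical.

```python
from collections import defaultdict


def merge_redundant_clusters(rows):
    """Sum support counts for rows that describe the same cluster.

    rows: iterable of (members, count) pairs, where members is a set of
    taxon names. Returns a dict keyed by frozenset of taxon names.
    """
    totals = defaultdict(float)
    for members, count in rows:
        # frozenset makes the key independent of the order of taxa
        totals[frozenset(members)] += count
    return dict(totals)


# Hypothetical result table: the cluster {A, B} appears twice,
# with its support divided between the two rows.
rows = [
    ({"A", "B"}, 60.0),
    ({"C", "D"}, 95.0),
    ({"B", "A"}, 35.0),
]

merged = merge_redundant_clusters(rows)

# Mimics the reported symptom: a later row overwrites an earlier one,
# so only the last value is kept for the duplicated cluster.
last_only = {frozenset(members): count for members, count in rows}

print(merged[frozenset({"A", "B"})])     # 95.0
print(last_only[frozenset({"A", "B"})])  # 35.0
```

In this toy table, keeping only the last row understates the support for {A, B} (35 instead of 95), which is the kind of distortion the abstract attributes to the unmodified Newick output.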

