Comparative Genomics group, Bioinformatics and Genomics Programme, Centre for Genomic Regulation, Barcelona, Spain.
BMC Bioinformatics. 2010 Jan 13;11:24. doi: 10.1186/1471-2105-11-24.
Many bioinformatics analyses, ranging from gene clustering to phylogenetics, produce hierarchical trees as their main result. These trees represent the relationships among biological entities, which facilitates their analysis and interpretation. A number of standalone programs focus on tree visualization or perform specific analyses on trees. However, such applications are rarely suitable for large-scale surveys, which require a higher level of automation. Because many genome-wide analyses now rely on tree-like data representations, there is a growing need for scalable tools to handle tree structures.
Here we present the Environment for Tree Exploration (ETE), a Python programming toolkit that assists in the automated manipulation, analysis and visualization of hierarchical trees. ETE libraries provide a broad set of tree handling options as well as specific methods to analyze phylogenetic and clustering trees. Among other features, ETE allows for the independent analysis of tree partitions, supports the extended Newick format, provides an integrated node annotation system and allows trees to be linked to external data such as multiple sequence alignments or numerical arrays. In addition, ETE implements a number of built-in analytical tools, including phylogeny-based orthology prediction and cluster validation techniques. Finally, ETE's programmable tree drawing engine can be used to automate the graphical rendering of trees with customized node-specific visualizations.
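To make the ideas of Newick parsing, node annotation and independent analysis of tree partitions more concrete, the following is a minimal standard-library sketch. It is not ETE's implementation and does not use ETE's API: the names `Node`, `parse_newick` and `features` are hypothetical, and the parser handles only basic Newick (node names and branch lengths), not the extended format.

```python
class Node:
    """Tree node with a name, a branch length, children and free-form annotations.

    Hypothetical illustration only; this is not ETE's tree class.
    """

    def __init__(self, name="", dist=0.0):
        self.name = name
        self.dist = dist
        self.children = []
        self.parent = None
        self.features = {}  # arbitrary per-node annotations

    def add_child(self, child):
        child.parent = self
        self.children.append(child)
        return child

    def traverse(self):
        """Yield this node and all of its descendants in preorder."""
        yield self
        for child in self.children:
            yield from child.traverse()

    def leaves(self):
        """Return the terminal nodes below (or equal to) this node."""
        return [node for node in self.traverse() if not node.children]


def parse_newick(text):
    """Parse a basic Newick string (names and branch lengths only)."""
    text = text.strip()
    if not text.endswith(";"):
        raise ValueError("Newick string must end with ';'")
    pos = 0

    def read_label(node):
        # Read an optional name followed by an optional ":length".
        nonlocal pos
        start = pos
        while text[pos] not in ",():;":
            pos += 1
        node.name = text[start:pos].strip()
        if text[pos] == ":":
            pos += 1
            start = pos
            while text[pos] not in ",();":
                pos += 1
            node.dist = float(text[start:pos])

    def read_node():
        # A node is either a leaf label or "(child,child,...)" plus a label.
        nonlocal pos
        node = Node()
        if text[pos] == "(":
            pos += 1
            node.add_child(read_node())
            while text[pos] == ",":
                pos += 1
                node.add_child(read_node())
            if text[pos] != ")":
                raise ValueError("Unbalanced parentheses in Newick string")
            pos += 1
        read_label(node)
        return node

    root = read_node()
    if text[pos] != ";":
        raise ValueError("Unexpected characters before the final ';'")
    return root


tree = parse_newick("((A:1,B:2)AB:0.5,C:3);")

# Annotate every node with the number of leaves it contains.
for node in tree.traverse():
    node.features["size"] = len(node.leaves())

# A partition (subtree) can be analyzed on its own, like a full tree.
clade = tree.children[0]
leaf_names = [leaf.name for leaf in tree.leaves()]
clade_names = [leaf.name for leaf in clade.leaves()]
```

Because every internal node exposes the same methods as the root, any partition can be passed to the same analysis code as the whole tree, which is the property that makes automated, large-scale surveys practical.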
ETE provides a complete set of methods to manipulate tree data structures, extending the functionality currently available in more general-purpose bioinformatics toolkits. ETE is free software and can be downloaded from http://ete.cgenomics.org.