Key Laboratory of Marine Genetics and Breeding (OUC), Ministry of Education, Qingdao, China; College of Marine Life Sciences, Ocean University of China, Qingdao, China.
Key Laboratory of Marine Genetics and Breeding (OUC), Ministry of Education, Qingdao, China; College of Marine Life Sciences, Ocean University of China, Qingdao, China; Laboratory for Marine Biology and Biotechnology, Qingdao National Laboratory for Marine Science and Technology, Qingdao, China.
Genomics. 2018 Jan;110(1):18-22. doi: 10.1016/j.ygeno.2017.08.001. Epub 2017 Aug 3.
Organelle phylogenomic analysis requires precisely constructed multi-gene alignment matrices concatenated from pre-aligned single-gene datasets. For non-bioinformaticians, it can take days to weeks to manually create high-quality multi-gene alignments comprising tens or hundreds of homologous genes. Here, we describe a new and highly efficient pipeline, HomBlocks, which uses a homologous block searching method to construct multiple sequence alignments. This approach automatically recognizes locally collinear blocks among organelle genomes and extracts phylogenetically informative regions to construct a multiple sequence alignment in a few hours. In addition, HomBlocks supports organelle genomes without annotation and adapts to different taxon datasets, thereby enabling the inclusion of as many common genes as possible. Topology comparison of trees built from conventional multi-gene alignments and from HomBlocks alignments across different taxon categories shows that HomBlocks achieves the same efficiency as the traditional method. The availability of HomBlocks makes organelle phylogenetic analyses more accessible to non-bioinformaticians, thereby promising to lead to a better understanding of phylogenetic relationships at the organelle genome level.
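To make the concatenation step concrete, the following is a minimal Python sketch of joining pre-aligned blocks into a single alignment matrix. It is an illustration under stated assumptions, not HomBlocks code: HomBlocks itself is written in Perl, the function name `concatenate_blocks` and the toy sequences are hypothetical, and the detection of locally collinear blocks is not shown. The sketch assumes each block is already aligned and keeps only the taxa present in every block.

```python
from typing import Dict, List


def concatenate_blocks(blocks: List[Dict[str, str]]) -> Dict[str, str]:
    """Join pre-aligned blocks end to end into one alignment matrix.

    Each block maps a taxon name to its aligned sequence. Only taxa that
    occur in every block are kept (an assumption of this sketch).
    """
    if not blocks:
        return {}

    # Taxa shared by all blocks.
    shared = set(blocks[0])
    for block in blocks[1:]:
        shared &= set(block)

    parts: Dict[str, List[str]] = {taxon: [] for taxon in sorted(shared)}
    for index, block in enumerate(blocks):
        # Within an aligned block, every row must have the same length.
        lengths = {len(block[taxon]) for taxon in shared}
        if len(lengths) > 1:
            raise ValueError(
                f"block {index} is not aligned: row lengths {sorted(lengths)}"
            )
        for taxon in parts:
            parts[taxon].append(block[taxon])

    return {taxon: "".join(pieces) for taxon, pieces in parts.items()}


# Toy data: two hypothetical pre-aligned blocks for three taxa.
blocks = [
    {"taxonA": "ATG-CA", "taxonB": "ATGGCA", "taxonC": "ATGACA"},
    {"taxonA": "TTAA", "taxonB": "TT-A", "taxonC": "TTGA"},
]
supermatrix = concatenate_blocks(blocks)

# Print the concatenated matrix in FASTA format.
for taxon, sequence in supermatrix.items():
    print(f">{taxon}\n{sequence}")
```

Restricting the matrix to shared taxa keeps every row the same length, which is what downstream tree-building software expects. A real pipeline might instead pad missing taxa with gap characters.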
HomBlocks is implemented in Perl and is supported by Unix-like operating systems, including Linux and macOS. The Perl source code is freely available for download from https://github.com/fenghen360/HomBlocks.git, and documentation and tutorials are available at https://github.com/fenghen360/HomBlocks.