Biodesign Center for Fundamental and Applied Microbiomics, Arizona State University, Tempe, AZ, USA.
School of Life Sciences, Arizona State University, Tempe, AZ, USA.
Methods Mol Biol. 2022;2569:137-165. doi: 10.1007/978-1-0716-2691-7_7.
Phylogenomics is the inference of phylogenetic trees based on multiple marker genes sampled from the genomes of interest. An important challenge in phylogenomics is the potential incongruence among the evolutionary histories of individual genes, which can be widespread in microorganisms due to the prevalence of horizontal gene transfer. This protocol introduces the procedures for building a phylogenetic tree of a large number of microbial genomes using a broad sampling of marker genes that are representative of whole-genome evolution. The protocol highlights the use of a gene tree summary method, which can effectively reconstruct the species tree while accounting for the topological conflicts among individual gene trees. The pipeline described in this protocol is scalable to tens of thousands of genomes while retaining high accuracy. We discuss multiple software tools, libraries, and scripts to enable convenient adoption of the protocol. The protocol is suitable for microbiology and microbiome studies based on public genomes and metagenomic data.
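The idea of summarizing conflicting gene trees can be illustrated with a toy sketch. This is not the software used in the protocol, and the taxon labels, function names, and input data are all hypothetical. It assumes only four genomes, so each unrooted gene tree reduces to a single split (for example AB|CD), and it simply reports the split supported by the most gene trees. Real summary methods work on many taxa and far more gene trees.

```python
from collections import Counter

# Hypothetical genome labels; four taxa admit exactly three unrooted topologies.
TAXA = frozenset({"A", "B", "C", "D"})


def normalize(cherry):
    """Return the split (e.g. AB|CD) as an unordered pair of unordered pairs.

    A gene tree is given by the two taxa that group together, so
    ("A", "B") and ("C", "D") describe the same topology.
    """
    side = frozenset(cherry)
    if len(side) != 2 or not side <= TAXA:
        raise ValueError(f"invalid cherry: {cherry!r}")
    return frozenset({side, TAXA - side})


def summarize(gene_trees):
    """Pick the topology supported by the most gene trees and its frequency."""
    counts = Counter(normalize(tree) for tree in gene_trees)
    best, support = counts.most_common(1)[0]
    return best, support / len(gene_trees)


# Toy input (made up): five gene trees, two of which conflict with the rest.
gene_trees = [("A", "B"), ("C", "D"), ("A", "B"), ("A", "C"), ("B", "C")]
species_split, frequency = summarize(gene_trees)
```

Here three of the five gene trees agree on the split AB|CD, so that topology is chosen, and the two conflicting trees lower its support rather than being discarded.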