Department of Pediatrics, School of Medicine, University of California, San Diego, California 92093, USA.
Center for Microbiome Innovation, Jacobs School of Engineering, University of California San Diego, La Jolla, California 92093, USA.
Genome Res. 2021 Nov;31(11):2131-2137. doi: 10.1101/gr.275777.121. Epub 2021 Sep 3.
The number of publicly available microbiome samples is continually growing. As data sets grow, bottlenecks arise in standard analytical pipelines. Faith's phylogenetic diversity (Faith's PD) is a widely used phylogenetic alpha diversity metric that has so far not scaled effectively to trees with millions of vertices. Stacked Faith's phylogenetic diversity (SFPhD) implements a computationally efficient algorithm that enables calculation of this metric at a much larger scale. Compared with previous approaches, the algorithm requires fewer computational resources, which makes the software more accessible and reduces its carbon footprint. The new algorithm produces results identical to those of the previous method. We further demonstrate that the phylogenetic aspect of Faith's PD provides increased power in detecting diversity differences between younger and older populations in the FINRISK study's metagenomic data.
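For readers unfamiliar with the metric, Faith's PD for a sample is the total length of the tree branches spanned by the taxa observed in that sample. The following is a minimal sketch of that definition only, not of the SFPhD algorithm described in the abstract. The tree encoding (a `parent` map and a `length` map), the function name `faith_pd`, and the toy tree are all hypothetical and chosen for illustration; the sketch also assumes the convention that branches up to the root are included.

```python
from typing import Dict, Iterable, Optional


def faith_pd(
    parent: Dict[str, Optional[str]],
    length: Dict[str, float],
    observed_tips: Iterable[str],
) -> float:
    """Sum the lengths of all branches on paths from observed tips to the root.

    parent maps each node to its parent (the root maps to None).
    length maps each non-root node to the length of the branch above it.
    Each branch is counted once, however many observed tips sit below it.
    """
    visited = set()
    total = 0.0
    for tip in observed_tips:
        node: Optional[str] = tip
        # Walk toward the root, stopping once we reach a branch already counted.
        while node is not None and node not in visited:
            visited.add(node)
            total += length.get(node, 0.0)  # the root has no branch above it
            node = parent.get(node)
    return total


# Hypothetical toy tree: root -> A -> (t1, t2), and root -> t3.
parent = {"t1": "A", "t2": "A", "A": "root", "t3": "root", "root": None}
length = {"t1": 0.5, "t2": 0.5, "A": 1.0, "t3": 2.0}

print(faith_pd(parent, length, ["t1"]))              # 1.5
print(faith_pd(parent, length, ["t1", "t2"]))        # 2.0
print(faith_pd(parent, length, ["t1", "t3"]))        # 3.5
print(faith_pd(parent, length, ["t1", "t2", "t3"]))  # 4.0
```

In the toy tree, adding `t2` to a sample that already contains `t1` raises the value by only 0.5, because the shared branch above `A` is counted once. A count of observed taxa would treat `t2` and the more distant `t3` the same, which is the distinction the phylogenetic aspect of the metric captures.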