Zhang Yufeng, Jing Gongchao, Chen Yuzhu, Li Jinhua, Su Xiaoquan
College of Computer Science and Technology, Qingdao University, Qingdao, Shandong 266071, China.
Single-Cell Center, Qingdao Institute of BioEnergy and Bioprocess Technology, Chinese Academy of Sciences, Qingdao, Shandong 266101, China.
Bioinform Adv. 2021 May 12;1(1):vbab003. doi: 10.1093/bioadv/vbab003. eCollection 2021.
Functional beta-diversity analysis across large numbers of microbiomes reveals the links between metabolic functions and their metadata. To evaluate microbiome beta-diversity, widely used distance metrics count only overlapping gene families and ignore their inherent relationships, which yields erroneous distances because high-dimensional functional profiles are sparse. Here we propose Hierarchical Meta-Storms (HMS) to tackle this problem. HMS contains two core components: (i) a dissimilarity algorithm that comprehensively measures functional distances among microbiomes using a multi-level metabolic hierarchy and (ii) a fast Principal Co-ordinates Analysis (PCoA) implementation, optimized by parallel computing, that deduces the beta-diversity pattern. Results showed that HMS can detect variations of microbial functions in upper-level metabolic pathways that are always missed by other methods. In addition, HMS computed the pairwise distance matrix and PCoA for 20 000 microbiomes in 3.9 h on a single computing node, which was 23 times faster and used 80% less RAM than existing methods, enabling in-depth data mining among microbiomes at high resolution. HMS takes microbiome functional profiles as input and produces their pairwise distance matrix and PCoA coordinates.
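The sparsity problem described above can be illustrated with a toy example. The sketch below is not the published HMS algorithm: it assumes a hypothetical two-level hierarchy (the gene-family IDs `K1` to `K4` and the pathway names are invented), uses Bray-Curtis as the overlap-only metric, and blends gene-level and pathway-level dissimilarity with an arbitrary weight. It only shows why rolling abundances up a metabolic hierarchy can separate samples that an overlap-only metric treats as equally distant.

```python
from collections import defaultdict

# Hypothetical two-level hierarchy: gene family -> parent pathway.
PARENT = {
    "K1": "pathwayA",
    "K2": "pathwayA",
    "K3": "pathwayB",
    "K4": "pathwayB",
}


def bray_curtis(p, q):
    """Bray-Curtis dissimilarity between two abundance dicts."""
    keys = set(p) | set(q)
    shared = sum(min(p.get(k, 0.0), q.get(k, 0.0)) for k in keys)
    total = sum(p.values()) + sum(q.values())
    return 1.0 - 2.0 * shared / total


def roll_up(profile, parent):
    """Sum gene-family abundances into their parent pathways."""
    out = defaultdict(float)
    for gene, abundance in profile.items():
        out[parent[gene]] += abundance
    return dict(out)


def hierarchical_distance(p, q, parent, weight=0.5):
    """Toy blend of gene-level and pathway-level dissimilarity.

    The blending weight is an illustrative assumption, not an HMS parameter.
    """
    d_leaf = bray_curtis(p, q)
    d_upper = bray_curtis(roll_up(p, parent), roll_up(q, parent))
    return weight * d_leaf + (1.0 - weight) * d_upper


# Three sparse samples that share no gene family with each other.
s1 = {"K1": 0.5, "K3": 0.5}
s2 = {"K2": 0.5, "K4": 0.5}  # same pathways as s1, different genes
s3 = {"K4": 1.0}             # pathwayB only

flat_12 = bray_curtis(s1, s2)
flat_13 = bray_curtis(s1, s3)
hier_12 = hierarchical_distance(s1, s2, PARENT)
hier_13 = hierarchical_distance(s1, s3, PARENT)
```

With these inputs the overlap-only metric gives the maximum distance for both pairs, so it cannot tell them apart, whereas the hierarchy-aware score places `s2` closer to `s1` than `s3` is, because `s1` and `s2` have identical pathway-level composition.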
HMS is implemented in C/C++ with parallel computing and released in two alternative forms: a standalone software package (https://github.com/qdu-bioinfo/hierarchical-meta-storms) and an equivalent R package (https://github.com/qdu-bioinfo/hrms).
Supplementary data are available at Bioinformatics Advances online.
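For readers unfamiliar with the PCoA step mentioned in the abstract, the following is a minimal sketch of textbook classical PCoA: square the distances, double-center them, and scale the leading eigenvectors by the square roots of their eigenvalues. It is not the parallel C/C++ implementation in HMS. To stay within the Python standard library it substitutes power iteration with deflation for a full eigensolver, and it assumes the leading eigenvalues are positive (that is, the distances are close to Euclidean). All function names are illustrative.

```python
import math


def double_center(dist):
    """Return B = -1/2 * J * D^2 * J for a symmetric distance matrix."""
    n = len(dist)
    sq = [[d * d for d in row] for row in dist]
    row_mean = [sum(row) / n for row in sq]
    grand = sum(row_mean) / n
    return [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + grand)
             for j in range(n)] for i in range(n)]


def top_eigenpair(mat, iters=500):
    """Power iteration for the dominant eigenpair of a symmetric matrix."""
    n = len(mat)
    vec = [1.0 + i for i in range(n)]  # deterministic start vector
    for _ in range(iters):
        nxt = [sum(mat[i][j] * vec[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in nxt))
        if norm == 0.0:
            return 0.0, vec
        vec = [x / norm for x in nxt]
    # Rayleigh quotient of the (unit-length) vector gives the eigenvalue.
    val = sum(vec[i] * sum(mat[i][j] * vec[j] for j in range(n))
              for i in range(n))
    return val, vec


def pcoa(dist, n_axes=2):
    """Classical PCoA coordinates, one list of axis values per sample."""
    b = double_center(dist)
    n = len(b)
    coords = [[] for _ in range(n)]
    for _ in range(n_axes):
        val, vec = top_eigenpair(b)
        if val <= 1e-12:  # no positive variance left to explain
            break
        scale = math.sqrt(val)
        for i in range(n):
            coords[i].append(vec[i] * scale)
        # Deflate: remove the component that was just extracted.
        b = [[b[i][j] - val * vec[i] * vec[j] for j in range(n)]
             for i in range(n)]
    return coords


def euclidean(a, b):
    """Euclidean distance between two coordinate lists."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Pairwise distances of a 3-4-5 right triangle.
triangle = [[0.0, 3.0, 4.0],
            [3.0, 0.0, 5.0],
            [4.0, 5.0, 0.0]]
coords = pcoa(triangle, n_axes=2)
```

Because the triangle distances are exactly Euclidean, the two recovered axes reproduce the input distances up to rotation and reflection. A production implementation would use an optimized eigensolver and would need to handle negative eigenvalues, which this sketch does not.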