Department of Biostatistics, University of Florida, Gainesville, FL, USA.
Department of Population Health Science and Policy, Icahn School of Medicine at Mount Sinai, New York, NY, USA.
BMC Bioinformatics. 2024 Mar 18;25(1):117. doi: 10.1186/s12859-024-05689-7.
Differential network (DN) analysis of microbiome data has seen a recent breakthrough with the advent of next-generation sequencing technologies. DN analysis disentangles the microbial co-abundance among taxa by comparing network properties between two or more graphs under different biological conditions. However, existing DN analysis methods for microbiome data do not adjust for other clinical differences between subjects.
We propose a Statistical Approach via Pseudo-value Information and Estimation for Differential Network Analysis (SOHPIE-DNA) that incorporates additional covariates such as continuous age and categorical BMI. SOHPIE-DNA is a regression technique based on jackknife pseudo-values and is straightforward to implement. We demonstrate through simulations that SOHPIE-DNA consistently achieves higher recall and F1-score than existing methods (NetCoMi and MDiNE) while maintaining similar precision and accuracy. Lastly, we apply SOHPIE-DNA to two real datasets, from the American Gut Project and the Diet Exchange Study, to showcase its utility. The analysis of the Diet Exchange Study shows that SOHPIE-DNA can also model the temporal change in the connectivity of taxa while including additional covariates. In these analyses, our method identified taxa related to the prevention of intestinal inflammation and to the severity of fatigue in advanced metastatic cancer patients.
SOHPIE-DNA is the first attempt to introduce a regression framework for DN analysis of microbiome data. This enables prediction of the connectivity characteristics of a network in the presence of additional covariate information in the regression. The R package, named SOHPIE (pronounced "Sofie"), is available with a vignette through the CRAN repository (https://CRAN.R-project.org/package=SOHPIE). The source code and user manual can be found at https://github.com/sjahnn/SOHPIE-DNA.
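The general idea of a jackknife pseudo-value regression for network connectivity can be illustrated with a minimal Python sketch. It is not the SOHPIE package's implementation. It assumes several things the abstract does not state: co-abundance is approximated by Pearson correlation, connectivity is taken as a weighted degree, pseudo-values are computed within each group, and the regression is fitted by ordinary least squares. All function names and the synthetic data are hypothetical.

```python
import numpy as np

def connectivity(abund):
    """Weighted degree per taxon: sum of |Pearson correlation| with all other taxa.
    (Assumption: a stand-in for whatever co-abundance estimator is actually used.)"""
    corr = np.corrcoef(abund, rowvar=False)
    np.fill_diagonal(corr, 0.0)
    return np.abs(corr).sum(axis=0)

def jackknife_pseudo_values(abund):
    """Pseudo-value for sample i and each taxon: n * theta_hat - (n - 1) * theta_hat_without_i."""
    n = abund.shape[0]
    full = connectivity(abund)
    pseudo = np.empty(abund.shape, dtype=float)
    for i in range(n):
        leave_one_out = connectivity(np.delete(abund, i, axis=0))
        pseudo[i] = n * full - (n - 1) * leave_one_out
    return pseudo

def pseudo_values_by_group(abund, group):
    """Compute pseudo-values separately within each group (assumed design)."""
    pseudo = np.empty(abund.shape, dtype=float)
    for g in np.unique(group):
        idx = np.flatnonzero(group == g)
        pseudo[idx] = jackknife_pseudo_values(abund[idx])
    return pseudo

def fit_pseudo_value_regression(pseudo, covariates):
    """Least-squares fit of the pseudo-values on an intercept plus covariates.
    Returns coefficients with shape (1 + n_covariates, n_taxa)."""
    design = np.column_stack([np.ones(len(covariates)), covariates])
    coef, *_ = np.linalg.lstsq(design, pseudo, rcond=None)
    return coef

# Synthetic data, purely illustrative: 40 samples, 5 taxa, two groups, one covariate.
rng = np.random.default_rng(0)
n_samples, n_taxa = 40, 5
abund = rng.normal(size=(n_samples, n_taxa))
group = np.repeat([0, 1], n_samples // 2)
age = rng.uniform(20, 70, size=n_samples)

pseudo = pseudo_values_by_group(abund, group)
coef = fit_pseudo_value_regression(pseudo, np.column_stack([group, age]))
group_effect = coef[1]  # per-taxon change in connectivity between groups, adjusted for age
```

In this sketch, the row of coefficients for the group indicator plays the role of the covariate-adjusted differential connectivity of each taxon. Because every sample receives its own pseudo-value, subject-level covariates such as age can enter an ordinary regression.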