Blavatnik School of Computer Science, Tel Aviv University, Tel Aviv, Israel.
Faculty of Medical and Health Sciences, Tel Aviv University, Tel Aviv, Israel.
Nat Commun. 2024 Mar 23;15(1):2621. doi: 10.1038/s41467-024-46888-3.
Multi-omic studies of the human gut microbiome are crucial for understanding its role in disease across multiple functional layers. Nevertheless, integrating and analyzing such complex datasets poses significant challenges. Most notably, current analysis methods often yield extensive lists of disease-associated features (e.g., species, pathways, or metabolites), without capturing the multi-layered structure of the data. Here, we address this challenge by introducing "MintTea", an intermediate integration-based approach combining canonical correlation analysis extensions, consensus analysis, and an evaluation protocol. MintTea identifies "disease-associated multi-omic modules", comprising features from multiple omics that shift in concord and that collectively associate with the disease. Applied to diverse cohorts, MintTea captures modules with high predictive power, significant cross-omic correlations, and alignment with known microbiome-disease associations. For example, analyzing samples from a metabolic syndrome study, MintTea identifies a module with serum glutamate- and TCA cycle-related metabolites, along with bacterial species linked to insulin resistance. In another dataset, MintTea identifies a module associated with late-stage colorectal cancer, including Peptostreptococcus and Gemella species and fecal amino acids, in line with these species' metabolic activity and their coordinated gradual increase with cancer development. This work demonstrates the potential of advanced integration methods in generating systems-level, multifaceted hypotheses underlying microbiome-disease interactions.
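The idea of combining canonical correlation analysis with consensus analysis can be illustrated with a minimal sketch. This is not MintTea's implementation: the abstract does not specify which CCA extension or consensus procedure is used, so the sketch swaps in plain ridge-regularized two-view CCA, repeats it on random subsamples, and keeps features that rank among the top-weighted ones in most repeats. The function names (`first_canonical_weights`, `consensus_module`) and all parameters (`ridge`, `n_repeats`, `subsample`, `top_k`, `min_freq`) are hypothetical, and the disease label is left out for brevity.

```python
import numpy as np


def first_canonical_weights(X, Y, ridge=1e-2):
    """First pair of canonical weight vectors for two views (ridge-regularized CCA)."""
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    n = X.shape[0]
    Sxx = X.T @ X / (n - 1) + ridge * np.eye(X.shape[1])
    Syy = Y.T @ Y / (n - 1) + ridge * np.eye(Y.shape[1])
    Sxy = X.T @ Y / (n - 1)
    # Whiten each view with its Cholesky factor: M = Lx^{-1} Sxy Ly^{-T}
    Lx = np.linalg.cholesky(Sxx)
    Ly = np.linalg.cholesky(Syy)
    M = np.linalg.solve(Lx, Sxy)
    M = np.linalg.solve(Ly, M.T).T
    U, _, Vt = np.linalg.svd(M, full_matrices=False)
    # Map the leading singular vectors back to the original feature space
    a = np.linalg.solve(Lx.T, U[:, 0])
    b = np.linalg.solve(Ly.T, Vt[0])
    return a, b


def consensus_module(X, Y, n_repeats=50, subsample=0.8, top_k=3,
                     min_freq=0.8, seed=0):
    """Indices of features that are repeatedly top-weighted across subsamples.

    Assumption for illustration only: a feature joins the module if it is among
    the `top_k` largest absolute canonical weights in at least `min_freq` of
    the repeats.
    """
    rng = np.random.default_rng(seed)
    # Standardize so that weights are comparable across features
    X = (X - X.mean(axis=0)) / X.std(axis=0)
    Y = (Y - Y.mean(axis=0)) / Y.std(axis=0)
    n = X.shape[0]
    counts_x = np.zeros(X.shape[1])
    counts_y = np.zeros(Y.shape[1])
    for _ in range(n_repeats):
        idx = rng.choice(n, size=int(subsample * n), replace=False)
        a, b = first_canonical_weights(X[idx], Y[idx])
        counts_x[np.argsort(-np.abs(a))[:top_k]] += 1
        counts_y[np.argsort(-np.abs(b))[:top_k]] += 1
    keep_x = np.flatnonzero(counts_x / n_repeats >= min_freq)
    keep_y = np.flatnonzero(counts_y / n_repeats >= min_freq)
    return keep_x, keep_y


if __name__ == "__main__":
    # Synthetic data: the first three features of each view share one latent signal
    rng = np.random.default_rng(1)
    z = rng.normal(size=(200, 1))
    X = rng.normal(size=(200, 10))
    Y = rng.normal(size=(200, 8))
    X[:, :3] = z + 0.5 * rng.normal(size=(200, 3))
    Y[:, :3] = z + 0.5 * rng.normal(size=(200, 3))
    print(consensus_module(X, Y))
```

In this toy setting, the features that co-vary across the two views are the ones expected to survive the consensus step, which mirrors the notion of a module whose features "shift in concord". Linking such a module to disease status would require an additional step not shown here.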