PathIntegrate: Multivariate modelling approaches for pathway-based multi-omics data integration.

Author affiliations

Section of Bioinformatics, Division of Systems Medicine, Department of Metabolism, Digestion, and Reproduction, Faculty of Medicine, Imperial College London, London, United Kingdom.

Toxalim (Research Centre in Food Toxicology), Université de Toulouse, INRAE, ENVT, INP-Purpan, UPS, Toulouse, France.

Publication information

PLoS Comput Biol. 2024 Mar 25;20(3):e1011814. doi: 10.1371/journal.pcbi.1011814. eCollection 2024 Mar.

Abstract

As terabytes of multi-omics data are being generated, there is an ever-increasing need for methods facilitating the integration and interpretation of such data. Current multi-omics integration methods typically output lists, clusters, or subnetworks of molecules related to an outcome. Even with expert domain knowledge, discerning the biological processes involved is a time-consuming activity. Here we propose PathIntegrate, a method for integrating multi-omics datasets based on pathways, designed to exploit knowledge of biological systems and thus provide interpretable models for such studies. PathIntegrate employs single-sample pathway analysis to transform multi-omics datasets from the molecular level to the pathway level, and applies a predictive single-view or multi-view model to integrate the data. Model outputs include multi-omics pathways ranked by their contribution to the outcome prediction, the contribution of each omics layer, and the importance of each molecule in a pathway. Using semi-synthetic data we demonstrate the benefit of grouping molecules into pathways to detect signals in low signal-to-noise scenarios, as well as the ability of PathIntegrate to precisely identify important pathways at low effect sizes. Finally, using COPD and COVID-19 data we showcase how PathIntegrate enables convenient integration and interpretation of complex high-dimensional multi-omics datasets. PathIntegrate is available as an open-source Python package.
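
The molecular-to-pathway transformation described in the abstract can be illustrated with a minimal sketch. This is not the PathIntegrate API: the function names (`zscore`, `pathway_scores`), the toy data, and the choice of a mean z-score as the single-sample pathway score are all assumptions made for illustration, and the predictive modelling step is omitted. The contrast between pooling all omics layers before scoring and scoring each layer separately is one plausible reading of "single-view" versus "multi-view", not a statement of the package's implementation.

```python
from statistics import mean, pstdev


def zscore(values):
    """Standardise one molecule's abundances across samples."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def pathway_scores(abundances, pathways):
    """Collapse a molecule-level table into a pathway-level table.

    abundances: {molecule_id: [value per sample]}
    pathways:   {pathway_id: [molecule_id, ...]}
    Returns     {pathway_id: [score per sample]}, where each score is the
    mean z-score of the pathway's measured members (an assumed, simple
    scoring rule chosen for illustration).
    """
    z = {mol: zscore(vals) for mol, vals in abundances.items()}
    scores = {}
    for pathway, members in pathways.items():
        measured = [z[m] for m in members if m in z]
        if not measured:
            continue  # no member of this pathway was measured
        scores[pathway] = [mean(col) for col in zip(*measured)]
    return scores


# Toy, made-up data: three samples, two omics layers (hypothetical IDs).
metabolomics = {"met_A": [1.0, 2.0, 3.0], "met_B": [2.0, 4.0, 6.0]}
proteomics = {"prot_X": [5.0, 5.0, 8.0]}
toy_pathways = {
    "pathway_1": ["met_A", "met_B", "prot_X"],
    "pathway_2": ["met_C"],  # never measured, so it is dropped
}

# Single-view style: pool molecules from all layers, then score pathways.
single_view = pathway_scores({**metabolomics, **proteomics}, toy_pathways)

# Multi-view style: score pathways separately within each omics layer.
multi_view = {
    "metabolomics": pathway_scores(metabolomics, toy_pathways),
    "proteomics": pathway_scores(proteomics, toy_pathways),
}
```

In a full workflow, the resulting pathway-level tables would be passed to a predictive model whose loadings or importance measures rank the pathways; that step depends on the chosen model and is left out here.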

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f44c/10994553/14212068a730/pcbi.1011814.g001.jpg
