Linking metabolic network features to phenotypes using sparse group lasso.

Author information

Algorithmic Bioinformatics, Bonn-Aachen International Center for IT, Bonn D-53113, Germany.

DIMNP UMR CNRS 5235, University of Montpellier, Montpellier, France.

Publication information

Bioinformatics. 2017 Nov 1;33(21):3445-3453. doi: 10.1093/bioinformatics/btx427.

Abstract

MOTIVATION

Integrating metabolic networks with '-omics' data has been a subject of recent research, with the aim of better understanding how such networks behave across different biological and clinical phenotypes. Under the assumptions of steady state of the reaction network and non-negativity of fluxes, a metabolic network can be algebraically decomposed into a set of sub-pathways often referred to as extreme currents (ECs). Our objective is to find statistical associations between such sub-pathways and given clinical outcomes, which results in a particular instance of a self-contained gene set analysis method. To this end, we propose a method based on sparse group lasso (SGL) that identifies phenotype-associated ECs from gene expression data. SGL selects a sparse set of feature groups and also introduces sparsity within each group. Features in our model are clusters of ECs, and feature groups are defined based on correlations among these features.
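To illustrate the two levels of sparsity that SGL introduces, the following is a minimal Python sketch of the proximal operator of a sparse group lasso penalty. It is not the authors' R implementation. The function name `sgl_prox`, the mixing parameter `alpha`, the penalty weight `lam` and the square-root group-size weighting are assumptions that follow a commonly used formulation of the penalty; the abstract does not specify any of them.

```python
import numpy as np

def sgl_prox(beta, groups, lam, alpha, step=1.0):
    """Proximal operator of an assumed sparse group lasso penalty:
    lam * (alpha * ||beta||_1 + (1 - alpha) * sum_g sqrt(p_g) * ||beta_g||_2).
    """
    beta = np.asarray(beta, dtype=float)
    groups = np.asarray(groups)

    # Lasso part: elementwise soft-thresholding gives within-group sparsity.
    soft = np.sign(beta) * np.maximum(np.abs(beta) - step * lam * alpha, 0.0)

    # Group lasso part: shrink each group's norm; a group whose norm falls
    # below its threshold is set to zero as a whole (group-level sparsity).
    out = np.zeros_like(soft)
    for g in np.unique(groups):
        idx = groups == g
        norm = np.linalg.norm(soft[idx])
        thresh = step * lam * (1.0 - alpha) * np.sqrt(idx.sum())
        if norm > thresh:
            out[idx] = soft[idx] * (1.0 - thresh / norm)
    return out

# Hypothetical coefficients: three features in group 0, two in group 1.
beta = np.array([3.0, 0.2, -2.0, 0.3, -0.1])
groups = np.array([0, 0, 0, 1, 1])
shrunk = sgl_prox(beta, groups, lam=1.0, alpha=0.5)
print(shrunk)
```

In this toy input, the weak group 1 is removed entirely, while group 0 is kept but loses its small middle coefficient. This mirrors selecting a sparse set of feature groups and then a sparse set of features within each selected group.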

RESULTS

We apply our method to metabolic networks from the KEGG database and study the association of network features with prostate cancer (where the outcome is tumor versus normal) and with glioblastoma multiforme (where the outcome is survival time). In addition, simulations show that our method outperforms the global test, an existing self-contained gene set analysis method.

AVAILABILITY AND IMPLEMENTATION

R code (compatible with R version 3.2.5) is available from http://www.abi.bit.uni-bonn.de/index.php?id=17.

CONTACT

samal@combine.rwth-aachen.de or frohlich@bit.uni-bonn.de.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
