Directed partial correlation: inferring large-scale gene regulatory network through induced topology disruptions.

Affiliation

Cancer Research UK, Cambridge Research Institute, Cambridge, United Kingdom.

Publication information

PLoS One. 2011 Apr 6;6(4):e16835. doi: 10.1371/journal.pone.0016835.

Abstract

Inferring regulatory relationships among many genes from the temporal variation in their transcript abundance has been a popular research topic. Because of the nature of microarray experiments, classical tools for time series analysis lose power: the number of variables far exceeds the number of samples. In this paper, we describe some of the existing multivariate inference techniques that are applicable to hundreds of variables and show the potential challenges posed by small-sample, large-scale data. We propose a directed partial correlation (DPC) method as an efficient and effective solution to regulatory network inference from such data. The method is designed specifically for large-scale genomic datasets. It combines the efficiency of partial correlation, which sets up the network topology by testing conditional independence, with the concept of Granger causality, which assesses how the topology changes under induced interruptions. The idea is that when a transcription factor is induced artificially within a gene network, the disruption of the network caused by the induction signifies the gene's role in transcriptional regulation. Benchmarking results using GeneNetWeaver, the simulator for the DREAM challenges, provide strong evidence of the outstanding performance of the proposed DPC method. When applied to real biological data, the inferred starch metabolism network in Arabidopsis reveals many biologically meaningful network modules worthy of further investigation. These results collectively suggest that DPC is a versatile tool for genomics research. The R package DPC is available for download (http://code.google.com/p/dpcnet/).
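
The two ingredients named in the abstract, partial correlation for undirected topology and Granger causality for direction, can be illustrated with a minimal Python sketch. This is not the DPC package's implementation. The function names (`partial_correlation`, `granger_f_stat`), the diagonal shrinkage used to keep the covariance matrix invertible when genes outnumber samples, and the simple pairwise lag model are all assumptions made for illustration.

```python
import numpy as np


def partial_correlation(data, shrinkage=0.1):
    """Partial correlations from a shrunken covariance matrix.

    data: array of shape (n_samples, n_genes).
    shrinkage: weight in (0, 1] that pulls off-diagonal covariances toward
    zero (an assumed regularizer, so the matrix stays invertible when genes
    outnumber samples).
    """
    x = np.asarray(data, dtype=float)
    x = x - x.mean(axis=0)
    cov = x.T @ x / (x.shape[0] - 1)
    cov = (1.0 - shrinkage) * cov + shrinkage * np.diag(np.diag(cov))
    precision = np.linalg.inv(cov)
    scale = np.sqrt(np.diag(precision))
    # Partial correlation is the negated, rescaled precision matrix.
    pcor = -precision / np.outer(scale, scale)
    np.fill_diagonal(pcor, 1.0)
    return pcor


def _lagged(series, lag):
    """Columns series[t-1], ..., series[t-lag] for t = lag, ..., T-1."""
    n = len(series)
    return np.column_stack([series[lag - k:n - k] for k in range(1, lag + 1)])


def _residual_ss(design, response):
    """Residual sum of squares of an ordinary least-squares fit."""
    coef, *_ = np.linalg.lstsq(design, response, rcond=None)
    resid = response - design @ coef
    return float(resid @ resid)


def granger_f_stat(x, y, lag=1):
    """F statistic for the hypothesis that past x helps predict y.

    Compares y_t ~ past(y) against y_t ~ past(y) + past(x).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    response = y[lag:]
    intercept = np.ones((len(response), 1))
    restricted = np.hstack([intercept, _lagged(y, lag)])
    full = np.hstack([restricted, _lagged(x, lag)])
    rss_restricted = _residual_ss(restricted, response)
    rss_full = _residual_ss(full, response)
    dof = len(response) - full.shape[1]
    return ((rss_restricted - rss_full) / lag) / (rss_full / dof)


if __name__ == "__main__":
    # Synthetic example: a regulator drives a target gene with a one-step delay.
    rng = np.random.default_rng(0)
    regulator = rng.normal(size=200)
    target_gene = np.zeros(200)
    target_gene[1:] = 0.8 * regulator[:-1] + 0.1 * rng.normal(size=199)
    print("regulator -> target F:", granger_f_stat(regulator, target_gene))
    print("target -> regulator F:", granger_f_stat(target_gene, regulator))
```

In this toy setting, an edge would first be proposed where the partial correlation is clearly nonzero, and its direction would then be suggested by whichever of the two F statistics is larger. The published method applies this idea to induced perturbations of transcription factors, which the sketch does not model.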
