Comparing the continuous representation of time-series expression profiles to identify differentially expressed genes.

Authors

Bar-Joseph Ziv, Gerber Georg, Simon Itamar, Gifford David K, Jaakkola Tommi S

Affiliation

Laboratory for Computer Science, Massachusetts Institute of Technology, 200 Technology Square, Cambridge, MA 02139, USA.

Publication

Proc Natl Acad Sci U S A. 2003 Sep 2;100(18):10146-51. doi: 10.1073/pnas.1732547100. Epub 2003 Aug 21.

Abstract

We present a general algorithm to detect genes differentially expressed between two nonhomogeneous time-series data sets. As increasing amounts of high-throughput biological data become available, a major challenge in genomic and computational biology is to develop methods for comparing data from different experimental sources. Time-series whole-genome expression data are a particularly valuable source of information because they can describe an unfolding biological process such as the cell cycle or immune response. However, comparisons of time-series expression data sets are hindered by biological and experimental inconsistencies such as differences in sampling rate, variations in the timing of biological processes, and the lack of repeats. Our algorithm overcomes these difficulties by using a continuous representation for time-series data and combining a noise model for individual samples with a global difference measure. We introduce a corresponding statistical method for computing the significance of this differential expression measure. We used our algorithm to compare cell-cycle-dependent gene expression in wild-type and knockout yeast strains. Our algorithm identified a set of 56 differentially expressed genes, and these results were validated by using independent protein-DNA-binding data. Unlike previous methods, our algorithm was also able to identify 22 non-cell-cycle-regulated genes as differentially expressed. This set of genes is significantly correlated in a set of independent expression experiments, suggesting additional roles for the transcription factors Fkh1 and Fkh2 in controlling cellular activity in yeast.
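The core idea in the abstract, comparing continuous curves rather than raw samples, can be illustrated with a minimal sketch. This is not the authors' implementation: piecewise-linear interpolation stands in for the paper's continuous representation, the mean squared difference over the shared time range stands in for the global difference measure, and the noise model and significance test are omitted. The function names `interpolate` and `global_difference` are hypothetical.

```python
from bisect import bisect_right


def interpolate(times, values, t):
    """Piecewise-linear value at time t (times sorted, at least two samples)."""
    if not times[0] <= t <= times[-1]:
        raise ValueError("t lies outside the sampled interval")
    # Index of the segment [times[i], times[i + 1]] that contains t.
    i = min(bisect_right(times, t) - 1, len(times) - 2)
    frac = (t - times[i]) / (times[i + 1] - times[i])
    return values[i] + frac * (values[i + 1] - values[i])


def global_difference(times_a, values_a, times_b, values_b, steps=1000):
    """Mean squared difference between two curves over their shared time range.

    Assumption: this is an illustrative stand-in, not the measure from the paper.
    """
    start = max(times_a[0], times_b[0])
    end = min(times_a[-1], times_b[-1])
    if end <= start:
        raise ValueError("the two series do not overlap in time")
    # Shared evaluation grid, so different sampling rates no longer matter.
    grid = [min(start + (end - start) * k / steps, end) for k in range(steps + 1)]
    squared = [
        (interpolate(times_a, values_a, t) - interpolate(times_b, values_b, t)) ** 2
        for t in grid
    ]
    # Trapezoid rule for the integral, then normalize by the interval length.
    width = (end - start) / steps
    area = sum((squared[k] + squared[k + 1]) / 2 for k in range(steps)) * width
    return area / (end - start)


# Hypothetical profiles for one gene, sampled at different rates.
wild_type = ([0, 1, 2, 3, 4], [0.0, 1.0, 2.0, 3.0, 4.0])
knockout = ([0, 2, 4], [1.0, 3.0, 5.0])
score = global_difference(*wild_type, *knockout)
```

Because both series are evaluated on one shared grid, the score does not depend on the two experiments having matching time points, which is the difficulty the abstract attributes to differing sampling rates.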

Similar articles

Identification of cell cycle-regulated genes in fission yeast.
Mol Biol Cell. 2005 Mar;16(3):1026-42. doi: 10.1091/mbc.e04-04-0299. Epub 2004 Dec 22.

Cited by

Modeling Persistent Trends in Distributions.
J Am Stat Assoc. 2018;113(523):1296-1310. doi: 10.1080/01621459.2017.1341412. Epub 2018 Jun 12.

Generalized Correlation Coefficient for Non-Parametric Analysis of Microarray Time-Course Data.
J Integr Bioinform. 2017 Jun 6;14(2):/j/jib.2017.14.issue-2/jib-2017-0011/jib-2017-0011.xml. doi: 10.1515/jib-2017-0011.
