Group SCAD regression analysis for microarray time course gene expression data.

Authors

Wang Lifeng, Chen Guang, Li Hongzhe

Affiliation

Department of Biostatistics and Epidemiology, University of Pennsylvania School of Medicine, Philadelphia, PA 19104, USA.

Publication

Bioinformatics. 2007 Jun 15;23(12):1486-94. doi: 10.1093/bioinformatics/btm125. Epub 2007 Apr 26.

Abstract

MOTIVATION

Since many important biological systems or processes are dynamic, it is important to study gene expression patterns over time on a genomic scale in order to capture the dynamic behavior of gene expression. Microarray technologies have made it possible to measure the expression levels of essentially all genes during a given biological process. To determine the transcription factors (TFs) involved in gene regulation during such a process, we propose a functional response model with varying coefficients to model the transcriptional effects on gene expression levels. We also develop a group smoothly clipped absolute deviation (SCAD) regression procedure for selecting the TFs with varying coefficients that are involved in gene regulation during the process.

RESULTS

Simulation studies indicated that such a procedure is quite effective in selecting the relevant variables with time-varying coefficients and in estimating the coefficients. Application to the yeast cell cycle microarray time course gene expression data set identified 19 of the 21 known TFs related to the cell cycle process. In addition, we have identified another 52 TFs that also have periodic transcriptional effects on gene expression during the cell cycle process. Compared to simple linear regression (SLR) analysis at each time point, our procedure identified more known cell cycle related TFs.

CONCLUSIONS

The proposed group SCAD regression procedure is very effective for identifying variables with time-varying coefficients, in particular, for identifying the TFs that are related to gene expression over time. By identifying the TFs that are related to gene expression variations over time, the procedure can potentially provide more insight into the gene regulatory networks.
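The abstract names the group SCAD penalty without stating it. As a minimal sketch, the standard SCAD penalty of Fan and Li (2001) can be applied to the Euclidean norm of each coefficient group, so that a whole group (for instance, all coefficients describing one TF's time-varying effect) is shrunk to zero or kept together. The function names, the grouping by Euclidean norm, and the default a = 3.7 are assumptions made for illustration here; the abstract does not confirm that the authors' procedure takes exactly this form.

```python
import math


def scad_penalty(theta, lam, a=3.7):
    """Standard SCAD penalty evaluated at |theta| (assumes lam > 0, a > 2)."""
    t = abs(theta)
    if t <= lam:
        # Linear (lasso-like) region near zero.
        return lam * t
    if t <= a * lam:
        # Quadratic transition region.
        return (2 * a * lam * t - t * t - lam * lam) / (2 * (a - 1))
    # Constant region: large coefficients receive no extra penalty.
    return lam * lam * (a + 1) / 2


def group_scad_penalty(groups, lam, a=3.7):
    """Sum of SCAD penalties on the Euclidean norm of each group.

    `groups` is a list of coefficient lists, one list per variable
    (hypothetical layout chosen for this sketch).
    """
    total = 0.0
    for coeffs in groups:
        norm = math.sqrt(sum(c * c for c in coeffs))
        total += scad_penalty(norm, lam, a)
    return total
```

Because the penalty is flat beyond a * lam, groups with large norms are not shrunk further, while groups with small norms are penalized as in the lasso; this is the usual motivation for SCAD over an L1 penalty.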
