A bi-dimensional regression tree approach to the modeling of gene expression regulation.

Authors

Ruan Jianhua, Zhang Weixiong

Affiliation

Department of Computer Science and Engineering, Washington University in St Louis, St Louis, MO 63130, USA.

Publication information

Bioinformatics. 2006 Feb 1;22(3):332-40. doi: 10.1093/bioinformatics/bti792. Epub 2005 Nov 22.

Abstract

MOTIVATION

The transcriptional regulation of a gene depends on the binding of transcription factors to cis-regulatory elements in its promoter and on the expression levels of those transcription factors. Most existing approaches to studying transcriptional regulation model these dependencies separately, i.e. either from promoters to gene expression or from the expression levels of transcription factors to the expression levels of genes. Little effort has been devoted to a single model that integrates both dependencies.

RESULTS

We propose a novel method to model gene expression using both promoter sequences and the expression levels of putative regulators. The proposed method, called bi-dimensional regression tree (BDTree), extends a multivariate regression tree approach by applying it simultaneously to both the genes and the conditions of an expression matrix. The method produces hypotheses about the condition-specific binding motifs and regulators for each gene. As a by-product, the method also partitions the expression matrix into small submatrices in a way similar to bi-clustering. We propose and compare several splitting functions for building the tree. When applied to two microarray datasets of the yeast Saccharomyces cerevisiae, BDTree successfully identifies most motifs and regulators that are known to regulate the biological processes underlying the datasets. Compared with an existing algorithm, BDTree provides higher prediction accuracy in cross-validation.
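
To make the idea of splitting on both axes concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not specify the splitting functions, so this sketch assumes a plain squared-error criterion; it also assumes a binary gene-by-motif matrix `M`, a regulator-by-condition expression matrix `R`, and a hypothetical threshold of zero on regulator expression. All function and variable names are illustrative.

```python
import numpy as np

def sse(block):
    """Sum of squared deviations from the block mean (0 for an empty block)."""
    return float(((block - block.mean()) ** 2).sum()) if block.size else 0.0

def best_split(E, M, R, genes, conds):
    """Return (cost, axis, feature, part_a, part_b) for the cheapest split, or None."""
    sub = E[np.ix_(genes, conds)]
    best = None
    # Gene-side candidates: motif present vs. absent in the promoter.
    for m in range(M.shape[1]):
        has = M[genes, m] > 0
        if has.all() or not has.any():
            continue
        cost = sse(sub[has]) + sse(sub[~has])
        if best is None or cost < best[0]:
            best = (cost, "motif", m, genes[has], genes[~has])
    # Condition-side candidates: regulator above vs. not above zero
    # (the zero threshold is an assumption of this sketch).
    for r in range(R.shape[0]):
        up = R[r, conds] > 0
        if up.all() or not up.any():
            continue
        cost = sse(sub[:, up]) + sse(sub[:, ~up])
        if best is None or cost < best[0]:
            best = (cost, "regulator", r, conds[up], conds[~up])
    return best

def grow(E, M, R, genes, conds, depth=0, max_depth=3):
    """Recursively partition the submatrix E[genes, conds] along either axis."""
    sub = E[np.ix_(genes, conds)]
    split = best_split(E, M, R, genes, conds) if depth < max_depth else None
    # Stop when no candidate exists or the split does not reduce the error.
    if split is None or split[0] >= sse(sub) - 1e-12:
        return {"leaf": True, "genes": genes, "conds": conds, "mean": float(sub.mean())}
    _, axis, feature, a, b = split
    if axis == "motif":
        kids = [grow(E, M, R, a, conds, depth + 1, max_depth),
                grow(E, M, R, b, conds, depth + 1, max_depth)]
    else:
        kids = [grow(E, M, R, genes, a, depth + 1, max_depth),
                grow(E, M, R, genes, b, depth + 1, max_depth)]
    return {"leaf": False, "axis": axis, "feature": feature, "children": kids}

def leaves(node):
    """Collect the leaf submatrices, which play the role of bi-clusters."""
    if node["leaf"]:
        return [node]
    return [leaf for child in node["children"] for leaf in leaves(child)]

# Toy data (invented for illustration): genes 0-1 carry motif 0 and are
# induced only in conditions 0-1, where regulator 0 is up.
E = np.zeros((4, 4))
E[:2, :2] = 2.0
M = np.array([[1], [1], [0], [0]])
R = np.array([[1.0, 1.0, -1.0, -1.0]])
tree = grow(E, M, R, np.arange(4), np.arange(4))
```

On the toy data, each internal node names either a motif or a regulator, and each leaf is a gene-by-condition submatrix with its mean expression. This mirrors, in simplified form, how a tree of this kind could yield both condition-specific motif and regulator hypotheses and a bi-clustering-like partition.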
