Allocco Dominic J, Kohane Isaac S, Butte Atul J
Informatics Program, Children's Hospital, Boston, MA, USA.
BMC Bioinformatics. 2004 Feb 25;5:18. doi: 10.1186/1471-2105-5-18.
It is thought that genes with similar patterns of mRNA expression and genes with similar functions are likely to be regulated via the same mechanisms. It has been difficult to test these hypotheses quantitatively on a large scale because there has been no general way of determining whether genes share a common regulatory mechanism. Here we use data from a recent genome-wide binding analysis, in combination with mRNA expression data and existing functional annotations, to quantify the likelihood that genes with varying degrees of similarity in mRNA expression profile or function will be bound by a common transcription factor.
Genes with strongly correlated mRNA expression profiles are more likely to have their promoter regions bound by a common transcription factor. This effect is present only at relatively high levels of expression similarity. In order for two genes to have a greater than 50% chance of sharing a common transcription factor binder, the correlation between their expression profiles (across the 611 microarrays used in our study) must be greater than 0.84. Genes with similar functional annotations are also more likely to be bound by a common transcription factor. Combining mRNA expression data with functional annotation results in a better predictive model than using either data source alone.
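The relationship between expression correlation and shared binding described above can be illustrated with a minimal sketch. It is not the authors' code: it assumes Pearson correlation as the similarity measure, and the function name `shared_binder_fraction`, the gene and factor names, and the toy values are all hypothetical. Only the 0.84 threshold comes from the text. For every gene pair, the sketch computes the correlation of the two expression profiles, places the pair in a correlation bin, and reports the fraction of pairs in each bin that have at least one transcription factor in common.

```python
from itertools import combinations

import numpy as np


def shared_binder_fraction(expr, binders, bin_edges):
    """Fraction of gene pairs per correlation bin that share a binder.

    expr:      dict mapping gene name -> 1D array of expression values
    binders:   dict mapping gene name -> set of bound transcription factors
    bin_edges: increasing correlation bin edges, e.g. [-1.0, 0.84, 1.0]
    Returns a list with one fraction per bin (NaN for empty bins).
    """
    n_bins = len(bin_edges) - 1
    counts = [0] * n_bins
    shared = [0] * n_bins
    for a, b in combinations(sorted(expr), 2):
        # Pearson correlation between the two expression profiles
        r = np.corrcoef(expr[a], expr[b])[0, 1]
        # Bin index; a correlation equal to the top edge goes in the last bin
        i = min(int(np.digitize(r, bin_edges)) - 1, n_bins - 1)
        if i < 0:
            continue  # correlation below the lowest edge
        counts[i] += 1
        if binders[a] & binders[b]:
            shared[i] += 1
    return [s / c if c else float("nan") for s, c in zip(shared, counts)]


# Toy data, invented purely for illustration (not taken from the study)
expr = {
    "gene1": np.array([1.0, 2.0, 3.0, 4.0]),
    "gene2": np.array([2.0, 4.0, 6.0, 8.5]),
    "gene3": np.array([4.0, 3.0, 2.0, 1.5]),
}
binders = {
    "gene1": {"TF_A"},
    "gene2": {"TF_A", "TF_B"},
    "gene3": {"TF_C"},
}
# Two bins: correlation below 0.84, and 0.84 or above
fractions = shared_binder_fraction(expr, binders, [-1.0, 0.84, 1.0])
print(fractions)
```

With real data, `expr` would hold one profile per gene across all arrays and `binders` would come from the genome-wide binding data; finer bins would then show how the shared-binder fraction changes with correlation.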
We demonstrate how mRNA expression data and functional annotations can be used together to estimate the probability that genes share a common regulatory mechanism. Existing microarray data and known functional annotations are sufficient to identify only a relatively small percentage of co-regulated genes.