Genome Biol Evol. 2010;2:697-707. doi: 10.1093/gbe/evq054. Epub 2010 Sep 9.
Identifying the nucleotides that cause gene expression variation is a critical step in dissecting the genetic basis of complex traits. Here, we focus on polymorphisms that are predicted to alter transcription factor binding sites (TFBSs) in the yeast Saccharomyces cerevisiae. We assembled a confident set of transcription factor motifs using recent protein binding microarray and ChIP-chip data and used our collection of motifs to predict a comprehensive set of TFBSs across the S. cerevisiae genome. We used a population genomics analysis to show that our predictions are accurate and significantly improve on our previous annotation. Although predicting gene expression from sequence is thought to be difficult in general, we identified a subset of genes for which changes in predicted TFBSs correlate well with expression divergence between yeast strains. Our analysis thus demonstrates both the accuracy of our new TFBS predictions and the feasibility of using simple models of gene regulation to causally link differences in gene expression to variation at individual nucleotides.
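To make the idea of a polymorphism "altering" a predicted TFBS concrete, the following is a minimal sketch of position weight matrix (PWM) scoring applied to two alleles of a short promoter fragment. It is not the pipeline used in the paper: the 4-bp count matrix, the pseudocount, the uniform background, the forward-strand-only scan, and all function names are assumptions chosen for illustration.

```python
import math

BASES = "ACGT"


def log_odds_matrix(counts, pseudocount=1.0, background=0.25):
    """Convert per-position base counts into log2-odds scores.

    Assumes a uniform background frequency; real analyses would normally
    use genome-specific base frequencies.
    """
    matrix = []
    for column in counts:
        total = sum(column[b] for b in BASES) + 4 * pseudocount
        matrix.append({
            b: math.log2(((column[b] + pseudocount) / total) / background)
            for b in BASES
        })
    return matrix


def best_site(seq, matrix):
    """Return (score, offset) of the best-scoring window on the forward strand."""
    width = len(matrix)
    best = None
    for i in range(len(seq) - width + 1):
        score = sum(matrix[j][seq[i + j]] for j in range(width))
        if best is None or score > best[0]:
            best = (score, i)
    return best


def score_change(ref_seq, alt_seq, matrix):
    """Difference in best motif score between two alleles (alt minus ref)."""
    return best_site(alt_seq, matrix)[0] - best_site(ref_seq, matrix)[0]


# Hypothetical count matrix for a toy 4-bp motif that prefers "TGAC".
# These counts are invented for illustration, not taken from any dataset.
TOY_COUNTS = [
    {"A": 0, "C": 0, "G": 0, "T": 20},
    {"A": 0, "C": 0, "G": 20, "T": 0},
    {"A": 20, "C": 0, "G": 0, "T": 0},
    {"A": 0, "C": 20, "G": 0, "T": 0},
]

if __name__ == "__main__":
    pwm = log_odds_matrix(TOY_COUNTS)
    ref = "AATGACAA"  # allele carrying the consensus site
    alt = "AATGTCAA"  # single-nucleotide change inside the site
    print("best site in ref:", best_site(ref, pwm))
    print("score change (alt - ref):", score_change(ref, alt, pwm))
```

A negative score change indicates that the variant allele matches the motif less well than the reference allele. In a fuller analysis one would also scan the reverse complement and apply a score threshold before calling a site; both are omitted here to keep the sketch short.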