Institute of Computer Science, University of Tartu, Tartu, Estonia.
PLoS One. 2011 Jan 31;6(1):e14559. doi: 10.1371/journal.pone.0014559.
Transcription factors are proteins that bind to motifs on the DNA and thus affect gene expression regulation. The qualitative description of the corresponding processes is therefore important for a better understanding of essential biological mechanisms. However, wet lab experiments targeted at the discovery of the regulatory interplay between transcription factors and binding sites are expensive. We propose a new, purely computational method for finding putative associations between transcription factors and motifs. This method is based on a linear model that combines sequence information with expression data. We present various methods for model parameter estimation and show, via experiments on simulated data, that these methods are reliable. Finally, we examine the performance of this model on biological data and conclude that it can indeed be used to discover meaningful associations. The developed software is available as a web tool and Scilab source code at http://biit.cs.ut.ee/gmat/.
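The abstract does not spell out the linear model, so the following is only a minimal sketch under stated assumptions. It assumes a bilinear form G ≈ M A Tᵀ, where G is a genes × arrays expression matrix, M a genes × motifs matrix of motif occurrences, T an arrays × factors matrix of transcription factor expression, and A the motifs × factors association matrix to be estimated. The symbols, the function name `fit_association_matrix`, and the simulated data are all hypothetical. The estimator is ordinary least squares via pseudoinverses, which may differ from the estimation methods the paper presents.

```python
import numpy as np


def fit_association_matrix(G, M, T):
    """Least-squares estimate of A in the assumed model G ~ M @ A @ T.T.

    G: (genes x arrays) expression matrix
    M: (genes x motifs) motif occurrence matrix
    T: (arrays x factors) transcription factor expression matrix
    Returns A with shape (motifs x factors).
    """
    # Minimiser of the Frobenius norm ||G - M A T^T||, using pseudoinverses.
    return np.linalg.pinv(M) @ G @ np.linalg.pinv(T.T)


# Simulated data (hypothetical sizes and values, for illustration only).
rng = np.random.default_rng(0)
n_genes, n_motifs, n_factors, n_arrays = 50, 4, 3, 20

M = rng.integers(0, 2, size=(n_genes, n_motifs)).astype(float)  # motif present or absent
T = rng.normal(size=(n_arrays, n_factors))                      # factor expression levels

A_true = np.zeros((n_motifs, n_factors))
A_true[0, 1] = 1.5    # motif 0 associated with factor 1
A_true[2, 0] = -2.0   # motif 2 associated with factor 0

# Expression generated from the assumed model plus small Gaussian noise.
G = M @ A_true @ T.T + 0.01 * rng.normal(size=(n_genes, n_arrays))

A_hat = fit_association_matrix(G, M, T)
```

In this sketch, entries of `A_hat` far from zero would be read as putative motif-factor associations. On real data, some form of regularisation or significance testing would likely be needed before drawing such conclusions.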