Malousi Andigoni, Kouidou Sofia, Maglaveras Nicos
Student Member, IEEE, Lab. of Medical Informatics, Faculty of Medicine, Aristotle University of Thessaloniki, 54124, P.O. Box 323, Greece
Annu Int Conf IEEE Eng Med Biol Soc. 2007;2007:139-42. doi: 10.1109/IEMBS.2007.4352242.
Alternative pre-mRNA splicing is a biological mechanism that is highly prevalent in complex organisms and has experimentally verified associations with numerous disease-causing factors. Splicing-related proteins play a significant regulatory role during this process. In this study, we applied a stochastic analysis of alternatively spliced human genes based on Gibbs sampling, in order to identify short consensus sequences that are over-represented compared to a reference Markov model describing constitutive exons of the same genes. The analysis resulted in a set of statistically significant over-represented motifs. The biological importance of these motifs was assessed by estimating the likelihood that they are recognized as cis-acting elements corresponding to the binding domains of splicing enhancers/silencers. The results indicate that the identified over-represented sequences are often similar to those recognized by known regulatory splicing elements.
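The abstract names the technique (Gibbs sampling against a reference Markov model) but gives no algorithmic detail. The sketch below illustrates the general idea of a Gibbs site sampler scored against a background model; it is not the authors' implementation. The first-order Markov background, the pseudocounts, the motif width, the function names (`train_markov1`, `gibbs_motif`, and so on) and the toy sequences are all assumptions made for illustration.

```python
import math
import random

ALPHABET = "ACGT"

def train_markov1(ref_seqs, pseudo=1.0):
    """Estimate a first-order Markov background P(b | a) from reference sequences.
    Assumption: order 1 and add-one pseudocounts; the paper's model order is not stated here."""
    counts = {a: {b: pseudo for b in ALPHABET} for a in ALPHABET}
    for s in ref_seqs:
        for a, b in zip(s, s[1:]):
            counts[a][b] += 1
    return {a: {b: counts[a][b] / sum(counts[a].values()) for b in ALPHABET}
            for a in ALPHABET}

def bg_logp(seq, start, w, bg):
    """Log-probability of seq[start:start+w] under the background model.
    The first base of a sequence has no predecessor, so it gets a uniform 0.25."""
    lp = 0.0
    for i in range(start, start + w):
        lp += math.log(bg[seq[i - 1]][seq[i]]) if i > 0 else math.log(0.25)
    return lp

def build_pwm(seqs, starts, w, skip=-1, pseudo=0.5):
    """Position weight matrix from the current sites, leaving out sequence `skip`."""
    counts = [{b: pseudo for b in ALPHABET} for _ in range(w)]
    for idx, (s, st) in enumerate(zip(seqs, starts)):
        if idx == skip:
            continue
        for j in range(w):
            counts[j][s[st + j]] += 1
    return [{b: col[b] / sum(col.values()) for b in ALPHABET} for col in counts]

def gibbs_motif(seqs, bg, w, iters=100, seed=0):
    """Site sampler: resample one motif start per sequence, weighting each
    candidate window by its motif-to-background likelihood ratio."""
    rng = random.Random(seed)
    starts = [rng.randrange(len(s) - w + 1) for s in seqs]
    for _ in range(iters):
        for i, s in enumerate(seqs):
            pwm = build_pwm(seqs, starts, w, skip=i)
            weights = []
            for st in range(len(s) - w + 1):
                motif_lp = sum(math.log(pwm[j][s[st + j]]) for j in range(w))
                weights.append(math.exp(motif_lp - bg_logp(s, st, w, bg)))
            starts[i] = rng.choices(range(len(weights)), weights=weights)[0]
    return starts, build_pwm(seqs, starts, w)

def consensus(pwm):
    """Most probable base at each motif position."""
    return "".join(max(col, key=col.get) for col in pwm)

# Toy data, invented for illustration only (not from the study).
constitutive = ["ACGTACGTACGTTGCA", "TTGACCGTAGGCTAAC"]
alternative = ["CCGAAGAAGTTT", "TTGAAGAAGCCA", "AGAAGAAGGTCT"]

background = train_markov1(constitutive)
sites, motif = gibbs_motif(alternative, background, w=6, iters=50, seed=1)
print(sites, consensus(motif))
```

Because the sampler is stochastic, different seeds can converge to different motifs; in practice one would run several restarts and keep the highest-scoring result, then assess statistical significance separately.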