Talavera David, Hospital Adam, Orozco Modesto, de la Cruz Xavier
Molecular Modelling and Bioinformatics Unit, Institut de Recerca Biomèdica, Parc Científic de Barcelona, Barcelona, Spain.
BMC Bioinformatics. 2007 Jul 19;8:260. doi: 10.1186/1471-2105-8-260.
The study of the functional role of alternative splice isoforms of a gene is a very active area of research in biology. The difficulty of the experimental approach (in particular, in its high-throughput version) leaves ample room for the development of bioinformatics tools that can provide a useful first picture of the problem. Among the possible approaches, one of the simplest is to follow classical protein function annotation protocols and annotate target alternative splice events with the information available from conserved events in other species. However, the application of this protocol requires a procedure capable of recognising such events. Here we present a simple but accurate method developed for this purpose.
We have developed a method for identifying homologous, or equivalent, alternative splicing events, based on the combined use of neural networks and sequence searches. The procedure comprises four steps: (i) a BLAST search for homologues of the two isoforms defining the target alternative splicing event; (ii) construction of all possible candidate events; (iii) scoring of these candidates with a series of neural networks; and (iv) filtering of the results. When tested on a set of 473 manually annotated pairs of homologous events, our method performed well, with an accuracy of 0.99, a precision of 0.98 and a sensitivity of 0.93. When no candidates were available, the specificity of our method varied between 0.81 and 0.91.
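For readers less familiar with the four performance measures quoted above, the following minimal sketch shows how they are conventionally derived from confusion-matrix counts. It uses the standard textbook definitions and is not taken from the authors' code; the class name `ConfusionCounts` and the counts in the example are hypothetical and serve only as illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts of candidate events by outcome (hypothetical container)."""
    tp: int  # homologous events correctly identified
    fp: int  # non-homologous candidates wrongly accepted
    tn: int  # non-homologous candidates correctly rejected
    fn: int  # homologous events missed


def accuracy(c: ConfusionCounts) -> float:
    # Fraction of all decisions that were correct.
    return (c.tp + c.tn) / (c.tp + c.fp + c.tn + c.fn)


def precision(c: ConfusionCounts) -> float:
    # Fraction of accepted candidates that are truly homologous.
    return c.tp / (c.tp + c.fp)


def sensitivity(c: ConfusionCounts) -> float:
    # Fraction of truly homologous events that were recovered (recall).
    return c.tp / (c.tp + c.fn)


def specificity(c: ConfusionCounts) -> float:
    # Fraction of non-homologous candidates that were rejected.
    return c.tn / (c.tn + c.fp)


# Illustrative counts only; these are NOT the values from the study.
example = ConfusionCounts(tp=90, fp=10, tn=880, fn=20)
```

Note that specificity depends only on the negative cases, which is why it can still be reported for targets that have no true homologous event among the candidates.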
The method described in this article identifies homologous alternative splicing events with a good success rate, indicating that it could be used to develop functional annotation of alternative splice isoforms.