Department of Molecular Genetics and Microbiology and University of Florida Genetics Institute, University of Florida, Gainesville, Florida, USA.
PLoS One. 2009 Oct 28;4(10):e7631. doi: 10.1371/journal.pone.0007631.
As research into alternative splicing reveals the fundamental importance of this phenomenon in the genome expression of higher organisms, there is an increasing need for a standardized, consistent and unique identifier for alternatively spliced isoforms. Such an identifier would be useful to eliminate ambiguities in references to gene isoforms, and would allow for the reliable comparison of isoforms from different sources (e.g., known genes vs. computational predictions). Commonly used identifiers for gene transcripts prove to be unsuitable for this purpose.
We propose an algorithm to compute an isoform signature based on the arrangement of exons and introns in a primary transcript. The isoform signature uniquely identifies a transcript structure, and can therefore be used as a key in databases of alternatively spliced isoforms, or to compare alternative splicing predictions produced by different methods. In this paper we present the algorithm to generate isoform signatures, provide some examples of its application, and describe a web-based resource to generate isoform signatures and use them in database searches.
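To make the idea of a structure-derived key concrete, the following is a minimal illustrative sketch, not the algorithm of this paper (whose details are not given in this summary). It assumes, hypothetically, that a transcript is described by a chromosome name, a strand, and a list of 1-based inclusive exon coordinates, and it derives a key by hashing a canonical string of those values. The function name `toy_signature`, the canonical string layout, and the use of a truncated SHA-1 digest are all assumptions made for illustration.

```python
import hashlib
from typing import Sequence, Tuple

# Hypothetical representation: an exon is a (start, end) pair,
# 1-based and inclusive, on a given chromosome and strand.
Exon = Tuple[int, int]


def toy_signature(chrom: str, strand: str, exons: Sequence[Exon]) -> str:
    """Return a short key derived only from the exon/intron structure.

    Illustrative stand-in, NOT the published isoform signature algorithm.
    """
    if strand not in ("+", "-"):
        raise ValueError("strand must be '+' or '-'")
    if not exons:
        raise ValueError("a transcript needs at least one exon")

    # Sort exons so the key does not depend on the input order.
    ordered = sorted(exons)
    for start, end in ordered:
        if start > end:
            raise ValueError(f"exon start {start} is after end {end}")
    # Exons of a single transcript must not overlap.
    for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:]):
        if next_start <= prev_end:
            raise ValueError("exons overlap")

    # Canonical text form of the structure; introns are implied by
    # the gaps between consecutive exons.
    canonical = f"{chrom}:{strand}:" + ",".join(f"{s}-{e}" for s, e in ordered)
    # Truncated digest used as a compact key (assumption: 16 hex characters).
    return hashlib.sha1(canonical.encode("ascii")).hexdigest()[:16]


if __name__ == "__main__":
    isoform_a = [(100, 200), (300, 400), (500, 600)]
    isoform_b = [(100, 200), (500, 600)]  # middle exon skipped
    print(toy_signature("chr1", "+", isoform_a))
    print(toy_signature("chr1", "+", isoform_b))
```

Two transcripts with the same exon coordinates receive the same key whatever source they come from or order the exons are listed in, while skipping an exon yields a different key. This is the property that lets such a key serve as a database identifier or as a basis for comparing predictions.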
Isoform signatures are simple enough to be easily generated and included in publications and databases, yet flexible enough to represent all possible isoform structures unambiguously, including information about coding sequence position and variable transcription start and end sites. We believe that the adoption of isoform signatures can help establish a consistent, unambiguous nomenclature for alternative splicing isoforms. The system described in this paper is freely available at http://genome.ufl.edu/genesig/, and supplementary materials can be found at http://genome.ufl.edu/genesig-files/.