Martelli Pier L, D'Antonio Mattia, Bonizzoni Paola, Castrignanò Tiziana, D'Erchia Anna M, D'Onorio De Meo Paolo, Fariselli Piero, Finelli Michele, Licciulli Flavio, Mangiulli Marina, Mignone Flavio, Pavesi Giulio, Picardi Ernesto, Rizzi Raffaella, Rossi Ivan, Valletti Alessio, Zauli Andrea, Zambelli Federico, Casadio Rita, Pesole Graziano
Biocomputing Group, University of Bologna, Bologna 40126, Italy.
Nucleic Acids Res. 2011 Jan;39(Database issue):D80-5. doi: 10.1093/nar/gkq1073. Epub 2010 Nov 4.
Alternative splicing is emerging as a major mechanism for expanding transcriptome and proteome diversity, particularly in humans and other vertebrates. However, the proportion of alternative transcripts and proteins that are actually functional is still highly debated. We present a new release of ASPicDB, which now provides a unique annotation resource for human protein variants generated by alternative splicing. A total of 256,939 protein variants from 17,191 multi-exon genes have been extensively annotated with state-of-the-art machine learning tools, which provide information on protein type (globular or transmembrane), localization, and the presence of PFAM domains, signal peptides, GPI-anchor propeptides, transmembrane segments and coiled-coil segments. Furthermore, full-length variants can now be specifically selected based on the annotation of CAGE tags, which mark transcription initiation sites, and of polyA signals and/or polyA sites, which mark transcription termination sites. Retrieval can be carried out at the gene, transcript, exon, protein or splice-site level, allowing the selection of data sets that fulfil one or more features set by the user. The retrieval interface also enables the selection of protein variants that differ in specific annotated features. ASPicDB is available at http://www.caspur.it/ASPicDB/.