Département de Microbiologie et d'infectiologie, Faculté de Médecine et des Sciences de la Santé, Université de Sherbrooke, Sherbrooke, QC, J1E 4K8, Canada.
Département de Biochimie et Génomique Fonctionnelle, Faculté de Médecine et des Sciences de la Santé, Université de Sherbrooke, Sherbrooke, QC, J1E 4K8, Canada.
BMC Bioinformatics. 2022 Jun 24;23(1):250. doi: 10.1186/s12859-022-04804-w.
Alternative splicing can increase the diversity of gene functions by generating multiple isoforms with different sequences and functions. However, the extent to which splicing events have functional consequences remains unclear, and prediction of the impact of splicing events on protein activity is currently limited to gene-specific analyses.
To accelerate the identification of functionally relevant alternative splicing events, we created SAPFIR, a predictor of protein features associated with alternative splicing events. This webserver tool uses InterProScan to predict protein features such as functional domains, motifs and sites in the human and mouse genomes and links them to alternative splicing events. Alternative protein features are displayed as functions of the transcripts and splice sites. SAPFIR can be used to analyze proteins generated from a single gene or a group of genes and can directly identify alternative protein features in large sequence data sets. The accuracy and utility of SAPFIR were validated by its ability to rediscover previously validated alternative protein domains. In addition, our de novo analysis of public datasets using SAPFIR indicated that only a small portion of alternative protein domains is conserved between human and mouse, and that in human, genes involved in nervous system process, regulation of DNA-templated transcription and aging are more likely to produce isoforms missing functional domains due to alternative splicing.
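The idea of an alternative protein feature can be illustrated with a minimal sketch: a feature counts as alternative when it is predicted on some, but not all, isoforms of a gene. The function name, the transcript IDs and the feature labels below are hypothetical placeholders, and the code is not taken from SAPFIR; it only assumes that each transcript has already been mapped to a set of predicted feature identifiers, for example from InterProScan output.

```python
def alternative_features(isoform_features):
    """Map each alternative feature to the transcripts that lack it.

    isoform_features: dict of transcript ID -> iterable of feature
    identifiers predicted on that transcript's protein product.
    A feature is reported only if at least one isoform has it and
    at least one isoform does not.
    """
    if not isoform_features:
        return {}
    # Normalize every value to a set so set algebra is safe.
    feature_sets = {tx: set(feats) for tx, feats in isoform_features.items()}
    all_sets = list(feature_sets.values())
    shared = set.intersection(*all_sets)   # present on every isoform
    observed = set.union(*all_sets)        # present on any isoform
    result = {}
    for feature in sorted(observed - shared):
        result[feature] = sorted(
            tx for tx, feats in feature_sets.items() if feature not in feats
        )
    return result


# Hypothetical gene with three isoforms; tx_2 lacks feature_B.
example = {
    "tx_1": {"feature_A", "feature_B"},
    "tx_2": {"feature_A"},
    "tx_3": {"feature_A", "feature_B"},
}
print(alternative_features(example))
```

Here `feature_A` is shared by all isoforms and is therefore not reported, while `feature_B` is reported together with the transcript that lacks it. Linking each such feature to the specific splicing event that removes it would additionally require exon coordinates, which this sketch does not model.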
Overall, SAPFIR represents a new tool for the rapid identification of functional alternative splicing events and enables the identification of cellular functions affected by a defined splicing program. SAPFIR is freely available at https://bioinfo-scottgroup.med.usherbrooke.ca/sapfir/. The website is implemented in Python and supports all major browsers. The source code is available at https://github.com/DelongZHOU/SAPFIR.