Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering.
College of Chemistry and Chemical Engineering, Central South University, Changsha, Hunan 410083, China.
Bioinformatics. 2021 May 1;37(4):522-530. doi: 10.1093/bioinformatics/btaa829.
High-resolution annotation of gene functions is a central goal in functional genomics. A single gene may produce multiple isoforms with different functions through alternative splicing. Conventional approaches, however, treat a gene as a single entity and do not differentiate these functionally distinct isoforms. To understand gene functions at higher resolution, recent efforts have focused on predicting the functions of isoforms. However, the performance of existing methods is far from satisfactory, mainly because of the lack of isoform-level functional annotation.
We present IsoResolve, a novel approach for isoform function prediction that leverages information from gene function prediction models through domain adaptation (DA). IsoResolve treats gene-level and isoform-level features as the source and target domains, respectively. It uses DA to project the two domains into a latent variable space in such a way that the latent variables from the two domains have similar distributions, which enables gene-domain information to be leveraged for isoform function prediction. We systematically evaluated the performance of IsoResolve in predicting isoform functions. Compared with five state-of-the-art methods, IsoResolve achieved significantly better performance. IsoResolve was further validated by case studies of genes with isoform-level functional annotation.
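As a rough illustration of the distribution-alignment idea only, the following Python sketch maps a source (gene) domain and a target (isoform) domain into a shared space where their per-feature statistics match. It is not IsoResolve's model, which this abstract does not specify. Per-domain standardization is used here as a deliberately simple stand-in for a learned projection, and the toy data and the names `standardize`, `mean_gap`, `gene_features`, and `isoform_features` are all hypothetical.

```python
import random
import statistics


def standardize(rows):
    """Z-score each feature (column) using this domain's own statistics."""
    columns = list(zip(*rows))
    means = [statistics.fmean(c) for c in columns]
    # Fall back to 1.0 for constant columns to avoid division by zero.
    stds = [statistics.pstdev(c) or 1.0 for c in columns]
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]


def mean_gap(a, b):
    """Largest absolute difference between per-feature means of two domains."""
    means_a = [statistics.fmean(c) for c in zip(*a)]
    means_b = [statistics.fmean(c) for c in zip(*b)]
    return max(abs(x - y) for x, y in zip(means_a, means_b))


rng = random.Random(0)

# Hypothetical toy data: 200 genes (source) and 300 isoforms (target),
# 3 features each, drawn with different shifts and scales on purpose.
gene_features = [[rng.gauss(5.0, 2.0) for _ in range(3)] for _ in range(200)]
isoform_features = [[rng.gauss(-1.0, 0.5) for _ in range(3)] for _ in range(300)]

gap_before = mean_gap(gene_features, isoform_features)

# Map each domain into a shared space with zero mean and unit variance.
gene_latent = standardize(gene_features)
isoform_latent = standardize(isoform_features)

gap_after = mean_gap(gene_latent, isoform_latent)

print(f"mean gap before alignment: {gap_before:.3f}")
print(f"mean gap after alignment:  {gap_after:.3e}")
```

Once the two domains share a comparable distribution, a predictor fitted on the labeled gene-level representations can in principle be applied to the isoform-level ones. A learned DA method would optimize the projection jointly with the predictor rather than matching only the first two moments as this sketch does.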
IsoResolve is freely available at https://github.com/genemine/IsoResolve.
Supplementary data are available at Bioinformatics online.