An expectation-maximization framework for comprehensive prediction of isoform-specific functions.

Affiliations

The Jackson Laboratory for Genomic Medicine, Farmington, CT 06032, United States.

AnacletoLab, Dipartimento di Informatica, Università degli Studi di Milano, Milano, Italy.

Publication information

Bioinformatics. 2023 Apr 3;39(4). doi: 10.1093/bioinformatics/btad132.

Abstract

MOTIVATION

Advances in RNA sequencing technologies have achieved an unprecedented accuracy in the quantification of mRNA isoforms, but our knowledge of isoform-specific functions has lagged behind. There is a need to understand the functional consequences of differential splicing, which could be supported by the generation of accurate and comprehensive isoform-specific gene ontology annotations.

RESULTS

We present isoform interpretation, a method that uses expectation-maximization to infer isoform-specific functions based on the relationship between sequence and functional isoform similarity. We predicted isoform-specific functional annotations for 85 617 isoforms of 17 900 protein-coding human genes spanning a range of 17 430 distinct gene ontology terms. Comparison with a gold-standard corpus of manually annotated human isoform functions showed that isoform interpretation significantly outperforms state-of-the-art competing methods. We provide experimental evidence that functionally related isoforms predicted by isoform interpretation show a higher degree of domain sharing and expression correlation than functionally related genes. We also show that isoform sequence similarity correlates better with inferred isoform function than with gene-level function.

AVAILABILITY AND IMPLEMENTATION

Source code, documentation, and resource files are freely available under a GNU3 license at https://github.com/TheJacksonLaboratory/isopretEM and https://zenodo.org/record/7594321.
