Multivariate analysis and visualization of splicing correlations in single-gene transcriptomes.

Authors

Emerick Mark C, Parmigiani Giovanni, Agnew William S

Affiliation

Department of Physiology, Johns Hopkins Medical School, Baltimore, MD 21205 USA.

Publication information

BMC Bioinformatics. 2007 Jan 18;8:16. doi: 10.1186/1471-2105-8-16.

Abstract

BACKGROUND

RNA metabolism, through 'combinatorial splicing', can generate enormous structural diversity in the proteome. Alternative domains may interact, however, with unpredictable phenotypic consequences, necessitating integrated RNA-level regulation of molecular composition. Splicing correlations within transcripts of single genes provide valuable clues to functional relationships among molecular domains as well as genomic targets for higher-order splicing regulation.

RESULTS

We present tools to visualize complex splicing patterns in full-length cDNA libraries. Developmental changes in pair-wise correlations are presented vectorially in 'clock plots' and linkage grids. Higher-order correlations are assessed statistically through Monte Carlo analysis of a log-linear model with an empirical-Bayes estimate of the true probabilities of observed and unobserved splice forms. Log-linear coefficients are visualized in a 'spliceprint,' a signature of splice correlations in the transcriptome. We present two novel metrics: the linkage change index, which measures the directional change in pair-wise correlation with tissue differentiation, and the accuracy index, a very simple goodness-of-fit metric that is more sensitive than the integrated squared error when applied to sparsely populated tables, and unlike chi-square, does not diverge at low variance. Considerable attention is given to sparse contingency tables, which are inherent to single-gene libraries.
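As a rough illustration of what a pair-wise splicing correlation looks like in a sparse library, the sketch below computes a log odds ratio for every pair of alternative sites. It is not the authors' method: the function name `pairwise_log_odds`, the binary encoding of splice forms, the example counts, and the fixed pseudocount of 0.5 (a simple stand-in for the empirical-Bayes estimate described above) are all assumptions made for illustration.

```python
import math
from itertools import combinations


def pairwise_log_odds(counts, n_sites, pseudocount=0.5):
    """Return {(i, j): log odds ratio} for every pair of binary splice sites.

    counts maps a splice form (tuple of 0/1, one entry per site) to the
    number of clones observed with that form.
    """
    result = {}
    for i, j in combinations(range(n_sites), 2):
        # 2x2 table for sites i and j, summed over all other sites.
        # The pseudocount keeps empty cells from producing log(0) or 1/0.
        table = [[pseudocount, pseudocount], [pseudocount, pseudocount]]
        for form, n in counts.items():
            table[form[i]][form[j]] += n
        result[(i, j)] = math.log(
            (table[0][0] * table[1][1]) / (table[0][1] * table[1][0])
        )
    return result


# Hypothetical library with three alternative sites; counts are invented.
library = {
    (0, 0, 0): 40,
    (1, 1, 0): 35,
    (0, 0, 1): 12,
    (1, 1, 1): 10,
    (1, 0, 0): 3,
}
log_odds = pairwise_log_odds(library, n_sites=3)
```

In this invented library, sites 0 and 1 are almost always spliced together, so their log odds ratio is strongly positive, while sites 0 and 2 are close to independent. Higher-order correlations of the kind the abstract describes would need a full log-linear model rather than marginal 2x2 tables.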

CONCLUSION

Patterns of splicing correlations are revealed that span a broad range of interaction orders and change during development. The methods are broadly applicable beyond the single gene, including, for example, multiple-gene interactions in the complete transcriptome.
