Secondary structure alone is generally not statistically significant for the detection of noncoding RNAs.

Author information

Rivas E, Eddy S R

Affiliation

Department of Genetics, Washington University, St. Louis, MO 63110, USA.

Publication information

Bioinformatics. 2000 Jul;16(7):583-605. doi: 10.1093/bioinformatics/16.7.583.

Abstract

MOTIVATION

Several results in the literature suggest that biologically interesting RNAs have secondary structures that are more stable than expected by chance. Based on these observations, we developed a scanning algorithm for detecting noncoding RNA genes in genome sequences, using a fully probabilistic version of the Zuker minimum-energy folding algorithm.

RESULTS

Preliminary results were encouraging, but certain anomalies led us to do a carefully controlled investigation of this class of methods. Ultimately, our results argue that for the probabilistic model there is indeed a statistical effect, but it comes mostly from local base-composition bias and not from RNA secondary structure. For the thermodynamic implementation (which evaluates statistical significance by doing Monte Carlo shuffling in fixed-length sequence windows, thus eliminating the base-composition effect) the signals for noncoding RNAs are still usually indistinguishable from noise, especially when certain statistical artifacts resulting from local base-composition inhomogeneity are taken into account. We conclude that although a distinct, stable secondary structure is undoubtedly important in most noncoding RNAs, the stability of most noncoding RNA secondary structures is not sufficiently different from the predicted stability of a random sequence to be useful as a general genefinding approach.
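The window-shuffling significance test mentioned above can be illustrated with a minimal sketch. It is not the authors' implementation: in place of a thermodynamic minimum-energy fold it uses a toy Nussinov-style count of nested base pairs as the stability score, and the names `max_pairs`, `shuffle_z_score`, and `scan`, as well as the window, step, and shuffle-count defaults, are assumptions made for illustration. Each window is compared against mononucleotide shuffles of itself, so the null sequences keep the window's exact base composition.

```python
import random
import statistics

# Watson-Crick pairs plus G-U wobble (assumed pairing rules for this toy score).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def max_pairs(seq, min_loop=3):
    """Toy stability proxy: maximum number of nested base pairs (Nussinov-style).

    A stand-in for a real folding energy; hairpin loops must contain at
    least `min_loop` unpaired bases.
    """
    n = len(seq)
    if n == 0:
        return 0
    dp = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            best = dp[i][j - 1]  # case: position j is unpaired
            for k in range(i, j - min_loop):  # case: k pairs with j
                if (seq[k], seq[j]) in PAIRS:
                    left = dp[i][k - 1] if k > i else 0
                    best = max(best, left + 1 + dp[k + 1][j - 1])
            dp[i][j] = best
    return dp[0][n - 1]


def shuffle_z_score(window, n_shuffles=100, rng=None):
    """Z-score of the window's score against composition-preserving shuffles."""
    rng = rng or random.Random(0)
    observed = max_pairs(window)
    letters = list(window)
    null_scores = []
    for _ in range(n_shuffles):
        rng.shuffle(letters)  # same bases, random order
        null_scores.append(max_pairs("".join(letters)))
    sd = statistics.pstdev(null_scores)
    if sd == 0:
        return 0.0  # no spread in the null distribution, so no signal
    return (observed - statistics.fmean(null_scores)) / sd


def scan(seq, window=40, step=20, n_shuffles=50, seed=0):
    """Slide a fixed-length window along seq; return (start, z_score) pairs."""
    rng = random.Random(seed)
    return [
        (start, shuffle_z_score(seq[start:start + window], n_shuffles, rng))
        for start in range(0, len(seq) - window + 1, step)
    ]
```

Calling `scan(sequence)` on an RNA string over the alphabet A, C, G, U returns one z-score per window. With a real energy function the sign convention flips (lower energy is more stable), and the abstract's caveat still applies: composition that varies within a window can produce apparent signals that shuffling the whole window does not control for.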
