Integrated sequence-structure motifs suffice to identify microRNA precursors.

Affiliation

School of Mathematics and Physics, University of Science and Technology Beijing, Beijing, PR China.

Publication information

PLoS One. 2012;7(3):e32797. doi: 10.1371/journal.pone.0032797. Epub 2012 Mar 15.

Abstract

BACKGROUND

Upwards of 1200 miRNA loci have hitherto been annotated in the human genome. The specific features that define a miRNA precursor and determine its recognition and subsequent processing are not yet exhaustively described, and miRNA loci thus cannot be computationally identified with sufficient confidence.

RESULTS

We rendered pre-miRNA and non-pre-miRNA hairpins as strings of integrated sequence-structure information and used the software Teiresias to identify sequence-structure motifs (ss-motifs) of variable length in these data sets. Using only ss-motifs as features, a Support Vector Machine (SVM) algorithm for pre-miRNA identification achieved 99.2% specificity and 97.6% sensitivity on a human test data set, which is comparable to previously published algorithms employing combinations of sequence-structure and additional features. Further analysis of the ss-motif information contents revealed strongly significant deviations from those of the respective training sets, offering potential clues as to how the sequence and structural information of RNA hairpins are utilized by the miRNA processing apparatus.
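
The idea of an integrated sequence-structure string, and the two reported metrics, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the abstract does not state the encoding alphabet, so the sketch marks paired bases in uppercase and unpaired bases in lowercase, taking dot-bracket notation as input. The fixed-length k-mer counter is only a simplified stand-in for the variable-length motifs that Teiresias discovers, and the function names `encode_hairpin`, `count_motifs`, and `sensitivity_specificity` are hypothetical.

```python
from collections import Counter


def encode_hairpin(seq: str, struct: str) -> str:
    """Merge nucleotide and pairing state into one symbol per position.

    Assumed encoding (not taken from the paper): uppercase means the base
    is paired, lowercase means it is unpaired. `struct` is dot-bracket.
    """
    if len(seq) != len(struct):
        raise ValueError("sequence and structure must have equal length")
    symbols = []
    for base, mark in zip(seq.upper(), struct):
        if mark in "()":
            symbols.append(base)          # paired position
        elif mark == ".":
            symbols.append(base.lower())  # unpaired position
        else:
            raise ValueError(f"unexpected structure symbol: {mark!r}")
    return "".join(symbols)


def count_motifs(encoded: str, k: int = 3) -> Counter:
    """Count overlapping fixed-length k-mers in an encoded hairpin.

    A stand-in for variable-length motif discovery; the counts could be
    used as a feature vector for a classifier such as an SVM.
    """
    return Counter(encoded[i:i + k] for i in range(len(encoded) - k + 1))


def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels (1 = pre-miRNA).

    Assumes both classes occur in y_true; otherwise a division by zero
    is raised.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t and p)
    fn = sum(1 for t, p in pairs if t and not p)
    tn = sum(1 for t, p in pairs if not t and not p)
    fp = sum(1 for t, p in pairs if not t and p)
    return tp / (tp + fn), tn / (tn + fp)


# Toy hairpin: a 4-bp stem closing a 4-nt loop.
encoded = encode_hairpin("GGAUUCGAAUCC", "((((....))))")
features = count_motifs(encoded, k=3)
sens, spec = sensitivity_specificity([1, 1, 1, 0, 0, 0, 0],
                                     [1, 1, 0, 0, 0, 0, 1])
```

Sensitivity is the fraction of true pre-miRNAs that the classifier recovers, and specificity is the fraction of non-pre-miRNA hairpins that it correctly rejects, which is how the 97.6% and 99.2% figures above should be read.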

CONCLUSION

Integrated sequence-structure motifs of variable length apparently capture nearly all information required to distinguish miRNA precursors from other stem-loop structures.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9347/3305290/8097b9d7811a/pone.0032797.g001.jpg
