RNAcode: robust discrimination of coding and noncoding regions in comparative sequence data.

Affiliation

EMBL-European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1SD, United Kingdom.

Publication information

RNA. 2011 Apr;17(4):578-94. doi: 10.1261/rna.2536111. Epub 2011 Feb 28.

Abstract

With the availability of genome-wide transcription data and massive comparative sequencing, the discrimination of coding from noncoding RNAs and the assessment of coding potential in evolutionarily conserved regions arose as a core analysis task. Here we present RNAcode, a program to detect coding regions in multiple sequence alignments that is optimized for emerging applications not covered by current protein gene-finding software. Our algorithm combines information from nucleotide substitution and gap patterns in a unified framework and also deals with real-life issues such as alignment and sequencing errors. It uses an explicit statistical model with no machine learning component and can therefore be applied "out of the box," without any training, to data from all domains of life. We describe the RNAcode method and apply it in combination with mass spectrometry experiments to predict and confirm seven novel short peptides in Escherichia coli and to analyze the coding potential of RNAs previously annotated as "noncoding." RNAcode is open source software and available for all major platforms at http://wash.github.com/rnacode.
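
The idea of reading coding potential from substitution and gap patterns can be illustrated with a toy example. The sketch below is not RNAcode's statistical model: the function name `toy_coding_score` and the weights (+1, -1, -2) are hypothetical choices made only for illustration, and codons are read directly from alignment columns, which is a deliberate simplification. It rewards synonymous substitutions, penalizes amino acid changes, and penalizes gaps that break the reading frame or substitutions that introduce a stop codon.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), bases ordered T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def toy_coding_score(ref: str, other: str, frame: int = 0) -> int:
    """Score one pairwise alignment row pair in a given reading frame.

    Hypothetical weights, chosen for illustration only:
      +1  synonymous substitution (codon differs, amino acid conserved)
      -1  nonsynonymous substitution
      -2  stop codon on either side of a substitution, or a gap that
          is not a whole-codon insertion/deletion
    """
    if len(ref) != len(other):
        raise ValueError("aligned sequences must have equal length")

    score = 0
    for i in range(frame, len(ref) - 2, 3):
        a = ref[i:i + 3].upper().replace("U", "T")
        b = other[i:i + 3].upper().replace("U", "T")
        gaps_a, gaps_b = a.count("-"), b.count("-")

        if gaps_a or gaps_b:
            # A gap spanning a whole codon keeps the frame intact; treat it as neutral.
            if gaps_a in (0, 3) and gaps_b in (0, 3):
                continue
            score -= 2  # partial-codon gap: frame is disrupted
            continue

        if a == b:
            continue  # identical codons carry no substitution signal

        aa_a, aa_b = CODON_TABLE.get(a), CODON_TABLE.get(b)
        if aa_a is None or aa_b is None:
            continue  # skip codons with ambiguous bases such as N

        if "*" in (aa_a, aa_b):
            score -= 2
        elif aa_a == aa_b:
            score += 1
        else:
            score -= 1
    return score
```

For instance, `toy_coding_score("ATGGCTAAA", "ATGGCCAAA")` counts one synonymous change (GCT and GCC both encode alanine), whereas replacing the second codon with `G-T` is scored as a frame-disrupting gap. A real method would also need a null model to judge whether such a score is statistically significant, which this sketch does not attempt.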
