Popcorn: prediction of short coding and noncoding genomic sequences in prokaryotes.

Authors

Kyrouz Alison, Liu Lian, Qin Lixin, Tjaden Brian

Affiliation

Department of Computer Science, Wellesley College, Wellesley, MA 02481, United States.

Publication information

Bioinformatics. 2025 May 6;41(5). doi: 10.1093/bioinformatics/btaf250.

Abstract

SUMMARY

The most challenging prokaryotic genes to identify often correspond to short ORFs (sORFs) encoding small proteins or to noncoding RNAs. RNA-seq experiments commonly evince small transcripts that do not correspond to annotated genes and are candidates for novel coding sORFs or small regulatory RNAs, but it can be difficult to accurately assess whether the numerous small transcripts are coding or not. We present Popcorn (PrOkaryotic Prediction of Coding OR Noncoding), a novel machine learning method for determining whether prokaryotic sequences are coding or noncoding. We find that Popcorn is effective in distinguishing coding from noncoding sequences, including coding sORFs and noncoding RNAs.

AVAILABILITY AND IMPLEMENTATION

Freely available for use on the web at https://cs.wellesley.edu/~btjaden/Popcorn. Source code available at https://github.com/btjaden/Popcorn and https://doi.org/10.5281/zenodo.15120075.
