Balrog: A universal protein model for prokaryotic gene prediction.

Affiliations

Department of Biomedical Engineering, Johns Hopkins University, Baltimore, Maryland, United States of America.

Center for Computational Biology, Johns Hopkins University, Baltimore, Maryland, United States of America.

Publication information

PLoS Comput Biol. 2021 Feb 26;17(2):e1008727. doi: 10.1371/journal.pcbi.1008727. eCollection 2021 Feb.

Abstract

Low-cost, high-throughput sequencing has led to an enormous increase in the number of sequenced microbial genomes, with well over 100,000 genomes in public archives today. Automatic genome annotation tools are integral to understanding these organisms, yet older gene finding methods must be retrained on each new genome. We have developed a universal model of prokaryotic genes by fitting a temporal convolutional network to amino-acid sequences from a large, diverse set of microbial genomes. We incorporated the new model into a gene finding system, Balrog (Bacterial Annotation by Learned Representation Of Genes), which does not require genome-specific training and which matches or outperforms other state-of-the-art gene finding tools. Balrog is freely available under the MIT license at https://github.com/salzberg-lab/Balrog.
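
The building block of a temporal convolutional network, a causal dilated 1D convolution over an encoded amino-acid sequence, can be sketched as follows. This is only an illustration under stated assumptions: the one-hot encoding, the helper names (`one_hot`, `causal_dilated_conv`, `receptive_field`), the kernel size, and the dilation schedule are all hypothetical and are not taken from Balrog's code or its published architecture.

```python
# Illustrative sketch only; names and parameters are hypothetical, not Balrog's.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def one_hot(seq):
    """Encode a protein string as a list of 20-dimensional one-hot vectors."""
    index = {aa: i for i, aa in enumerate(AMINO_ACIDS)}
    return [
        [1.0 if index[aa] == j else 0.0 for j in range(len(AMINO_ACIDS))]
        for aa in seq
    ]


def causal_dilated_conv(x, weights, dilation):
    """Apply one causal dilated convolution.

    x        : list of T input vectors, each of length C_in
    weights  : list of K taps; each tap is a C_out x C_in matrix (nested lists)
    dilation : spacing between taps

    Tap k reads the input at position t - k * dilation, so the output at
    position t never depends on later positions. Positions before the start
    of the sequence are treated as zeros.
    """
    c_out = len(weights[0])
    out = [[0.0] * c_out for _ in range(len(x))]
    for t in range(len(x)):
        for k, tap in enumerate(weights):
            src = t - k * dilation
            if src < 0:
                continue  # implicit zero padding on the left
            for o in range(c_out):
                out[t][o] += sum(w * v for w, v in zip(tap[o], x[src]))
    return out


def receptive_field(kernel_size, dilations):
    """Positions visible to one output after stacking one layer per dilation."""
    return 1 + (kernel_size - 1) * sum(dilations)


# Two taps, one output channel, all weights 1.0: each output counts how many
# of its taps fall inside the sequence.
taps = [[[1.0] * 20], [[1.0] * 20]]
print(causal_dilated_conv(one_hot("MKV"), taps, dilation=1))
print(receptive_field(3, [1, 2, 4, 8]))
```

Stacking such layers with growing dilations widens the receptive field quickly, which is the usual motivation for this layer type on long sequences.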

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d6ce/7946324/626783720f39/pcbi.1008727.g001.jpg
