Tisza Michael J, Belford Anna K, Domínguez-Huerta Guillermo, Bolduc Benjamin, Buck Christopher B
Lab of Cellular Oncology, NCI, NIH, Bethesda, MD 20892-4263, USA.
Department of Microbiology, Ohio State University, Columbus, OH, USA.
Virus Evol. 2020 Dec 30;7(1):veaa100. doi: 10.1093/ve/veaa100. eCollection 2021 Jan.
Viruses, despite their great abundance and significance in biological systems, remain largely mysterious. Indeed, the vast majority of the perhaps hundreds of millions of viral species on the planet remain undiscovered. Additionally, many virus genomes deposited in central databases such as GenBank and RefSeq are littered with genes annotated as 'hypothetical protein' or the equivalent. Cenote-Taker 2, a virus discovery and annotation tool available on the command line and through a graphical user interface with free access to high-performance computing, uses highly sensitive models of hallmark virus genes to discover familiar or divergent viral sequences in user-input contigs. Additionally, Cenote-Taker 2 uses a flexible set of modules to automatically annotate the sequence features of contigs, providing more gene information than comparable tools. The outputs include readable and interactive genome maps, virome summary tables, and files that can be directly submitted to GenBank. We expect Cenote-Taker 2 to facilitate virus discovery, annotation, and expansion of the known virome.
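The hallmark-gene idea in the abstract can be illustrated with a minimal sketch. This is not Cenote-Taker 2's code or interface. The `Hit` record, the function `contigs_with_hallmark`, the model names, and the e-value cutoff are all hypothetical, chosen only to show the general pattern of keeping contigs that carry at least one confident match to a hallmark virus gene model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """One match between a contig and a gene model (hypothetical record)."""
    contig: str    # contig identifier
    model: str     # name of the gene model that matched
    evalue: float  # significance of the match (smaller is stronger)


def contigs_with_hallmark(hits, hallmark_models, max_evalue=1e-8):
    """Map each contig to the sorted hallmark models it matched.

    Only hits whose model is in `hallmark_models` and whose e-value is at
    or below `max_evalue` (an assumed cutoff) are counted. Contigs with no
    qualifying hit are left out of the result.
    """
    found = {}
    for hit in hits:
        if hit.model in hallmark_models and hit.evalue <= max_evalue:
            found.setdefault(hit.contig, set()).add(hit.model)
    return {contig: sorted(models) for contig, models in found.items()}


# Toy input: model names and e-values are invented for illustration.
example_hits = [
    Hit("contig_1", "capsid", 1e-20),
    Hit("contig_1", "hypothetical", 1e-30),  # not a hallmark model
    Hit("contig_2", "capsid", 1e-3),         # too weak to count
    Hit("contig_3", "terminase", 1e-12),
    Hit("contig_3", "capsid", 1e-9),
]
example_hallmarks = {"capsid", "terminase"}

putative_viral = contigs_with_hallmark(example_hits, example_hallmarks)
print(putative_viral)
```

In this toy input, `contig_2` is dropped because its only hallmark match is weaker than the assumed cutoff, and the strong non-hallmark match on `contig_1` does not contribute. A real pipeline would derive the hits from a sequence-to-model search and would then annotate the retained contigs, as the abstract describes.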