Virome Assembly and Annotation: A Surprise in the Namib Desert.

Author information

Hesse Uljana, van Heusden Peter, Kirby Bronwyn M, Olonade Israel, van Zyl Leonardo J, Trindade Marla

Affiliations

Institute for Microbial Biotechnology and Metagenomics, University of the Western Cape, Bellville, South Africa; South African National Bioinformatics Institute, University of the Western Cape, Bellville, South Africa.

South African National Bioinformatics Institute, University of the Western Cape, Bellville, South Africa.

Publication information

Front Microbiol. 2017 Jan 23;8:13. doi: 10.3389/fmicb.2017.00013. eCollection 2017.

Abstract

Sequencing, assembly, and annotation of environmental virome samples is challenging. Methodological biases and differences in species abundance result in fragmentary read coverage; sequence reconstruction is further complicated by the mosaic nature of viral genomes. In this paper, we focus on biocomputational aspects of virome analysis, emphasizing latent pitfalls in sequence annotation. Using simulated viromes that mimic environmental data challenges we assessed the performance of five assemblers (CLC-Workbench, IDBA-UD, SPAdes, RayMeta, ABySS). Individual analyses of relevant scaffold length fractions revealed shortcomings of some programs in reconstruction of viral genomes with excessive read coverage (IDBA-UD, RayMeta), and in accurate assembly of scaffolds ≥50 kb (SPAdes, RayMeta, ABySS). The CLC-Workbench assembler performed best in terms of genome recovery (including highly covered genomes) and correct reconstruction of large scaffolds; and was used to assemble a virome from a copper rich site in the Namib Desert. We found that scaffold network analysis and cluster-specific read reassembly improved reconstruction of sequences with excessive read coverage, and that strict data filtering for non-viral sequences prior to downstream analyses was essential. In this study we describe novel viral genomes identified in the Namib Desert copper site virome. Taxonomic affiliations of diverse proteins in the dataset and phylogenetic analyses of circovirus-like proteins indicated links to the marine habitat. Considering additional evidence from this dataset we hypothesize that viruses may have been carried from the Atlantic Ocean into the Namib Desert by fog and wind, highlighting the impact of the extended environment on an investigated niche in metagenome studies.


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dd29/5253355/0c501adab3f4/fmicb-08-00013-g0001.jpg
