Bioinformatics analysis of large-scale viral sequences: from construction of data sets to annotation of a phylogenetic tree.

Affiliation

Department of Biomedical Sciences and Veterinary Public Health, Section of Virology, Swedish University of Agricultural Sciences, Uppsala, Sweden.

Publication information

Virulence. 2013 Jan 1;4(1):97-106. doi: 10.4161/viru.23161.

Abstract

Because the cost of DNA sequencing has fallen significantly, the number of sequences submitted to public databases has increased dramatically in recent years. Efficient analysis of these data sets may lead to a much better understanding of the nature of pathogens such as bacteria, viruses, and parasites. However, this growth has raised questions about the efficacy of currently available algorithms for studying pathogen evolution and constructing phylogenetic trees. While more advanced algorithms and corresponding programs are being developed, it is crucial to optimize the available ones to cope with current needs. The protocol presented in this study is optimized using a number of strategies currently being proposed for handling large-scale DNA sequence data sets, and offers an efficient and accurate method for computing phylogenetic trees with limited computer resources. The protocol may take up to 36 h to construct and annotate a final tree of about 20,000 sequences.
