Center for Bioinformatics and Computational Biology, University of Maryland, College Park, Maryland 20742, USA.
Genome Res. 2010 Sep;20(9):1165-73. doi: 10.1101/gr.101360.109. Epub 2010 May 27.
Second-generation sequencing technology can now be used to sequence an entire human genome in a matter of days and at low cost. Sequence read lengths, initially very short, have rapidly increased since the technology first appeared, and we are now seeing a growing number of efforts to sequence large genomes de novo from these short reads. In this Perspective, we describe the issues associated with short-read assembly, the different types of data produced by second-generation sequencers, and the latest assembly algorithms designed for these data. We also review the genomes that have been assembled recently from short reads and make recommendations for sequencing strategies that will yield a high-quality assembly.