Jiang Jingwei, Li Jun, Kwan Hoi Shan, Au Chun Hang, Wan Law Patrick Tik, Li Lei, Kam Kai Man, Lun Ling Julia Mei, Leung Frederick C
School of Biological Sciences, Faculty of Science, The University of Hong Kong, Hong Kong, China.
BMC Res Notes. 2012 Jan 31;5:80. doi: 10.1186/1756-0500-5-80.
Pyrosequencing techniques allow scientists to sequence prokaryotic genomes and obtain draft genomic sequences within a few days. However, assemblies from shotgun sequencing are usually composed of hundreds of contigs. A further multiplex PCR procedure is needed to fill all the gaps and link the contigs into a complete chromosomal sequence, which is the basis for prokaryotic comparative genomic studies. In this article, we study various pyrosequencing strategies by simulated assembly of 100 prokaryotic genomes.
The simulation study shows that a single-end 454 Jr. run combined with a paired-end 454 Jr. run (8 kb library) yields: 1) fewer than 10 scaffolds in ~90% of the 100 assemblies and fewer than 150 contigs in ~95% of them; 2) an average contig N50 size of over 331 kb; 3) an average single-base accuracy of > 99.99%; 4) an average false gene duplication rate of < 0.7%; 5) an average false gene loss rate of < 0.4%.
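For readers unfamiliar with the contig N50 statistic reported above, the following is a minimal sketch of the conventional definition: the length of the contig at which the running total of contig lengths, taken from longest to shortest, first reaches half of the total assembly length. The function name `contig_n50` and the example lengths are hypothetical, and the abstract does not state how the authors computed their N50 values, so this illustrates the common convention only.

```python
def contig_n50(lengths):
    """Return the N50 of a list of contig lengths (conventional definition).

    Assumption: N50 is the length L such that contigs of length >= L
    together cover at least half of the total assembled bases.
    """
    if not lengths:
        raise ValueError("at least one contig length is required")

    total = sum(lengths)
    running = 0
    # Walk from the longest contig to the shortest, accumulating bases.
    for length in sorted(lengths, reverse=True):
        running += length
        # Integer comparison avoids floating-point issues with total / 2.
        if 2 * running >= total:
            return length


# Hypothetical contig lengths (in bp), for illustration only.
example_lengths = [2, 3, 4, 5, 6, 7, 8, 9, 10]
example_n50 = contig_n50(example_lengths)
```

In this toy example the total length is 54, and the three longest contigs (10, 9, 8) together reach 27, so the N50 is 8. A larger N50 indicates that more of the assembly sits in long contigs, which is why the statistic is reported alongside the contig and scaffold counts.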
A single-end 454 Jr. run combined with a paired-end 454 Jr. run (8 kb library) is a cost-effective strategy for prokaryotic whole genome sequencing. It produces high-quality draft assemblies for most prokaryotic organisms within days. Because the number of assembled scaffolds is small, the subsequent multiplex PCR procedure for gap filling should be straightforward. As a result, large-scale prokaryotic whole genome sequencing projects may be finished within weeks.