Department of Plant and Environmental Sciences, University of Life Sciences, Ås, Norway.
BMC Genomics. 2013 Apr 4;14:222. doi: 10.1186/1471-2164-14-222.
The assembly of the bread wheat genome sequence is challenging due to allohexaploidy and extreme repeat content (>80%). Isolation of single chromosome arms by flow sorting can be used to overcome the polyploidy problem, but the repeat content causes extreme assembly fragmentation even at the single-chromosome level. Long jump paired sequencing data (mate pairs) can help reduce assembly fragmentation by joining multiple contigs into single scaffolds. The aim of this work was to assess how mate pair data generated from multiple-displacement-amplified DNA of flow-sorted chromosomes affect the fragmentation of shotgun assemblies of the wheat chromosomes.
Three mate pair (MP) libraries (2 Kb, 3 Kb, and 5 Kb) were sequenced to a total coverage of 89x and 64x for the short and long arm of chromosome 7B, respectively. Scaffolding using SSPACE improved the 7B assembly contiguity and decreased gene space fragmentation, but the degree of improvement was greatly affected by the scaffolding stringency applied. At the lowest stringency the assembly N50 increased ~7-fold, while at the highest stringency N50 increased only ~1.5-fold. Furthermore, a strong positive correlation between estimated scaffold reliability and scaffold assembly stringency was observed. A 7BS scaffold assembly with reduced MP coverage showed that assembly contiguity was affected only to a small degree down to ~50% of the original coverage.
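The N50 fold changes reported above can be illustrated with a minimal sketch of how N50 is commonly computed: the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembly length. The function name `n50` and the contig and scaffold lengths below are hypothetical, chosen only for illustration; they are not data from this study, and this is not the SSPACE implementation.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths (in bp)."""
    if not lengths:
        raise ValueError("at least one sequence length is required")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    # Walk from the longest sequence down until half the assembly is covered.
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical contig lengths before scaffolding (not data from the study).
contigs = [100, 200, 300, 400, 500]

# Hypothetical result after mate pairs join the three shortest contigs
# (100 + 200 + 300) into one scaffold; gap sizes are ignored for simplicity.
scaffolds = [600, 400, 500]

fold_change = n50(scaffolds) / n50(contigs)
print(f"contig N50 = {n50(contigs)}, scaffold N50 = {n50(scaffolds)}, "
      f"fold change = {fold_change:.2f}")
```

Because joining contigs leaves the total assembly length roughly unchanged while lengthening the largest sequences, N50 rises as more joins are accepted, which is consistent with a lower scaffolding stringency yielding a larger N50 gain.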
The effect of integrating MP data into paired-end shotgun assemblies of wheat chromosomes was moderate, possibly due to poor contig assembly contiguity, the extreme repeat content of wheat, and the use of amplified chromosomal DNA for MP library construction.