School of Biotechnology, Science for Life Laboratory, KTH Royal Institute of Technology, Box 1031, 171 21 Solna, Sweden.
BMC Genomics. 2014 Jun 6;15(1):439. doi: 10.1186/1471-2164-15-439.
Sampling genomes with Fosmid vectors and sequencing the pooled Fosmid libraries on the Illumina massively parallel sequencing platform is a novel and promising approach to optimizing the trade-off between sequencing cost and assembly quality.
To sequence the genome of Norway spruce, which is very large and complex, we developed and applied a new technology based on the massive production, sequencing, and assembly of Fosmid pools (FP). The spruce chromosomes were sampled with ~40,000 bp Fosmid inserts to obtain around two-fold genome coverage, in parallel with traditional whole genome shotgun sequencing (WGS) of haploid and diploid genomes. Compared to the WGS results, the contiguity and quality of the FP assemblies were high, and they allowed us to fill WGS gaps resulting from repeats, low coverage, and allelic differences. The FP contig sets were further merged with WGS data using a novel software package, GAM-NGS.
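The scale implied by "~40,000 bp inserts at around two-fold coverage" can be illustrated with a short calculation. The sketch below is not taken from the study: the genome size of roughly 20 Gbp is an assumed round figure for Norway spruce, the function names are hypothetical, and the covered fraction uses the standard Poisson (Lander-Waterman style) approximation, which assumes inserts are placed uniformly at random.

```python
import math


def fosmids_needed(genome_size_bp: int, insert_size_bp: int, coverage: float) -> int:
    """Inserts required so that total insert length equals coverage x genome size."""
    return math.ceil(coverage * genome_size_bp / insert_size_bp)


def expected_fraction_covered(coverage: float) -> float:
    """Expected fraction of the genome hit by at least one insert,
    under a Poisson model of uniformly random insert placement."""
    return 1.0 - math.exp(-coverage)


# Assumed values: ~20 Gbp genome (not stated in the abstract),
# ~40,000 bp inserts and ~2x coverage (both from the abstract).
GENOME_SIZE_BP = 20_000_000_000
INSERT_SIZE_BP = 40_000
COVERAGE = 2.0

n_inserts = fosmids_needed(GENOME_SIZE_BP, INSERT_SIZE_BP, COVERAGE)
fraction = expected_fraction_covered(COVERAGE)
print(f"Fosmid inserts needed: {n_inserts:,}")
print(f"Expected fraction of genome sampled at least once: {fraction:.3f}")
```

Under these assumptions, about one million inserts are needed, and two-fold sampling is expected to leave a noticeable part of the genome unsampled, which is consistent with running the Fosmid pools alongside WGS rather than instead of it.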
By exploiting FP technology, we produced the first published assembly of a conifer genome sequenced entirely with massively parallel sequencing. Here we provide a comprehensive report on the different features of the approach and the optimization of the process. We have made public the input data (FASTQ format) for the set of pools used in this study: ftp://congenie.org/congenie/Nystedt_2013/Assembly/ProcessedData/FosmidPools/ (alternatively accessible via http://congenie.org/downloads). The software used for running the assembly process is available at http://research.scilifelab.se/andrej_alexeyenko/downloads/fpools/.