State Key Laboratory of Crop Stress Biology for Arid Areas, College of Horticulture, Northwest A&F University, Yangling, 712100, China.
State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, 650223, China.
Gigascience. 2016 Aug 8;5(1):35. doi: 10.1186/s13742-016-0139-0.
Domesticated apple (Malus × domestica Borkh) is a popular temperate fruit with high nutritional value and diverse flavors. In 2012, apples accounted for at least one tenth of global harvested fruit production. A high-quality apple genome assembly is crucial for the selection and breeding of new cultivars. Currently, a single reference genome is available for apple, assembled from short reads at 16.9× genome coverage generated with Sanger and 454 sequencing technologies. Although a useful resource, this assembly covers only ~89% of the non-repetitive portion of the genome and has a relatively short contig N50 length (16.7 kb). These shortcomings make it difficult to use this reference in transcriptomic or whole-genome re-sequencing analyses.
Here we present an improved hybrid de novo genome assembly of apple (cultivar Golden Delicious), obtained from 76 Gb (102× genome coverage) of Illumina HiSeq data and 21.7 Gb (29× genome coverage) of PacBio data. The final draft genome is approximately 632.4 Mb, representing ~90% of the estimated genome size. The contig N50 size is 111,619 bp, a roughly 7-fold improvement over the 16.7 kb contig N50 of the existing reference. Further annotation analyses predicted 53,922 protein-coding genes and 2,765 non-coding RNA genes.
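For readers less familiar with the assembly metrics quoted above, the following is a minimal Python sketch of how contig N50, fold coverage, and the fold improvement are commonly computed. It is illustrative only and is not the pipeline used for this assembly; the function names `n50` and `fold_coverage` and the toy contig lengths are assumptions made up for this example, while the numeric inputs (111,619 bp, 16.7 kb, 76 Gb, 102×) are taken from the text.

```python
def n50(lengths):
    """Return the N50 of a list of contig lengths.

    N50 is the length L such that contigs of length >= L together
    contain at least half of all assembled bases.
    """
    if not lengths:
        raise ValueError("need at least one contig length")
    total = sum(lengths)
    running = 0
    # Walk contigs from longest to shortest until half the bases are covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


def fold_coverage(sequenced_bases, genome_size):
    """Fold coverage = total sequenced bases / genome size (same units)."""
    return sequenced_bases / genome_size


# Toy contig lengths (hypothetical, not from the apple assembly).
toy_contigs = [2, 3, 4, 5, 6, 7, 8, 9, 10]
toy_n50 = n50(toy_contigs)

# Fold improvement in contig N50, using the two values quoted in the abstract.
new_n50_bp = 111_619
old_n50_bp = 16_700
fold_improvement = new_n50_bp / old_n50_bp

# Genome size implied by the stated Illumina yield and coverage
# (a back-of-envelope inversion of the coverage formula).
implied_genome_size_bp = 76e9 / 102

print(f"toy N50: {toy_n50}")
print(f"N50 fold improvement: {fold_improvement:.1f}")
print(f"implied genome size: {implied_genome_size_bp / 1e6:.0f} Mb")
```

The ratio of the two N50 values comes out slightly below 7, which is consistent with the "7-fold" figure being a rounded value.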
The new apple genome assembly will serve as a valuable resource for investigating complex apple traits at the genomic level. It is suitable not only for genome editing and gene cloning, but also for RNA-seq and whole-genome re-sequencing studies.