Beijing Advanced Innovation Center for Tree Breeding by Molecular Design, National Engineering Laboratory for Tree Breeding, School of Nature Conservation, College of Biological Sciences and Technology, Beijing Forestry University, Beijing, 100083, China.
College of Life Science, Datong University, Datong, Shanxi, 037009, China.
Gigascience. 2019 Feb 1;8(2). doi: 10.1093/gigascience/giy164.
Malania oleifera, a member of the Olacaceae family, is an IUCN Red List tree species endemic to the karst region of southwest China. Its seeds are valued for their high content of precious fatty acids (especially nervonic acid). However, studies of its genetic makeup and fatty acid biogenesis are severely hampered by a lack of molecular and genetic tools.
We generated 51 Gb and 135 Gb of raw DNA sequences using Pacific Biosciences (PacBio) single-molecule real-time and 10× Genomics sequencing, respectively. A final genome assembly, with a scaffold N50 of 4.65 Mb and a total length of 1.51 Gb, was obtained by primary assembly of the PacBio long reads followed by scaffolding with the 10× Genomics reads. Identified repeats constituted ∼82% of the genome, and 24,064 protein-coding genes were predicted with high support. The genome has low heterozygosity and shows no evidence of recent whole-genome duplication. Metabolic pathway genes related to the accumulation of long-chain fatty acids were identified and studied in detail.
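For readers unfamiliar with the scaffold N50 statistic quoted above, the following minimal Python sketch shows how it is conventionally computed: the length of the shortest scaffold in the smallest set of longest scaffolds that together cover at least half of the total assembly length. The function name `n50` and the toy scaffold lengths are hypothetical and chosen only for illustration; they are not taken from the M. oleifera assembly or from the authors' pipeline.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L
    together make up at least half of the total length.
    """
    if not lengths:
        raise ValueError("no lengths given")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down, accumulating length
    # until at least half of the total has been covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Hypothetical toy scaffold lengths (arbitrary units), not real data.
toy_scaffolds = [8, 5, 4, 2, 1]
print(n50(toy_scaffolds))  # prints 5
```

In the toy example the total length is 20; the two longest scaffolds (8 and 5) together reach 13, which is at least half of the total, so the N50 is 5.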
Here, we provide the first genome assembly and gene annotation for M. oleifera. The availability of these resources will be of great importance for conservation biology and for the functional genomics of nervonic acid biosynthesis.