Flower Research Institute of Yunnan Academy of Agricultural Sciences, National Engineering Research Center For Ornamental Horticulture, No. 2238 Beijing Road, Panlong District, Kunming 650205, China.
Key Lab of Yunnan Flower Breeding, No. 2238 Beijing Road, Panlong District, Kunming 650205, China.
Gigascience. 2017 Oct 1;6(10):1-11. doi: 10.1093/gigascience/gix076.
Rhododendron delavayi Franch. is globally famous as an ornamental plant. Its distribution in southwest China covers several different habitats and environments. However, little research has been conducted on Rhododendron spp. at the molecular level, which hinders understanding of their evolution, speciation, and synthesis of secondary metabolites, as well as their wide adaptability to different environments. Here, we report the genome assembly and gene annotation of R. delavayi var. delavayi (the second genome sequenced in the Ericaceae), which will facilitate the study of the family. The genome assembly will have further applications in genome-assisted cultivar breeding. The final size of the assembled R. delavayi var. delavayi genome (695.09 Mb) was close to the 697.94 Mb estimated by k-mer analysis. A total of 336.83 gigabases (Gb) of raw Illumina HiSeq 2000 reads were generated from 9 libraries (with insert sizes ranging from 170 bp to 40 kb), achieving a raw sequencing depth of ×482.6. After quality filtering, 246.06 Gb of clean reads were obtained, giving a coverage depth of ×352.55. Assembly using Platanus gave a total scaffold length of 695.09 Mb, with a contig N50 of 61.8 kb and a scaffold N50 of 637.83 kb. Gene prediction resulted in the annotation of 32 938 protein-coding genes. Genome completeness was evaluated with CEGMA and BUSCO and reached 95.97% and 92.8%, respectively. Gene annotation completeness was also evaluated with CEGMA and BUSCO and reached 97.01% and 87.4%, respectively. Genome annotation revealed that 51.77% of the R. delavayi genome is composed of transposable elements, and 37.48% is composed of long terminal repeat elements (LTRs). The de novo assembled genome of R. delavayi var. delavayi (hereinafter referred to as R. delavayi) is the second genomic resource of the family Ericaceae and will provide a valuable resource for future comparative genomic studies in Rhododendron species. The availability of the R. delavayi genome sequence will hopefully provide a tool for scientists to tackle open questions regarding molecular mechanisms underlying environmental interactions in the genus Rhododendron, more accurately understand the evolutionary processes and systematics of the genus, facilitate the identification of genes encoding pharmaceutically important compounds, and accelerate molecular breeding to release elite varieties.
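The depth and N50 figures quoted in the abstract follow from simple arithmetic, which the sketch below illustrates. It assumes that sequencing depth is total sequenced bases divided by the k-mer-estimated genome size (697.94 Mb); the abstract does not state the denominator, but this choice appears consistent with the reported ×482.6 and ×352.55. The function names and the toy length list are hypothetical and are not taken from the authors' pipeline.

```python
def coverage_depth(total_bases_gb: float, genome_size_mb: float) -> float:
    """Sequencing depth = total sequenced bases / genome size (both in bases)."""
    return (total_bases_gb * 1e9) / (genome_size_mb * 1e6)


def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths.

    N50 is the length L such that sequences of length >= L together
    cover at least half of the total assembly length.
    """
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("lengths must not be empty")
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Assumption: the k-mer estimate is used as the genome size.
GENOME_SIZE_MB = 697.94

raw_depth = coverage_depth(336.83, GENOME_SIZE_MB)    # raw reads, in Gb
clean_depth = coverage_depth(246.06, GENOME_SIZE_MB)  # filtered reads, in Gb

print(f"raw depth:   x{raw_depth:.1f}")
print(f"clean depth: x{clean_depth:.2f}")

# Toy lengths (hypothetical, not from the assembly) to show the N50 rule.
toy_n50 = n50([10, 8, 6, 4, 2])
print(f"toy N50: {toy_n50}")
```

Computing N50 on the real assembly would require the individual contig or scaffold lengths, which the abstract does not list.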