Hainan Key Laboratory of Tropical Oil Crops Biology/Coconut Research Institute, Chinese Academy of Tropical Agricultural Sciences, Av. Wenqing No. 496, Wenchang, Hainan 571339, P. R. China.
BGI Genomics, BGI-Shenzhen, Building NO.7, BGI Park, No. 21 Hongan 3rd Street, Yantian District, Shenzhen 518083, China.
Gigascience. 2017 Nov 1;6(11):1-11. doi: 10.1093/gigascience/gix095.
Coconut palm (Cocos nucifera, 2n = 32), a member of the genus Cocos and the family Arecaceae (Palmaceae), is an important tropical fruit and oil crop. Coconut palm is currently cultivated in 93 countries across Central and South America, East and West Africa, Southeast Asia, and the Pacific Islands, with a total cultivated area of more than 12 million hectares [1]. Coconut palm is generally classified into 2 main categories based on morphological characteristics and breeding habits: "Tall" (flowering 8-10 years after planting) and "Dwarf" (flowering 4-6 years after planting). This Palmae species has a long growth period before it reaches reproductive age, which hinders conventional breeding; despite initial successes, improvements achieved by conventional breeding have been very slow. In the present study, we obtained a de novo sequence of the Cocos nucifera genome, a major genomic resource that could be used to facilitate molecular breeding in Cocos nucifera and accelerate the breeding process in this important crop. A total of 419.67 gigabases (Gb) of raw reads were generated on the Illumina HiSeq 2000 platform from a series of paired-end and mate-pair libraries, covering the predicted Cocos nucifera genome length (2.42 Gb, variety "Hainan Tall") to an estimated read depth of 173.32×. The assembly had a total scaffold length of 2.20 Gb (N50 = 418 Kb), representing 90.91% of the predicted genome. The coconut genome was predicted to harbor 28 039 protein-coding genes, fewer than in Phoenix dactylifera (PDK30: 28 889), Phoenix dactylifera (DPV01: 41 660), and Elaeis guineensis (EG5: 34 802). BUSCO evaluation demonstrated that the obtained scaffold sequences covered 90.8% of the coconut genome and that the genome annotation was 74.1% complete. Genome annotation results revealed that 72.75% of the coconut genome consisted of transposable elements, of which long terminal repeat retrotransposon elements (LTRs) accounted for the largest proportion (92.23%).
Comparative analysis of the antiporter and ion channel gene families between C. nucifera and Arabidopsis thaliana indicated that significant gene expansion may have occurred in the coconut, involving Na+/H+ antiporter, carnitine/acylcarnitine translocase, potassium-dependent sodium-calcium exchanger, and potassium channel genes. Despite its agronomic importance, C. nucifera is still understudied. In this report, we present a draft genome of C. nucifera and provide genomic information that will facilitate future functional genomics and molecular-assisted breeding in this crop species.
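
The assembly figures quoted above are related by simple arithmetic, and a minimal Python sketch can make those relationships explicit. The function names (`read_depth`, `assembled_fraction`, `n50`) are illustrative and are not taken from the study's pipeline; the toy scaffold lengths are invented solely to show how an N50 is computed. Dividing the rounded values gives a depth of roughly 173.4×, close to the reported 173.32×; the small gap presumably reflects an unrounded genome size estimate, which is an assumption here.

```python
def read_depth(raw_bases_gb: float, genome_size_gb: float) -> float:
    """Average read depth: total sequenced bases divided by genome size."""
    return raw_bases_gb / genome_size_gb


def assembled_fraction(scaffold_gb: float, genome_size_gb: float) -> float:
    """Fraction of the predicted genome represented by the scaffolds."""
    return scaffold_gb / genome_size_gb


def n50(lengths: list[int]) -> int:
    """Length L such that sequences of length >= L hold at least half the total."""
    if not lengths:
        raise ValueError("lengths must not be empty")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    raise AssertionError("unreachable for non-empty input")


# Values reported in the abstract (rounded as published).
depth = read_depth(419.67, 2.42)            # about 173.4x
fraction = assembled_fraction(2.20, 2.42)   # about 0.9091, i.e. 90.91%

# Hypothetical scaffold lengths, only to illustrate the N50 definition.
toy_n50 = n50([2, 3, 4, 5, 6])

print(f"depth = {depth:.2f}x, assembled = {fraction:.2%}, toy N50 = {toy_n50}")
```

With real data, the input to `n50` would be the list of scaffold lengths from the assembly; the reported N50 of 418 Kb means that scaffolds of at least that length make up half of the 2.20 Gb assembly.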