Liu Yi, Zong Hang, Xing Yaowu, Jiao Xi, Liu Zhuoya, Niu Yusheng, Yang Zhiling, Liu Shimeng, Wang Yongqiang, Zhao Haodong, Chen Xianqing, Li Zhenzhu, Wang Xiao, Cai Jing, Wang Wen, Wang Zhongkai
Shaanxi Key Laboratory of Qinling Ecological Intelligent Monitoring and Protection, School of Ecology and Environment, Northwestern Polytechnical University, Xi'an, Shaanxi, China.
Jiaxing Synbiolab Biotechnology Co., Ltd., Jiaxing, China.
Mol Ecol Resour. 2025 Nov;25(8):e70053. doi: 10.1111/1755-0998.70053. Epub 2025 Sep 26.
Arabica coffee (Coffea arabica) dominates global coffee production, accounting for over 60% of the world's coffee trade. The Mundo Novo cultivar, predominantly grown in Yunnan, China, represents a significant germplasm resource. However, the absence of a high-quality reference genome has hindered comprehensive genetic research and in-depth investigation of secondary metabolic pathways in Arabica. In this study, we present the first near telomere-to-telomere (T2T) genome assembly of Arabica, achieved by integrating PacBio HiFi, Oxford Nanopore ultra-long, and Hi-C sequencing technologies, representing the highest-quality Arabica genome to date. Phylogenetic analysis of N-methyltransferases (NMTs), the key enzymes responsible for caffeine biosynthesis, revealed that they evolved independently across caffeine-producing clades, including coffee, cacao, and tea. Furthermore, GO enrichment analysis of gene families expanded at the Arabica ancestral node, combined with fruit-specific transcriptomic profiling, revealed that glycosyltransferases likely play a critical role in the secondary metabolism of Arabica. Notably, functional characterisation demonstrated that a uridine diphosphate glycosyltransferase (UGT) from the UGT29 subfamily, which has a higher gene copy number in Arabica subgenome C than in its ancestor, can directly convert Rebaudioside A (Reb A) into Rebaudioside M (Reb M) through a single-step enzymatic glycosylation. This direct pathway represents a crucial advance over conventional multi-UGT biosynthetic routes to Reb M, a highly desirable sweetener with limited natural abundance. Taken together, this study not only provides a valuable genomic resource for studying the unique secondary metabolic processes in C. arabica but also opens new avenues for the synthetic biological production of the valuable sweetener Reb M.