Embrapa Amazônia Oriental, 66095-903 Belém, PA, Brazil.
Laboratório de Bioinformática e Computação de Alto Desempenho (LaBioCad), Faculdade de Computação (FACOMP), Universidade Federal do Pará, 66075-110 Belém, PA, Brazil.
Gigascience. 2024 Jan 2;13. doi: 10.1093/gigascience/giae027.
Theobroma grandiflorum (Malvaceae), known as cupuassu, is a tree indigenous to the Amazon basin, valued for its large fruits and seed pulp, contributing notably to the Amazonian bioeconomy. The seed pulp is utilized in desserts and beverages, and its seed butter is used in cosmetics. Here, we present the sequenced telomere-to-telomere genome of cupuassu, disclosing its genomic structure, evolutionary features, and phylogenetic relationships within the Malvaceae family.
The cupuassu genome spans 423 Mb, encodes 31,381 genes distributed across 10 chromosomes, and exhibits approximately 65% gene synteny with the Theobroma cacao genome, reflecting a conserved evolutionary history, albeit punctuated by unique genomic variations. The main changes are marked by bursts of long terminal repeat retrotransposons after species divergence, retrocopied and singleton genes, and gene families displaying distinctive patterns of expansion and contraction. Furthermore, positively selected genes are evident, particularly among retained and dispersed tandem and proximal duplicated genes associated with general fruit and seed traits and defense mechanisms, supporting the hypothesis of potential episodes of subfunctionalization and neofunctionalization following duplication, as well as an impact of a distinct domestication process. These genomic variations may underpin the differences observed in fruit and seed morphology, ripening, and disease resistance between cupuassu and other Malvaceae species.
The cupuassu genome offers a foundational resource for both breeding improvement and conservation biology, yielding insights into the evolution and diversity within the genus Theobroma.