Barbosa-Xavier Kevelin, Pedrosa-Silva Francisnei, Almeida-Silva Fabricio, Venancio Thiago M
Laboratório de Química e Função de Proteínas e Peptídeos, Centro de Biociências e Biotecnologia, Universidade Estadual do Norte Fluminense Darcy Ribeiro, Campos dos Goytacazes, RJ, Brazil.
Department of Plant Biotechnology and Bioinformatics, Ghent University, Ghent, Belgium.
Physiol Plant. 2024 Nov-Dec;176(6):e70010. doi: 10.1111/ppl.70010.
Cannabis sativa L., a plant originating from Central Asia, is a versatile crop with applications spanning textiles, construction, pharmaceuticals, and food products. This study aimed to compile and analyze publicly available Cannabis RNA-Seq data and to develop an integrated database tool to advance Cannabis research on topics such as fiber production, cannabinoid biosynthesis, sex determination, and plant development. We identified 515 publicly available RNA-Seq samples, of which 394 passed stringent quality control and formed a high-quality dataset. Using the Jamaican Lion genome as the reference, we constructed a comprehensive database and developed the Cannabis Expression Atlas (https://cannatlas.venanciogroup.uenf.br/), a web application for visualization of gene expression, annotation, and functional classification. Key findings include the quantification of 27,640 Cannabis genes and their classification into seven expression categories: not expressed, low expressed, housekeeping, tissue-specific, group-enriched, mixed, and expressed in all tissues. The study revealed substantial variability and coherence in gene expression across different tissues and chemotypes. We found 2,382 tissue-specific genes, including 177 transcription factors. The Cannabis Expression Atlas is a valuable tool for exploring gene expression patterns and offers insights into Cannabis biology, supporting research in plant breeding, genetic engineering, biochemistry, and functional genomics.