Classification of cannabis strains in the Canadian market with discriminant analysis of principal components using genome-wide single nucleotide polymorphisms.

Affiliations

Department of Biomedical Engineering, University of Alberta, Edmonton, Alberta, Canada.

PBG BioPharma Inc., Leduc, Alberta, Canada.

Publication information

PLoS One. 2021 Jun 28;16(6):e0253387. doi: 10.1371/journal.pone.0253387. eCollection 2021.

Abstract

The cannabis community typically uses the terms "Sativa" and "Indica" to characterize drug strains with high tetrahydrocannabinol (THC) levels. Because of large-scale, extensive, and unrecorded hybridization over the past 40 years, this vernacular naming convention has become unreliable and inadequate for identifying or selecting strains for clinical research and medicinal production. Additionally, cannabidiol (CBD)-dominant strains and balanced strains (or intermediate strains, which have intermediate levels of THC and CBD) are not included in current classification studies, despite the increasing research interest in the therapeutic potential of CBD. This paper is the first in a series of studies proposing that a new classification system be established based on genome-wide variation and supplemented by data on secondary metabolites and morphological characteristics. This study performed whole-genome sequencing of 23 cannabis strains marketed in Canada, aligned the sequences to a reference genome, and, after filtering for a minor allele frequency of 10%, identified 137,858 single nucleotide polymorphisms (SNPs). Discriminant analysis of principal components (DAPC) was applied to these SNPs and further identified 344 structural SNPs, which classified individual strains into five chemotype-aligned groups: one CBD-dominant, one balanced, and three THC-dominant clusters. These structural SNPs were all multiallelic and were predominantly tri-allelic (339/344). The largest portion of these SNPs (37%) occurred on the chromosome that also contains the genes for CBD acid synthases (CBDAS) and THC acid synthases (THCAS). The remainder (63%) were located on the other nine chromosomes. These results showed that the genetic differences between modern cannabis strains exist at the whole-genome level and are not limited to THC or CBD production. These SNPs contained enough genetic variation to classify individual strains into their corresponding chemotypes. In an effort to elucidate the confused genetic backgrounds of commercially available cannabis strains, this classification attempt investigated the utility of DAPC for classifying modern cannabis strains and for identifying structural SNPs.
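The minor allele frequency filter mentioned in the abstract can be illustrated with a small sketch. This is not the authors' pipeline: the function names, the toy genotype data, the convention that a site is kept when its frequency is at least 10%, and the use of the second most common allele as the "minor" allele at multiallelic sites are all assumptions made for illustration. The DAPC step itself is not shown.

```python
from collections import Counter

# Toy genotype calls (hypothetical data): site -> list of per-sample allele pairs.
SITES = {
    "chr1:100": [("A", "A"), ("A", "G"), ("G", "G"), ("A", "A")],
    "chr1:200": [("C", "C"), ("C", "C"), ("C", "C"), ("C", "C")],
    "chr2:50": [("T", "T")] * 9 + [("T", "C")],
    "chr3:75": [("A", "C"), ("A", "T"), ("A", "A"), ("C", "T")],
}


def allele_counts(genotypes):
    """Count every allele observed at one site, skipping missing calls (None)."""
    return Counter(a for g in genotypes if g is not None for a in g)


def minor_allele_frequency(genotypes):
    """Frequency of the second most common allele (assumed convention).

    Returns 0.0 for sites with no calls or with a single allele.
    """
    counts = allele_counts(genotypes)
    total = sum(counts.values())
    if total == 0 or len(counts) < 2:
        return 0.0
    return counts.most_common()[1][1] / total


def filter_by_maf(sites, threshold=0.10):
    """Keep sites whose minor allele frequency is at least `threshold`."""
    return {
        site: genotypes
        for site, genotypes in sites.items()
        if minor_allele_frequency(genotypes) >= threshold
    }


def n_alleles(genotypes):
    """Number of distinct alleles at a site (2 = bi-allelic, 3 = tri-allelic)."""
    return len(allele_counts(genotypes))


if __name__ == "__main__":
    for site, genotypes in filter_by_maf(SITES).items():
        maf = minor_allele_frequency(genotypes)
        print(site, f"MAF={maf:.3f}", f"alleles={n_alleles(genotypes)}")
```

In this toy data, the monomorphic site and the site with a single rare allele are dropped, while the bi-allelic and tri-allelic sites pass the assumed 10% threshold.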

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1b05/8238227/ef7b7fb0bd9a/pone.0253387.g001.jpg