Department of Plant Sciences, University of California, Davis, Davis, CA 95616, USA.
Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD 21218, USA.
G3 (Bethesda). 2022 Jan 4;12(1). doi: 10.1093/g3journal/jkab380.
Sequencing, assembly, and annotation of the 26.5 Gbp hexaploid genome of coast redwood (Sequoia sempervirens) were completed, leading toward the discovery of genes related to climate adaptation and investigation of the origin of the hexaploid genome. Deep-coverage short-read Illumina sequencing data from haploid tissue from a single seed were combined with long-read Oxford Nanopore Technologies sequencing data from diploid needle tissue to create an initial assembly, which was then scaffolded using proximity ligation data to produce a highly contiguous final assembly, SESE 2.1, with a scaffold N50 size of 44.9 Mbp. The assembly included several scaffolds that span entire chromosome arms, as confirmed by the presence of telomere and centromere sequences at the ends of the scaffolds. The structural annotation produced 118,906 genes, of which 113 contain introns that exceed 500 Kbp in length, with one reaching 2 Mbp. Nearly 19 Gbp of the genome represented repetitive content, the vast majority of it characterized as long terminal repeats, with a 2.9:1 ratio of Copia to Gypsy elements that may aid in gene expression control. Comparison of coast redwood to other conifers revealed species-specific expansions for a plethora of abiotic and biotic stress response genes, including those involved in fungal disease resistance, detoxification, and physical injury/structural remodeling, as well as others supporting flavonoid biosynthesis. Analysis of multiple genes that exist in triplicate in coast redwood but only once in its diploid relative, giant sequoia, supports a previous hypothesis that the hexaploidy is the result of autopolyploidy rather than hybridization with separate but closely related conifer species.
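The scaffold N50 reported above is a contiguity statistic: the length of the scaffold at which the running total, taken from the longest scaffold downward, first reaches half of the total assembly length. A minimal sketch of that calculation follows, assuming the standard definition. The function name `n50` and the toy scaffold lengths are hypothetical and are not taken from the SESE 2.1 assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Standard definition (assumed here): sort lengths from longest to
    shortest and return the length at which the cumulative sum first
    reaches at least half of the total length.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical scaffold lengths (arbitrary units), for illustration only.
toy_scaffolds = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_scaffolds))  # total is 300; 80 + 70 = 150 reaches half, so N50 is 70
```

In practice the lengths would be read from the assembly's FASTA index or a similar per-scaffold length table rather than typed in by hand.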