Laboratory of Forest Genomics, Genome Research and Education Center, Siberian Federal University, 660036, Krasnoyarsk, Russian Federation.
Laboratory of Forest Genetics and Selection, V.N. Sukachev Institute of Forest, Siberian Branch of Russian Academy of Sciences, 660036, Krasnoyarsk, Russian Federation.
BMC Bioinformatics. 2019 Feb 5;20(Suppl 1):38. doi: 10.1186/s12859-018-2571-x.
The main objectives of this study were to sequence, assemble, and annotate the chloroplast genome of Siberian larch (Larix sibirica Ledeb.), one of the main conifer tree species of the Siberian boreal forest, and to detect polymorphic genetic markers: microsatellite loci, or simple sequence repeats (SSRs), and single nucleotide polymorphisms (SNPs).
We used whole-genome sequencing data from three Siberian larch trees from different regions: the Urals, Krasnoyarsk, and Khakassia. Sequence reads were obtained using the Illumina HiSeq2000 in the Laboratory of Forest Genomics at the Genome Research and Education Center of the Siberian Federal University. The assembly was done using the Bowtie2 mapping program and the SPAdes genome assembler. The genome was annotated using the RAST service. We used the GMATo program to search for SSRs, and the Bowtie2 and UGENE programs to detect SNPs. The assembled chloroplast genome was 122,561 bp long, similar to the 122,474 bp genome of the closely related European larch (Larix decidua Mill.). Annotation and comparison with the existing data, available for only three larch species, L. decidua, L. potaninii var. chinensis (complete genome, 122,492 bp), and L. occidentalis (partial genome, 119,680 bp), identified 110 genes: 34 tRNA, 4 rRNA, and 72 protein-coding genes. In total, 13 SNPs were detected; two of them were in the tRNA-Arg and cell division protein FtsH genes, respectively. In addition, 23 SSR loci were identified.
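The SSR search described above can be illustrated with a minimal sketch. This is not GMATo's algorithm or the settings used in the study; it is a simple regular-expression scan for perfect tandem repeats. The function name `find_ssrs` and the minimum repeat thresholds are assumptions chosen for illustration only.

```python
import re

# Hypothetical minimum repeat counts per motif length (1-6 bp).
# These are illustrative defaults, not the thresholds used in the study.
DEFAULT_MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def find_ssrs(seq, min_repeats=None):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs.

    start is a 0-based offset into seq. Only uninterrupted repeats of
    A/C/G/T motifs are reported.
    """
    thresholds = min_repeats or DEFAULT_MIN_REPEATS
    seq = seq.upper()
    hits = []
    for k, n in thresholds.items():
        # A k-bp motif followed by at least n-1 further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are repeats of a shorter unit
            # (e.g. "AA" or "ATAT"); the shorter motif length covers them.
            if any(motif == motif[:d] * (k // d)
                   for d in range(1, k) if k % d == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)


if __name__ == "__main__":
    # Toy sequence, not real larch data.
    toy = "GG" + "A" * 12 + "CG" + "AT" * 7 + "GC"
    for start, motif, count in find_ssrs(toy):
        print(start, motif, count)
```

On a real chloroplast assembly the sequence would first be read from a FASTA file. Imperfect or compound repeats, which dedicated SSR tools may report, are outside the scope of this sketch.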
The complete chloroplast genome sequence of Siberian larch was obtained for the first time. Complete reference chloroplast genomes, such as the one described here, will greatly aid chloroplast resequencing and the search for additional genetic markers in population samples. The results of this research will be useful for further phylogenetic and gene flow studies in conifers.