Li Yongping, Pi Mengting, Gao Qi, Liu Zhongchi, Kang Chunying
1Key Laboratory of Horticultural Plant Biology (Ministry of Education), College of Horticulture and Forestry Sciences, Huazhong Agricultural University, Wuhan, Hubei, 430070 China.
2Department of Cell Biology and Molecular Genetics, University of Maryland, College Park, MD 20742 USA.
Hortic Res. 2019 May 1;6:61. doi: 10.1038/s41438-019-0142-6. eCollection 2019.
The diploid strawberry Fragaria vesca serves as an ideal model plant for cultivated strawberry (Fragaria × ananassa, 8x) and the Rosaceae family. The F. vesca genome was initially published in 2011 using older technologies. Recently, a new and greatly improved genome, designated V4, was published. However, the number of annotated genes is markedly lower in V4 (28,588 genes) than in the prior annotations (32,831 to 33,673 genes). Additionally, the annotation of V4 (v4.0.a1) implements a new nomenclature for gene IDs (FvH4_XgXXXXX) in place of the previous nomenclature (geneXXXXX). Hence, further improving the V4 genome annotation and assigning gene expression levels under the new gene IDs using existing transcriptome data are necessary to improve the utility of this high-quality V4 genome. Here, we built a new and improved annotation, v4.0.a2, for genome V4. The new annotation has a total of 34,007 gene models with 98.1% complete Benchmarking Universal Single-Copy Orthologs (BUSCOs). In this v4.0.a2 annotation, gene models of 8,342 existing genes are modified, 9,029 new genes are added, and 10,176 genes possess alternatively spliced isoforms, with an average of 1.90 transcripts per locus. Transcription factors/regulators and protein kinases are globally identified. Interestingly, one transcription factor family contains 82 genes in v4.0.a2 but only two members in v4.0.a1. Additionally, the expression levels of all genes in the new annotation across a total of 46 different tissues and stages are provided. Finally, miRNAs and their targets are reanalyzed and presented. Altogether, this work provides an updated annotation of the V4 genome as well as a comprehensive gene expression atlas with the new gene ID nomenclature, which will greatly facilitate gene functional studies in strawberry and other evolutionarily related plant species.
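The two gene ID nomenclatures mentioned above (FvH4_XgXXXXX and geneXXXXX) can be told apart with a simple pattern match. The following is a minimal sketch, not part of the published work: it assumes that each "X" in the templates stands for a single decimal digit, and the function name `classify_gene_id` and the example IDs are hypothetical.

```python
import re

# Assumption: every "X" in the abstract's ID templates is one decimal digit.
NEW_ID = re.compile(r"FvH4_(\d)g(\d{5})")   # template FvH4_XgXXXXX
OLD_ID = re.compile(r"gene(\d{5})")         # template geneXXXXX


def classify_gene_id(gene_id: str) -> str:
    """Return 'v4', 'legacy', or 'unknown' for a gene ID string."""
    if NEW_ID.fullmatch(gene_id):
        return "v4"
    if OLD_ID.fullmatch(gene_id):
        return "legacy"
    return "unknown"


if __name__ == "__main__":
    # Hypothetical IDs, used only to illustrate the two formats.
    for example in ("FvH4_1g00010", "gene12345", "AT1G01010"):
        print(example, classify_gene_id(example))
```

A check like this may help when merging older expression tables with data keyed by the new IDs, since it flags which rows still need to be mapped.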