Li Yongping, Wei Wei, Feng Jia, Luo Huifeng, Pi Mengting, Liu Zhongchi, Kang Chunying
Key Laboratory of Horticultural Plant Biology (Ministry of Education), College of Horticulture and Forestry Sciences, Huazhong Agricultural University, Wuhan 430070, China.
Department of Cell Biology and Molecular Genetics, University of Maryland, College Park, MD 20742, USA.
DNA Res. 2018 Feb 1;25(1):61-70. doi: 10.1093/dnares/dsx038.
The genome of the wild diploid strawberry species Fragaria vesca, an ideal model system for cultivated strawberry (Fragaria × ananassa, octoploid) and other Rosaceae crops, was first published in 2011 and was followed by a new assembly (Fvb). However, the annotation for Fvb relied mainly on ab initio predictions and included only predicted coding sequences, so an improved annotation is highly desirable. Here, a new annotation version, v2.0.a2, was created for the Fvb genome using a pipeline that draws on one PacBio library, 90 Illumina RNA-seq libraries, and 9 small RNA-seq libraries. Altogether, 18,641 genes (55.6% of 33,538 genes) were augmented with 5' and/or 3' UTR information, 13,168 protein-coding genes (39.3%) were modified or newly identified, and 7,370 genes were found to have alternative isoforms. In addition, 1,938 long non-coding RNAs, 171 miRNAs, and 51,714 small RNA clusters were integrated into the annotation. This new annotation of F. vesca is substantially improved in both the accuracy and the completeness of its gene predictions, and it will benefit gene functional studies in strawberry and comparative genomic analysis of other horticultural crops in the Rosaceae family.
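The UTR coverage figure reported above (genes with 5' and/or 3' UTR information as a share of all genes) is the kind of summary that can be tallied from an annotation file. The following is a minimal sketch, not the authors' pipeline. It assumes the annotation is in GFF3 format with `gene`, `mRNA`, `five_prime_UTR`, and `three_prime_UTR` feature types linked by `ID`/`Parent` attributes; the function names `parse_attributes` and `utr_summary` and the example records are hypothetical.

```python
def parse_attributes(field):
    """Parse a GFF3 column-9 string such as 'ID=m1;Parent=g1' into a dict."""
    attrs = {}
    for pair in field.strip().split(";"):
        if "=" in pair:
            key, value = pair.split("=", 1)
            attrs[key] = value
    return attrs


def utr_summary(gff3_lines):
    """Return (total_genes, genes_with_utr) for an iterable of GFF3 lines.

    Assumption: UTR features point to their transcript via Parent, and
    transcripts (type 'mRNA') point to their gene via Parent.
    """
    genes = set()
    mrna_to_gene = {}
    mrnas_with_utr = set()
    for line in gff3_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, directives, and comments
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9:
            continue  # skip malformed records
        feature_type = cols[2]
        attrs = parse_attributes(cols[8])
        if feature_type == "gene":
            genes.add(attrs.get("ID"))
        elif feature_type == "mRNA":
            mrna_to_gene[attrs.get("ID")] = attrs.get("Parent")
        elif feature_type in ("five_prime_UTR", "three_prime_UTR"):
            # A UTR may list several parent transcripts, comma-separated.
            for parent in attrs.get("Parent", "").split(","):
                mrnas_with_utr.add(parent)
    # Resolve transcripts to genes only after reading everything, so the
    # order of records in the file does not matter.
    genes_with_utr = {mrna_to_gene[m] for m in mrnas_with_utr if m in mrna_to_gene}
    genes_with_utr &= genes
    return len(genes), len(genes_with_utr)


# Hypothetical two-gene example: geneA has a 5' UTR, geneB has none.
example = [
    "##gff-version 3",
    "chr1\tsketch\tgene\t1\t1000\t.\t+\t.\tID=geneA",
    "chr1\tsketch\tmRNA\t1\t1000\t.\t+\t.\tID=geneA.1;Parent=geneA",
    "chr1\tsketch\tfive_prime_UTR\t1\t100\t.\t+\t.\tID=utr1;Parent=geneA.1",
    "chr1\tsketch\tgene\t2000\t3000\t.\t-\t.\tID=geneB",
    "chr1\tsketch\tmRNA\t2000\t3000\t.\t-\t.\tID=geneB.1;Parent=geneB",
]
total, with_utr = utr_summary(example)
print(f"{with_utr}/{total} genes have UTRs ({100 * with_utr / total:.1f}%)")
```

Applied to the counts in the abstract, the same percentage arithmetic gives 100 × 18,641 / 33,538 ≈ 55.6% and 100 × 13,168 / 33,538 ≈ 39.3%, matching the reported values.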