Department of Economic Crops, Jiangsu Yanjiang Institute of Agricultural Science, Nantong, China.
Sci Data. 2024 Apr 9;11(1):359. doi: 10.1038/s41597-024-03204-4.
The faba bean genome was first published in 2023. To support future molecular breeding studies, we improved the quality of the faba bean genome using high-density genetic maps together with Illumina and PacBio RNA-seq datasets. Two high-density genetic maps were used to order and orient the scaffolds, increasing the total chromosome length by 14.28 Mbp and reducing the number of scaffolds by 45. For gene model mining and optimisation, PacBio and Illumina RNA-seq datasets from 37 samples allowed the identification and correction of 121,606 transcripts and supported the prediction of 15,640 alternative splicing events, 2,148 lncRNAs, and 1,752 fusion transcripts, giving a clearer picture of the gene structures underlying the faba bean genome. In addition, 38,850 new genes comprising 56,188 transcripts were identified relative to the reference genome. Finally, the genetic data of the reference genome were integrated to form a comprehensive faba bean transcriptome of 103,267 transcripts derived from 54,753 unigenes.