Cooper James W, Wilson Michael H, Derks Martijn F L, Smit Sandra, Kunert Karl J, Cullis Christopher, Foyer Christine H
Centre for Plant Sciences, Faculty of Biology, University of Leeds, Leeds LS2 9JT, UK.
Bioinformatics Group, Wageningen University, Droevendaalsesteeg 1, 6708PB Wageningen, The Netherlands.
J Exp Bot. 2017 Apr 1;68(8):1941-1953. doi: 10.1093/jxb/erx117.
Grain legume improvement is currently impeded by a lack of genomic resources. The paucity of genome information for faba bean can be attributed to the intrinsic difficulties of assembling and annotating its giant (~13 Gb) genome. To address this challenge, RNA-sequencing analysis was performed on faba bean (cv. Wizard) leaves. Read alignment to the faba bean reference transcriptome identified 16 300 high-quality unigenes. In addition, Illumina paired-end sequencing was used to establish a baseline for genomic information assembly. Genomic reads were assembled de novo into contigs with a size range of 50-5000 bp. Over 85% of sequences did not align to known genes; of these, ~10% could be aligned to known repetitive genetic elements. Over 26 000 of the reference transcriptome unigenes could be aligned to DNA-sequencing (DNA-seq) reads with high confidence. Moreover, this comparison identified 56 668 potential splice points in all identified unigenes. Sequence length data were extended at 461 putative loci through alignment of DNA-seq contigs to full-length, publicly available linkage marker sequences. Reads also yielded coverages of 3466× and 650× for the chloroplast and mitochondrial genomes, respectively. Inter- and intraspecies organelle genome comparisons established core legume organelle gene sets and revealed polymorphic regions of faba bean organelle genomes.