Department of Plant Sciences, University of California Davis, Davis, California, United States of America.
PLoS One. 2011;6(8):e24230. doi: 10.1371/journal.pone.0024230. Epub 2011 Aug 31.
BACKGROUND: The wheat stripe rust fungus (Puccinia striiformis f. sp. tritici, PST) is responsible for significant yield losses in wheat production worldwide. In spite of its economic importance, the PST genomic sequence is not currently available. Fortunately, next-generation sequencing (NGS) has radically improved sequencing speed and efficiency while greatly reducing costs compared to traditional sequencing technologies. We used Illumina sequencing to rapidly access the genomic sequence of the highly virulent PST race 130 (PST-130).
METHODOLOGY/PRINCIPAL FINDINGS: We obtained nearly 80 million high-quality paired-end reads (>50x coverage) that were assembled into 29,178 contigs (64.8 Mb), which provide an estimated coverage of at least 88% of the PST genes and are available through GenBank. Extensive micro-synteny with the Puccinia graminis f. sp. tritici (PGTG) genome and high sequence similarity with annotated PGTG genes support the quality of the PST-130 contigs. We characterized the transposable elements present in the PST-130 contigs and, using an ab initio gene prediction program, identified and tentatively annotated 22,815 putative coding sequences. We provide examples of the use of comparative approaches to improve gene annotation for both PST and PGTG and to identify candidate effectors. Finally, the assembled contigs provided an inventory of PST repetitive elements, which were annotated and deposited in Repbase.
CONCLUSIONS/SIGNIFICANCE: The assembly of the PST-130 genome and the predicted proteins provide useful resources to rapidly identify and clone PST genes and their regulatory regions. Although the automatic gene prediction has limitations, we show that a comparative genomics approach using multiple rust species can greatly improve the quality of gene annotation in these species. The PST-130 sequence will also be useful for comparative studies within PST as more races are sequenced. This study illustrates the power of NGS for rapid and efficient access to genomic sequence in non-model organisms.
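As a rough sanity check on the assembly figures reported in the findings, the arithmetic can be sketched in a few lines of Python. This is only an illustration, not the authors' analysis: the function names are hypothetical, the read count is rounded up to 80 million, and the 64.8 Mb assembly size is used as a stand-in for the true genome size, which the abstract does not state. The read length is also not given, so the last function only derives the minimum mean read length that would be consistent with the quoted coverage under the standard relation coverage = reads x read length / genome size.

```python
# Figures quoted in the abstract (read count rounded; "nearly 80 million").
READ_COUNT = 80_000_000      # paired-end reads
ASSEMBLY_BP = 64_800_000     # 64.8 Mb of assembled contigs
CONTIG_COUNT = 29_178        # number of contigs
CDS_COUNT = 22_815           # putative coding sequences
MIN_COVERAGE = 50            # ">50x coverage"


def mean_contig_length(assembly_bp: float, contig_count: int) -> float:
    """Average contig length in base pairs."""
    return assembly_bp / contig_count


def cds_per_mb(cds_count: int, assembly_bp: float) -> float:
    """Predicted coding sequences per megabase of assembly."""
    return cds_count / (assembly_bp / 1_000_000)


def min_read_length(coverage: float, genome_bp: float, read_count: int) -> float:
    """Mean read length implied by coverage = read_count * read_length / genome_bp.

    Assumption: genome_bp is approximated here by the assembly size.
    """
    return coverage * genome_bp / read_count


print(f"mean contig length: {mean_contig_length(ASSEMBLY_BP, CONTIG_COUNT):.0f} bp")
print(f"CDS density: {cds_per_mb(CDS_COUNT, ASSEMBLY_BP):.0f} per Mb")
print(f"min mean read length: {min_read_length(MIN_COVERAGE, ASSEMBLY_BP, READ_COUNT):.1f} bp")
```

Under these assumptions the contigs average roughly 2.2 kb, which is consistent with a fragmented draft assembly of the kind the abstract describes.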