Department of Plant Biology, Michigan State University, East Lansing, Michigan, United States of America.
PLoS One. 2012;7(3):e33071. doi: 10.1371/journal.pone.0033071. Epub 2012 Mar 16.
Maize is rich in genetic and phenotypic diversity. Understanding the sequence, structural, and expression variation that contributes to phenotypic diversity would facilitate more efficient varietal improvement. RNA-based sequencing (RNA-seq) is a powerful approach for transcriptional analysis, assessing sequence variation, and identifying novel transcript sequences, particularly in large, complex, repetitive genomes such as maize. In this study, we sequenced RNA from whole seedlings of 21 maize inbred lines representing diverse North American and exotic germplasm. Single nucleotide polymorphism (SNP) detection identified 351,710 polymorphic loci distributed throughout the genome, covering 22,830 annotated genes. Tight clustering of two distinct heterotic groups and exotic lines was evident using these SNPs as genetic markers. Transcript abundance analysis revealed minimal variation in the total number of genes expressed across these 21 lines (57.1% to 66.0%). However, the transcribed gene set among the 21 lines varied, with 48.7% expressed in all of the lines, 27.9% expressed in one to 20 lines, and 23.4% expressed in none of the lines. De novo assembly of RNA-seq reads that did not map to the reference B73 genome sequence revealed 1,321 high-confidence novel transcripts, of which 564 loci were present in all 21 lines, including B73, and 757 loci were restricted to a subset of the lines. RT-PCR validation demonstrated 87.5% concordance with the computational prediction of these expressed novel transcripts. Intriguingly, 145 of the novel de novo assembled loci were present in lines from only one of the two heterotic groups, consistent with the hypothesis that, in addition to sequence polymorphisms and transcript abundance, transcript presence/absence variation exists and may be a mechanism contributing to the genetic basis of heterosis.
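The partition of the gene set into genes expressed in all lines, in a subset of lines, or in none can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the toy abundance values, and above all the expression cutoff (`EXPRESSION_THRESHOLD`) are assumptions made for illustration, since the abstract does not state how "expressed" was defined.

```python
from collections import Counter

# Hypothetical cutoff: the abstract does not say how "expressed" was defined.
EXPRESSION_THRESHOLD = 0.0


def classify_gene(abundances, threshold=EXPRESSION_THRESHOLD):
    """Return 'all', 'subset', or 'none' by how many lines express the gene."""
    n_expressed = sum(1 for value in abundances if value > threshold)
    if n_expressed == 0:
        return "none"
    if n_expressed == len(abundances):
        return "all"
    return "subset"


def summarize(expression_by_gene, threshold=EXPRESSION_THRESHOLD):
    """Map each category to the percentage of genes that fall in it."""
    total = len(expression_by_gene)
    if total == 0:
        raise ValueError("no genes supplied")
    counts = Counter(
        classify_gene(values, threshold) for values in expression_by_gene.values()
    )
    return {cat: 100.0 * counts[cat] / total for cat in ("all", "subset", "none")}


# Toy data: four made-up genes, each measured in three made-up inbred lines.
toy = {
    "geneA": [5.2, 3.1, 8.0],
    "geneB": [0.0, 2.4, 0.0],
    "geneC": [0.0, 0.0, 0.0],
    "geneD": [1.1, 0.9, 4.3],
}
print(summarize(toy))
```

Applied to a real abundance table with one value per gene per line, the same three categories correspond to the 48.7%, 27.9%, and 23.4% figures reported above; those three percentages sum to 100%, as a complete partition should.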