Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford OX3 7BN, UK.
Nature. 2011 Aug 28;477(7365):419-23. doi: 10.1038/nature10414.
Genetic differences between Arabidopsis thaliana accessions underlie the plant's extensive phenotypic variation, and until now these have been interpreted largely in the context of the annotated reference accession Col-0. Here we report the sequencing, assembly and annotation of the genomes of 18 natural A. thaliana accessions, and their transcriptomes. When assessed on the basis of the reference annotation, one-third of protein-coding genes are predicted to be disrupted in at least one accession. However, re-annotation of each genome revealed that alternative gene models often restore coding potential. Gene expression in seedlings differed for nearly half of expressed genes and was frequently associated with cis variants within 5 kilobases, as were intron retention alternative splicing events. Sequence and expression variation is most pronounced in genes that respond to the biotic environment. Our data further promote evolutionary and functional studies in A. thaliana, especially the MAGIC genetic reference population descended from these accessions.