Institute of Bioorganic Chemistry, Polish Academy of Sciences, 61-704 Poznan, Poland
Institute of Computing Science, Faculty of Computing Science, Poznan University of Technology, Poznan, Poland.
Plant Cell. 2020 Jun;32(6):1797-1819. doi: 10.1105/tpc.19.00640. Epub 2020 Apr 7.
Copy number variations (CNVs) greatly contribute to intraspecies genetic polymorphism and phenotypic diversity. Recent analyses of sequencing data for >1000 Arabidopsis (Arabidopsis thaliana) accessions focused on small variations and did not include CNVs. Here, we performed genome-wide analysis and identified large indels (50 to 499 bp) and CNVs (500 bp and larger) in these accessions. The CNVs fully overlap with 18.3% of protein-coding genes, with enrichment for evolutionarily young genes and genes involved in stress and defense. By combining analysis of both genes and transposable elements (TEs) affected by CNVs, we revealed that the variation statuses of genes and TEs are tightly linked and jointly contribute to the unequal distribution of these elements in the genome. We also determined the gene copy numbers in a set of 1060 accessions and experimentally validated the accuracy of our predictions by multiplex ligation-dependent probe amplification assays. We then successfully used the CNVs as markers to analyze population structure and migration patterns. Finally, we examined the impact of gene dosage variation triggered by a CNV spanning the gene on expression at both the transcript and protein levels. The catalog of CNVs, CNV-overlapping genes, and their genotypes in a top model dicot will stimulate the exploration of the genetic basis of phenotypic variation.
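As an illustration only, the size classes stated in the abstract (large indels of 50 to 499 bp, CNVs of 500 bp and larger) and the notion of a CNV fully overlapping a gene can be sketched in a few lines of Python. This is not the authors' pipeline. The function names, the 1-based inclusive coordinates, the single-chromosome setting, and the reading of "fully overlap" as "one CNV contains the whole gene" are all assumptions made for the example.

```python
from typing import List, Tuple

# Hypothetical representation: (start, end), 1-based inclusive, one chromosome.
Interval = Tuple[int, int]


def classify_variant(length_bp: int) -> str:
    """Assign a size class using the thresholds given in the abstract."""
    if length_bp >= 500:
        return "CNV"
    if length_bp >= 50:
        return "large indel"
    # Anything shorter falls outside the two classes named in the abstract.
    return "small variant"


def fully_overlapped(gene: Interval, cnvs: List[Interval]) -> bool:
    """Assumed definition: a single CNV spans the gene from start to end."""
    g_start, g_end = gene
    return any(c_start <= g_start and g_end <= c_end for c_start, c_end in cnvs)


def fraction_fully_overlapped(genes: List[Interval], cnvs: List[Interval]) -> float:
    """Share of genes that lie entirely inside at least one CNV."""
    if not genes:
        return 0.0
    return sum(fully_overlapped(g, cnvs) for g in genes) / len(genes)


if __name__ == "__main__":
    # Toy coordinates, not data from the study.
    genes = [(100, 200), (300, 900), (1000, 1100)]
    cnvs = [(50, 250), (800, 1200)]
    print(classify_variant(120))
    print(fraction_fully_overlapped(genes, cnvs))
```

On real data the same check would be run per chromosome, and a study may instead merge adjacent CNVs into regions before testing overlap, which would change the resulting fraction.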