Zhang Zhi-Qin, Jiang Juan, Xu Yong-Chao, Dent Craig, Sureshkumar Sridevi, Balasubramanian Sureshkumar, Guo Ya-Long
State Key Laboratory of Plant Diversity and Specialty Crops/State Key Laboratory of Systematic and Evolutionary Botany, Institute of Botany, Chinese Academy of Sciences, Beijing, 100093, China.
China National Botanical Garden, Beijing, 100093, China.
Genome Biol. 2025 Aug 12;26(1):242. doi: 10.1186/s13059-025-03720-5.
Short tandem repeat (STR) mutations are major drivers of genetic variation and deeply influence phenotypic diversity and evolution, yet they are often overlooked despite their significant effects.
Here, we leverage mutation accumulation lines descended from the Col-0 accession of Arabidopsis thaliana to assess variation in the repeat length of STRs (the STR mutation rate). We find that the STR mutation rate far exceeds the rate of single nucleotide polymorphisms. Interspecific comparison between A. thaliana and Arabidopsis lyrata reveals rapid STR turnover, with the vast majority of loci occurring only in A. thaliana. Intraspecific comparison of ten assembled A. thaliana genomes reveals that 29.3% of STRs display presence/absence variation, 36.5% show length variation, and 21.2% have both types of variation, while only a small proportion show no variation. Association analysis identifies several STRs associated with diverse phenotypes. In a further analysis based on an RNA-seq dataset from 413 accessions, we identify 3,871 expression-associated STRs and 651 splicing-associated STRs, of which over one thousand co-localize with known signals for diverse traits detected by genome-wide association studies. Notably, based on the expression levels of 24,175 genes, the splice site strength values of 12,784 splice sites, and 16 phenotypes of natural A. thaliana populations, we find that STR variation explains a similar average heritability across these three trait sets.
Our results reveal the evolutionary dynamics of STRs and highlight STR variation as an important contributor to the missing heritability of complex traits.