Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA.
Molecular and Cellular Biology Program, University of Washington, Seattle, Washington 98195, USA.
Genome Res. 2018 Aug;28(8):1169-1178. doi: 10.1101/gr.231753.117. Epub 2018 Jul 3.
Short tandem repeat (STR) mutations may comprise more than half of the mutations in eukaryotic coding DNA, yet STR variation is rarely examined as a contributor to complex traits. We assessed this contribution across a collection of 96 strains of Arabidopsis thaliana, genotyping 2046 STR loci each, using highly parallel STR sequencing with molecular inversion probes. We found that 95% of examined STRs are polymorphic, with a median of six alleles per STR across these strains. STR expansions (large copy number increases) are found in most strains, and several of these expansions have evident functional effects. These include three of six intronic STR expansions we found to be associated with intron retention. Coding STRs were depleted of variation relative to noncoding STRs, and we detected a total of 56 coding STRs (11%) showing low variation consistent with the action of purifying selection. In contrast, some STRs show hypervariable patterns consistent with diversifying selection. Finally, we detected 133 novel STR-phenotype associations under stringent criteria, most of which could not be detected with SNPs alone, and validated some with follow-up experiments. Our results support the conclusion that STRs constitute a large, unascertained reservoir of functionally relevant genomic variation.
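As an illustration of the summary statistics reported above (the fraction of polymorphic STRs and the median number of alleles per STR), the following minimal Python sketch computes both from a table of per-strain genotype calls. The data layout, the function name summarize_str_variation, and the toy values are assumptions made for illustration only; they are not the authors' pipeline or data.

```python
from statistics import median


def summarize_str_variation(genotypes):
    """Summarize STR variation across strains.

    genotypes: dict mapping a locus ID to a list of per-strain allele calls
    (repeat copy numbers), with None marking a missing call.
    Returns (fraction of polymorphic loci, median number of alleles per locus).
    Assumed layout for illustration; not the format used in the study.
    """
    allele_counts = []
    for calls in genotypes.values():
        observed = {c for c in calls if c is not None}
        if observed:  # skip loci with no successful calls
            allele_counts.append(len(observed))
    if not allele_counts:
        raise ValueError("no genotyped loci")
    # A locus is polymorphic if more than one distinct allele was observed.
    polymorphic = sum(1 for n in allele_counts if n > 1)
    return polymorphic / len(allele_counts), median(allele_counts)


# Toy example with hypothetical values (five strains, three loci).
toy = {
    "locus_A": [10, 11, 11, 14, None],  # three distinct alleles
    "locus_B": [7, 7, 7, 7, 7],         # one allele: monomorphic
    "locus_C": [20, 22, 25, 31, 22],    # four distinct alleles
}
fraction, median_alleles = summarize_str_variation(toy)
print(fraction, median_alleles)
```

Counting distinct observed alleles per locus is the simplest possible definition of polymorphism; a real analysis would also need to account for genotyping error and missing-data thresholds, which this sketch ignores.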